Hi Lawrence,

you can reproduce the problem by running test_linear.py at bitbucket.org/colinjcotter/slicemodels/branch/periodic_parallel. With 1 and 2 cores, it gives different results.

The code Colin is describing with line numbers is an old version of slicemodels.py, which is found here:
https://gist.github.com/anonymous/c308131558ff54fec9c4

In this version I added this term
- (-dt*div(w)*pbar)*dx
in line 270, which was previously added to the RHS in line 139. With this change, the parallelization problem seemed to be fixed and I got the same results with 1 and 2 cores, but it makes no sense to us.

All the best,
Hiroe

2014-11-18 20:46 GMT+00:00 Mitchell, Lawrence <lawrence.mitchell@imperial.ac.uk>:
Hello,
On 18 Nov 2014, at 18:57, Colin Cotter <colin.cotter@imperial.ac.uk> wrote:
Dear all,

We have a rather weird parallel bug, and Hiroe and I would appreciate some suggestions. The relevant code is in bitbucket.org/colinjcotter/slicemodels/branch/periodic_parallel revision 622fb24. We were observing different results on 1 and 2 cores, which Hiroe fixed by moving some code, but it makes no sense to us!

In line 270, you can see an extra term, - (-dt*div(w)*pbar)*dx, which was previously added to the RHS in line 139. So this changes the implementation but not the algorithm: either this extra term is solved into a contribution to self.ures in line 142, in which case it pops up as part of self.ures in line 270, or we write it explicitly in the form there. In general, we don't wish to write it explicitly in the form, though, and I want to understand what is going on!
Some further investigation showed that the problem seems to occur in the pressure solve on line 393, resulting in a nonsense value of self.Deltap occurring. This is just inverting a DG mass system so it ought to be straightforward, and the inputs are (approximately) the same.
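To illustrate why a DG mass solve "ought to be straightforward": the DG mass matrix is block diagonal, with one small block per cell, so each cell can be solved independently of how the mesh is split across processes. The sketch below is only an illustration under assumed conditions (1D intervals with linear DG elements, made-up cell sizes and right-hand sides); it is not the code in slicemodels.py, and the function names are hypothetical.

```python
# Assumption: 1D P1 DG elements, whose per-cell mass matrix on an interval
# of length h is (h/6) * [[2, 1], [1, 2]]. The real model's space may differ.

def solve_cell(h, rhs):
    """Solve one 2x2 cell mass system using its closed-form inverse,
    (2/h) * [[2, -1], [-1, 2]]."""
    b0, b1 = rhs
    scale = 2.0 / h
    return (scale * (2.0 * b0 - b1), scale * (2.0 * b1 - b0))

def dg_mass_solve(cell_sizes, cell_rhs):
    """Invert the block-diagonal mass matrix one cell at a time."""
    return [solve_cell(h, b) for h, b in zip(cell_sizes, cell_rhs)]

# Hypothetical data: four cells and their right-hand sides.
cell_sizes = [0.5, 0.25, 0.25, 0.5]
cell_rhs = [(1.0, 2.0), (0.5, 0.5), (-1.0, 3.0), (2.0, 0.0)]

# "Serial": every cell on one process.
serial = dg_mass_solve(cell_sizes, cell_rhs)

# "Parallel": two simulated processes, each owning half of the cells.
parallel = (dg_mass_solve(cell_sizes[:2], cell_rhs[:2])
            + dg_mass_solve(cell_sizes[2:], cell_rhs[2:]))
```

Because no cell's solution depends on any other cell, the partitioned result matches the serial one entry for entry, which is why a process-count dependence in this step points to the inputs or the reporting rather than the solve itself.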
Do let me know if you have any ideas,
Can you describe what I need to do to reproduce the problem? I have looked at the repository, but don't know where to begin.
Lawrence
On 18 Nov 2014, at 21:35, Hiroe Yamazaki <h.yamazaki@imperial.ac.uk> wrote:
Hi Lawrence,
you can reproduce the problem by running test_linear.py at bitbucket.org/colinjcotter/slicemodels/branch/periodic_parallel. With 1 and 2 cores, it gives different results.
The code Colin is describing with line numbers is an old version of slicemodels.py, which is found here:
https://gist.github.com/anonymous/c308131558ff54fec9c4
In this version I added this term
- (-dt*div(w)*pbar)*dx
in line 270, which was previously added to the RHS in line 139. With this change, the parallelization problem seemed to be fixed and I got the same results with 1 and 2 cores, but it makes no sense to us.
I cannot reproduce this problem with the current HEAD of that branch (6ce1ffb02). The output vtus look the same in the eyeball norm, as do the deltap values once I fix how they are printed. (Colin, in parallel you can't look at the sum of the local entries. Use function.dat.norm to get the l2 norm, or norm(function) for the L2 norm.)

In particular, if I specify jacobi PCs for /all/ solvers I get basically identical values in serial and parallel (up to solver tolerance).

So I don't really know where to look. Can you give a precise sequence of steps for me to follow to reproduce the problem and observe it?

Lawrence
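The point about printing can be sketched without any parallel library. The example below simulates two processes by splitting a list in half; the data, the contiguous partitioning, and the function names are all hypothetical, and the reduction is a plain Python sum standing in for an MPI reduction. It is not how function.dat.norm is implemented, only an illustration of why a per-process sum depends on the process count while a globally reduced norm does not.

```python
import math

# Hypothetical coefficient vector (illustrative values only).
global_values = [1.0, -2.0, 3.0, -4.0, 5.0, -6.0]

def partition(values, nranks):
    """Split values into contiguous chunks, one per simulated process."""
    size = len(values) // nranks
    return [values[r * size:(r + 1) * size] for r in range(nranks)]

def local_sum(owned):
    """What a process reports if it only sums the entries it holds."""
    return sum(owned)

def global_l2_norm(chunks):
    """Sum of squares on each process, then a reduction over processes
    (a stand-in for an MPI allreduce), then the square root."""
    return math.sqrt(sum(sum(v * v for v in chunk) for chunk in chunks))

serial = partition(global_values, 1)
parallel = partition(global_values, 2)

serial_sums = [local_sum(c) for c in serial]      # a single number
parallel_sums = [local_sum(c) for c in parallel]  # one number per process

serial_norm = global_l2_norm(serial)
parallel_norm = global_l2_norm(parallel)
```

Here the local sums differ between the one-process and two-process runs, even though the underlying vector is identical, while the reduced norm is the same in both cases.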