Thanks for your answer. Given what you say, maybe we should focus on what happens in the point evaluation function before going back to valgrind.
On 09/06/16 at 14:15, Lawrence Mitchell wrote:
On 9 Jun 2016, at 14:51, Nicolas Barral <n.barral@imperial.ac.uk> wrote:
I am still trying to find the bug I mentioned on IRC two weeks ago (point evaluation raising a not in domain error for points in the domain), and I have been totally unsuccessful so I need your help.
You suggested using valgrind to find a memory issue, which I tried with the regular build of Firedrake (not in debug mode). Even with the suppression file suggested on Python's website, I get a log that is many thousands of lines long, and most of the errors are unusable because the debug symbols were not exported.
Do any of the lines look like they are memory errors in likely candidates? That is, either code compiled by us (the library has some kind of md5 hash in its name) or else inside libspatialindex?
I'm absolutely not an expert in valgrind, so I don't know. Without the debug symbols it's hard to understand where the errors come from. On Linux they all seem to be Python errors. On my Mac I had some more errors, but I wouldn't trust them.
So I thought I should try to compile Python with the appropriate debug options, and use this Python with Firedrake. But how do I do that? I tried to run firedrake-install with a PATH pointing to my custom build of Python (Miklós's suggestion), but it fails when checking/installing virtualenv (it keeps stopping at "Requirement already satisfied Virtual env installed. Please run firedrake-install again."). It sees the virtualenv from the standard Python, which might be the problem? So I'm stuck at that point.
I do not know how to install multiple versions of Python and get them to pick up the correct virtualenv and similar. I would have thought that your hand-installed Python would not find the globally installed virtualenv stuff, but maybe not. Can you just make a virtualenv "by hand"? We effectively just run "python -c 'import virtualenv'" and use that. So if that doesn't work, that's somewhere to start.
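To see which interpreter is actually running and where it picks up virtualenv from, a small check along these lines may help. This is only a sketch: it assumes a Python 3 interpreter (importlib.util.find_spec is not available in Python 2), and describe_virtualenv is a hypothetical helper name, not part of firedrake-install.

```python
import importlib.util
import sys


def describe_virtualenv():
    """Return (interpreter path, location of the virtualenv module or None)."""
    # find_spec returns None when the module cannot be imported
    # from the interpreter that is running this script.
    spec = importlib.util.find_spec("virtualenv")
    location = spec.origin if spec is not None else None
    return sys.executable, location


if __name__ == "__main__":
    interpreter, location = describe_virtualenv()
    print("interpreter:", interpreter)
    print("virtualenv :", location or "not importable from this interpreter")
```

Running it with the custom build (by its full path) and with the system Python shows whether both resolve virtualenv to the same installation.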
Okay I can try that.
Normally if you're on Linux you can just install the package that adds debug symbols for Python (on Ubuntu this is called python-dbg, I think).
Miklós already suggested that. Unfortunately, I can't install packages on the Linux machine I'm working on, and it's not an option on the Mac.
My plan was then to compile the modules/dependencies in debug mode (notably libspatialindex). Would that be really helpful? And how do I do that, since I can't just run configure/make in the right directory?
firedrake-install runs configure/make for you. You can determine how to install the shared libraries in debug mode and add that as an option to firedrake-install. We will gladly accept patches for this.
Okay, I will at least try and come back with more questions.
Another idea came to me, and I searched for the cells of the mesh surrounding the points "not in domain" and computed their barycentric coordinates in the corresponding cell. It turns out that the point is always on a vertex or an edge (one or two barycentric coordinates ~ 1e-16). Could there be a bug in libspatialindex or the at function in these cases?
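The check described above can be sketched as follows for a 2D triangle. This is an illustration only, not the script that was actually used: the function names and the tolerance of 1e-12 are assumptions.

```python
def barycentric(p, a, b, c):
    """Barycentric coordinates of the 2D point p in the triangle (a, b, c)."""
    det = (b[1] - c[1]) * (a[0] - c[0]) + (c[0] - b[0]) * (a[1] - c[1])
    l0 = ((b[1] - c[1]) * (p[0] - c[0]) + (c[0] - b[0]) * (p[1] - c[1])) / det
    l1 = ((c[1] - a[1]) * (p[0] - c[0]) + (a[0] - c[0]) * (p[1] - c[1])) / det
    return l0, l1, 1.0 - l0 - l1


def classify(coords, tol=1e-12):
    """Say where the point sits relative to the cell, up to the tolerance."""
    if min(coords) < -tol:
        return "outside"
    # One coordinate near zero means the point is on an edge,
    # two near zero means it is on a vertex.
    near_zero = sum(1 for l in coords if abs(l) <= tol)
    return {0: "interior", 1: "edge", 2: "vertex"}[near_zero]


if __name__ == "__main__":
    tri = ((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))
    for point in [(0.25, 0.25), (0.5, 0.5), (1.0, 0.0), (2.0, 2.0)]:
        print(point, classify(barycentric(point, *tri)))
```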
So what you're saying is that the point is right on the boundary between two cells?
Apparently, yes.
I guess there are two ways the point location can fail. Either the point is not found in any bounding box. Or, the point is found by libspatialindex in a bounding box, but that bucket does not actually contain the cell in question, so then the linear search for point location may fail due to floating point rounding. Which of these two is occurring?
It seems very likely we're in the second case (and floating point rounding could also explain why the behaviour differs between Mac and Linux?).
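The two failure modes can be told apart with a toy model like the one below. It is a sketch under stated assumptions, not the Firedrake or libspatialindex code: the bounding-box filter is a plain loop rather than a spatial index, and locate, contains, and the tolerance parameter are hypothetical names. The test point is nudged across an edge by one rounding step (2**-53) to mimic the effect of floating point error.

```python
NO_BOX = "no bounding box contains the point"
REJECTED = "in a bounding box, but rejected by every candidate cell"
FOUND = "found"


def bounding_box(tri):
    xs = [v[0] for v in tri]
    ys = [v[1] for v in tri]
    return min(xs), min(ys), max(xs), max(ys)


def in_box(p, box):
    xmin, ymin, xmax, ymax = box
    return xmin <= p[0] <= xmax and ymin <= p[1] <= ymax


def contains(p, tri, tol):
    """Accept p if all barycentric coordinates are >= -tol."""
    (ax, ay), (bx, by), (cx, cy) = tri
    det = (by - cy) * (ax - cx) + (cx - bx) * (ay - cy)
    l0 = ((by - cy) * (p[0] - cx) + (cx - bx) * (p[1] - cy)) / det
    l1 = ((cy - ay) * (p[0] - cx) + (ax - cx) * (p[1] - cy)) / det
    return min(l0, l1, 1.0 - l0 - l1) >= -tol


def locate(p, cells, tol=0.0):
    """Return (cell index or None, reason)."""
    # Stage 1: candidate cells whose bounding box contains the point.
    candidates = [i for i, tri in enumerate(cells) if in_box(p, bounding_box(tri))]
    if not candidates:
        return None, NO_BOX
    # Stage 2: linear search over the candidates.
    for i in candidates:
        if contains(p, cells[i], tol):
            return i, FOUND
    return None, REJECTED


if __name__ == "__main__":
    cells = [((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))]
    p = (0.5, 0.5 + 2**-53)  # one rounding step past the edge x + y = 1
    print(locate(p, cells, tol=0.0))
    print(locate(p, cells, tol=1e-12))
    print(locate((2.0, 2.0), cells))
```

With a zero tolerance the nudged point passes the bounding-box stage but is rejected by the cell test. A small positive tolerance accepts it, while a point far from the mesh fails at the first stage.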
All of this code is compiled by us and is pretty straight-line C, so you can just compile with debugging and step through it in the failing case and figure out what's going on.
I'm going to need more help for that... Where is this C code? (I tried to read the code of the at function, but I'm not sure I understand what is called in make_c_evaluate...)

--
Nicolas

--
Nicolas Barral
Dept. of Earth Science and Engineering
Imperial College London
Royal School of Mines - Office 4.88
London SW7 2AZ