Dear firedrakers,

I finally got to the bottom of this. It turns out that I had set parameters["COFFEE"]["O2"] = False around the parloop which executes the kernel, but not around the bit of code which actually compiles the UFL form. Stupid mistake… This caused a horrible segfault, since the kernel expected data of size A[8][20] but was passed A[6][18].

This kind of bug was quite tricky to find, though, since it just segfaults without giving much information. In the end I managed to run interactively on ARCHER, inspected the core dump with gdb and looked at the generated C code.

I was wondering whether this kind of issue could be detected when you generate the wrapper code? Don’t you know both the signature of the function and the passed data at that point? Or has the COFFEE optimisation issue been resolved? I pulled the latest version of COFFEE, though.

Thanks,
Eike

--
Dr Eike Hermann Mueller
Research Associate (PostDoc)
Department of Mathematical Sciences
University of Bath
Bath BA2 7AY, United Kingdom
+44 1225 38 5803
e.mueller@bath.ac.uk
http://people.bath.ac.uk/em459/
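To make the question about detecting this at wrapper-generation time concrete, here is a minimal sketch of the kind of check meant. It is plain Python and not the PyOP2 or COFFEE API: the function name check_kernel_shapes and the two dictionaries are hypothetical, and it assumes the shapes the kernel was compiled for and the shapes of the data being passed are both available at that point.

```python
def check_kernel_shapes(expected, actual):
    """Raise ValueError if an argument's shape differs from what the kernel expects.

    Both arguments map a (hypothetical) kernel argument name to a shape tuple.
    """
    mismatches = []
    for name, want in expected.items():
        got = actual.get(name)  # None if the argument is missing altogether
        if got != want:
            mismatches.append(f"{name}: kernel expects {want}, got {got}")
    if mismatches:
        raise ValueError("kernel/data shape mismatch: " + "; ".join(mismatches))


# Shapes from this thread: the kernel was compiled for A[8][20],
# but the parloop passed A[6][18].
try:
    check_kernel_shapes({"A": (8, 20)}, {"A": (6, 18)})
except ValueError as err:
    print(err)
```

A check along these lines would turn the silent out-of-bounds access into a readable Python exception before the compiled kernel is ever called.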
On 5 Feb 2015, at 16:28, Eike Mueller <E.Mueller@bath.ac.uk> wrote:
Thanks, I tried ATP and also inspected the core dump with
gdb python core
There is no backtrace in the core dump, and ATP does not generate any information either.
I still only get the segfault in my output file. I hope I can localise this a bit more tomorrow.
Eike
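When neither the core dump nor ATP gives a backtrace, as described above, Python's standard faulthandler module can sometimes at least report which Python frame was active when the signal arrived. A minimal sketch, assuming the driver script is Python and that stderr ends up in the job's output file; the helper name enable_crash_traceback is hypothetical:

```python
import faulthandler
import sys


def enable_crash_traceback(stream=None):
    """Ask Python to dump its traceback to `stream` on SIGSEGV and similar signals.

    `stream` must have a real file descriptor; it defaults to sys.stderr.
    Returns True if the fault handler is now active.
    """
    target = stream if stream is not None else sys.stderr
    faulthandler.enable(file=target, all_threads=True)
    return faulthandler.is_enabled()


if __name__ == "__main__":
    # Call this once at the very top of the driver script, before any kernels run.
    enable_crash_traceback()
```

The same effect is available without code changes via `python -X faulthandler` or by setting the PYTHONFAULTHANDLER environment variable. Note that this only shows Python-level frames, so it would point at the parloop call rather than at the offending line in the generated C code.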
On 05/02/15 15:23, Patrick Farrell wrote:
On 05/02/15 14:37, Lawrence Mitchell wrote:
Number of cells on finest grid = 5120
dx = 364.458 km, dt = 2429.717 s
_pmiu_daemon(SIGCHLD): [NID 01160] [c6-0c0s2n0] [Thu Feb 5 14:22:05 2015] PE RANK 11 exit signal Segmentation fault
[NID 01160] 2015-02-05 14:22:05 Apid 12880356: initiated application termination
Application 12880356 exit codes: 139
Application 12880356 resources: utime ~31s, stime ~19s, Rss ~318352, inblocks ~104428, outblocks ~788
Finished atThu Feb 5 14:22:10 GMT 2015
Hmm, that's not a lot of useful information.
Try running again with
module load atp
export ATP_ENABLED=1
Sometimes it gives useful information about abnormal terminations: http://www.archer.ac.uk/documentation/best-practice-guide/debug.php
Patrick
_______________________________________________ firedrake mailing list firedrake@imperial.ac.uk https://mailman.ic.ac.uk/mailman/listinfo/firedrake