Re: [firedrake] cached kernels

9 Nov 2015

On 08/11/15 10:06, Eike Mueller wrote:
...
> Hi Lawrence (copied to firedrake, since overheads from loading
> libraries might be a general concern),
>
> I tried it on ARCHER, and adding caching for the kernels does not
> make any difference. The LU solve performance at lowest order is
> poor, but an individual call actually takes more time (~0.01s) than
> the operator application (~0.001s), so I would have thought the
> overheads are relatively smaller for the LU solve. For the
> operator application the reported BW is excellent, but for the LU
> solve it is very poor. At higher order both BWs are good; here the
> data volume is larger, but the time for one LU solve call is still
> ~0.01s. Maybe in this case any overhead that shows up at lowest
> order is hidden.
>
> Could there be an overhead from loading the LAPACK library, which
> is required for the LU solve?

This isn't how dynamic loading works.  The first time you use the
.so, in the warmup phase, the symbol is resolved, and the trampoline
is replaced by a direct call, so later calls pay no loading overhead.
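
To illustrate the point about one-off loading cost, here is a minimal
sketch using Python's ctypes. It is only an analogy: ctypes resolves
symbols explicitly (dlopen/dlsym) rather than through a lazily bound
trampoline, and the choice of libm and cos is an assumption made purely
for illustration, not anything taken from the LU solve code. It assumes
a POSIX system where the C math library can be found.

```python
import ctypes
import ctypes.util
import time

# Assumption: a POSIX system with the C math library available.
# "libm.so.6" is a fallback name for Linux if find_library returns None.
path = ctypes.util.find_library("m") or "libm.so.6"

start = time.perf_counter()
libm = ctypes.CDLL(path)            # library is opened once
cos = libm.cos                      # symbol is looked up once
cos.restype = ctypes.c_double
cos.argtypes = [ctypes.c_double]
setup_time = time.perf_counter() - start


def time_call(fn, arg):
    """Return (result, elapsed seconds) for a single call."""
    t0 = time.perf_counter()
    value = fn(arg)
    return value, time.perf_counter() - t0


# The first call is the "warmup"; later calls reuse the resolved symbol.
first_value, first_time = time_call(cos, 0.0)
later_times = [time_call(cos, 0.0)[1] for _ in range(1000)]

print(f"load + lookup : {setup_time:.2e} s (paid once)")
print(f"first call    : {first_time:.2e} s")
print(f"later calls   : {min(later_times):.2e} s (best of 1000)")
```

If loading were a per-call cost, the later calls would be as slow as
the setup step; in this sketch they should not be.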

I have effectively no idea what's going on.  Does the LU solve take
this long on this much data if you just call it from C?
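
A sketch of the kind of standalone measurement meant here, written in
Python to keep it self-contained. Everything in it is an assumption for
illustration: the pure-Python LU routines stand in for the LAPACK calls,
the 4x4 matrix is made up, and the byte count in effective_bandwidth is
one possible convention rather than whatever the reported BW uses. The
absolute numbers mean nothing; the point is the shape of the harness
(warm up, then time many calls with no framework in between), which
carries over directly to a C driver.

```python
import time


def lu_factor(a):
    """LU factorisation with partial pivoting; returns (lu, piv)."""
    n = len(a)
    lu = [row[:] for row in a]
    piv = list(range(n))
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        if lu[p][k] == 0.0:
            raise ValueError("matrix is singular")
        if p != k:
            lu[k], lu[p] = lu[p], lu[k]
            piv[k], piv[p] = piv[p], piv[k]
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]
    return lu, piv


def lu_solve(lu, piv, b):
    """Solve A x = b given the factorisation from lu_factor."""
    n = len(lu)
    x = [b[piv[i]] for i in range(n)]      # apply the row permutation
    for i in range(n):                     # forward substitution (unit L)
        for j in range(i):
            x[i] -= lu[i][j] * x[j]
    for i in reversed(range(n)):           # back substitution (U)
        for j in range(i + 1, n):
            x[i] -= lu[i][j] * x[j]
        x[i] /= lu[i][i]
    return x


def time_per_call(fn, warmup=10, repeat=1000):
    """Average seconds per call, measured after a warmup phase."""
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(repeat):
        fn()
    return (time.perf_counter() - start) / repeat


def effective_bandwidth(n, seconds, word_bytes=8):
    """Bytes/s, counting the n*n factors, the n RHS reads and n writes."""
    return (n * n + 2 * n) * word_bytes / seconds


# Hypothetical small dense system, roughly the size of a local block.
A = [[4.0, 3.0, 0.0, 1.0],
     [6.0, 3.0, 2.0, 0.0],
     [0.0, 2.0, 5.0, 1.0],
     [1.0, 0.0, 1.0, 3.0]]
b = [8.0, 11.0, 8.0, 5.0]

lu, piv = lu_factor(A)
x = lu_solve(lu, piv, b)
t = time_per_call(lambda: lu_solve(lu, piv, b))
print(f"solve: {t:.2e} s/call, {effective_bandwidth(len(A), t):.2e} B/s")
```

If a bare driver like this shows the same per-call time as the full
run, the cost is in the solve itself; if it is much faster, the
difference is overhead somewhere around the call.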

IOW, I think it's not "our" fault, unless somehow you're managing to
get a recompile or similar every time you call _lu_solve.
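
One way to rule that out is to count compilations. The following is a
hypothetical sketch, not the firedrake or PyOP2 caching code:
KernelCache and its compile_count attribute are invented names, and
Python's builtin compile() stands in for the real (expensive) kernel
compilation. The idea is just that a correctly keyed cache compiles
once per distinct source, so a counter that grows with the number of
calls points at a cache miss on every call.

```python
import hashlib


class KernelCache:
    """Hypothetical source-keyed cache that counts real compilations."""

    def __init__(self):
        self._compiled = {}
        self.compile_count = 0

    def get(self, source):
        # Key on a hash of the source text, so identical source hits
        # the cache and any change in the text forces a recompile.
        key = hashlib.sha1(source.encode("utf-8")).hexdigest()
        if key not in self._compiled:
            self.compile_count += 1
            # Stand-in for the expensive compile step.
            self._compiled[key] = compile(source, "<kernel>", "exec")
        return self._compiled[key]


cache = KernelCache()

# Same source on every call: compiled once, then served from the cache.
source = "result = sum(range(10))"
for _ in range(100):
    namespace = {}
    exec(cache.get(source), namespace)
print("compiles after 100 identical calls:", cache.compile_count)

# Source that differs on every call (for example, a value baked into
# the text) defeats the cache and recompiles each time.
for i in range(3):
    cache.get(f"result = {i}")
print("compiles after 3 varying calls    :", cache.compile_count)
```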

Lawrence

Lawrence Mitchell