Hi Tuomas

Lawrence's comments are correct.

Regarding optimizing assembly through COFFEE (a tool integrated with PyOP2 that optimizes the evaluation of local element matrices/vectors), it is true that in general

parameters["coffee"]["licm"] = True
parameters["coffee"]["ap"] = True

are the key parameters. However, if your function spaces have relatively high polynomial order *or* your form has a lot of coefficients, then you may be interested in other optimizations as well. Please let me know if this is the case, and I'll try to be more precise.

That said, our (short-term) goal is to let you (users) avoid playing with these sorts of "low-level parameters" entirely. We are currently working on an autotuning system that, once in place, should give you significantly better run-times without having to set/try individual optimizations manually. We are not far from that, so we'll keep you posted (it should take on the order of days).

As for the MIC, we have some performance numbers for assembly "in isolation", but we (I) have never brought the whole toolchain onto the MIC - that is, we have only used it as an accelerator. In particular, the code is *not* specifically optimized for this kind of architecture, especially if you are running at low polynomial order (in practice, we are not taking full advantage of the large vector lanes).

As for MKL in assembly kernels, I'm running experiments these days to quantify how much faster we can go by transforming assembly code into a sequence of BLAS calls. The problem is that at low polynomial order the matrices involved are rather small. What I have seen so far (these are early experiments, and I'm being vague here just to give you an idea) is that if you are using, say, polynomial order 3 or 4 and your form has some coefficients in it, then turning to BLAS may be a (big) win.
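[Editor's note: the two options above can be set programmatically. A minimal sketch, using a plain dict to stand in for Firedrake's parameters object so the snippet is self-contained; the option names are the ones quoted in this thread and may differ in later versions.]

```python
# Self-contained stand-in for Firedrake's `parameters` object
# (in Firedrake itself you would import `parameters` from firedrake;
# option names here are taken from this thread and may have changed).
parameters = {"coffee": {"licm": False, "ap": False}}

# The two key COFFEE optimizations mentioned above:
parameters["coffee"]["licm"] = True  # loop-invariant code motion
parameters["coffee"]["ap"] = True    # data alignment/padding (assumed expansion of "ap")

assert parameters["coffee"] == {"licm": True, "ap": True}
```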
I hope I'll be able to report more (convincing) results, and to be more precise, in the next few days.

Hope this helps

-- Fabio

2014-08-07 7:31 GMT+01:00 Tuomas Karna <tuomas.karna@gmail.com>:
Hi all,
I'm running my code on TACC Stampede and I have a couple of questions:
How can I set up Firedrake/PyOP2 for the target machine, for example to use the Intel compilers, the target architecture (Sandy Bridge) instruction set, or the Intel MKL libraries?
Until now I've only used MPI. What do I need to do to run with OpenMP or CUDA, for instance? I've got the PyOP2 dependencies in place, as they are listed on the Firedrake website.
Also, does anyone have experience on using Intel MIC coprocessors with Firedrake?
Cheers,
Tuomas
_______________________________________________ firedrake mailing list firedrake@imperial.ac.uk https://mailman.ic.ac.uk/mailman/listinfo/firedrake
Hi Fabio,

Thanks for the info, very useful.

On 08/07/2014 01:52 AM, Fabio Luporini wrote:
Hi Tuomas
Lawrence's comments are correct.
Regarding optimizing assembly through COFFEE (a tool integrated with PyOP2 that optimizes the evaluation of local element matrices/vectors), it is true that in general

parameters["coffee"]["licm"] = True
parameters["coffee"]["ap"] = True

are the key parameters. However, if your function spaces have relatively high polynomial order *or* your form has a lot of coefficients, then you may be interested in other optimizations as well. Please let me know if this is the case, and I'll try to be more precise.
I'm using low-order elements, so further optimization might not be relevant; I guess it depends on what you mean by "a lot of coefficients". I'm running a 2D shallow water model with ~all the terms, so there is some complexity there.
That said, our (short-term) goal is to let you (users) avoid playing with these sorts of "low-level parameters" entirely. We are currently working on an autotuning system that, once in place, should give you significantly better run-times without having to set/try individual optimizations manually. We are not far from that, so we'll keep you posted (it should take on the order of days).
Sounds great!
As for the MIC, we have some performance numbers for assembly "in isolation", but we (I) have never brought the whole toolchain onto the MIC - that is, we have only used it as an accelerator. In particular, the code is *not* specifically optimized for this kind of architecture, especially if you are running at low polynomial order (in practice, we are not taking full advantage of the large vector lanes).
As for MKL in assembly kernels, I'm running experiments these days to quantify how much faster we can go by transforming assembly code into a sequence of BLAS calls. The problem is that at low polynomial order the matrices involved are rather small. What I have seen so far (these are early experiments, and I'm being vague here just to give you an idea) is that if you are using, say, polynomial order 3 or 4 and your form has some coefficients in it, then turning to BLAS may be a (big) win. I hope I'll be able to report more (convincing) results, and to be more precise, in the next few days.

OK, that makes sense.
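[Editor's note: to make the "kind of small" point concrete, here is an illustrative sketch of local element matrix sizes for Lagrange elements on triangles. These are standard FEM degree-of-freedom counts, not numbers from Firedrake or from the experiments described above.]

```python
# A Lagrange element of order k on a triangle has (k+1)(k+2)/2 degrees
# of freedom, so the local element matrix is ndofs x ndofs. At low order
# these matrices are tiny, and the overhead of a BLAS call is hard to
# amortize; by order 3-4 they are large enough that BLAS may start to win.
def ndofs_triangle(k):
    return (k + 1) * (k + 2) // 2

for k in range(1, 5):
    n = ndofs_triangle(k)
    print(f"P{k}: local matrix {n}x{n}")
# P1: 3x3, P2: 6x6, P3: 10x10, P4: 15x15
```

A 3x3 matrix-matrix product is far too small to benefit from a tuned BLAS, which matches the observation above that the win only appears around polynomial order 3 or 4.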
Cheers, Tuomas
Hope this helps
-- Fabio
2014-08-07 7:31 GMT+01:00 Tuomas Karna <tuomas.karna@gmail.com <mailto:tuomas.karna@gmail.com>>:
Hi all,
I'm running my code on TACC Stampede and I have a couple of questions:
How can I set up Firedrake/PyOP2 for the target machine, for example to use the Intel compilers, the target architecture (Sandy Bridge) instruction set, or the Intel MKL libraries?
Until now I've only used MPI. What do I need to do to run with OpenMP or CUDA, for instance? I've got the PyOP2 dependencies in place, as they are listed on the Firedrake website.
Also, does anyone have experience on using Intel MIC coprocessors with Firedrake?
Cheers,
Tuomas
participants (2)
- Fabio Luporini
- Tuomas Karna