Re: [firedrake] Issues with LMA of vertical derivative on extruded meshes

Fabio Luporini

5 Jan 2015 5 Jan '15

12:37 p.m.

I recently (two-three weeks ago) fixed a bug concerning padding, packing and unpacking. Are you sure coffee is up-to-date? If so, I'll have a look at this in the next few days (will be back in London tomorrow) and be back to you once fixed. Thanks! -- Fabio 2015-01-05 12:45 GMT+01:00 Mitchell, Lawrence < lawrence.mitchell@imperial.ac.uk>:

...

On 22 Dec 2014, at 14:15, Eike Mueller <eike.h.mueller@googlemail.com> wrote:

...
Dear firedrakers,

I came across a problem when doing a local matrix assembly of the weak derivative matrix from a DG space (W3) to a vertical-only HDiv space (W2_vert), i.e.:

order_vertical = 0 U2 = FiniteElement('DG',interval,0) V0 = FiniteElement('CG',interval,1) V1 = FiniteElement('DG',interval,order_vertical)

W2_vert = FunctionSpace(mesh,HDiv(OuterProductElement(U2,V0))) W3 = FunctionSpace(mesh,OuterProductElement(U2,V1))

phi = TestFunction(W3) psi = TrialFunction(W3) u = TestFunction(W2_vert) w = TrialFunction(W2_vert)

form_D = phi*div(w)*dx form_DT = div(u)*psi*dx

I work with an extruded mesh with one vertical layer, for details, see the attached code which exposes the issue.

When I do a local matrix assembly for form_D and form_DT, I get the following entries of the LMA's (the grid has three cells, and I attach a 2x1 or 1x2 matrix representing the LMA at each cell, hence the three entries in the arrays below).

*** D: W2_vert -> W3 *** [[-1. 1.] [-1. 1.] [ 1. -1.]] *** DT: W3 -> W2_vert *** [[-1. -1.] [-1. -1.] [ 1. 1.]]

i.e. it looks fine for D, but the result is clearly wrong for DT, which should be the transpose of D. It does not even contain the same number of +1's and -1's. I can't really see what I'm doing wrong here. Furthermore, it does seem to work if I change set order_vertical from 0 to 1.

I have to admit that I copied the code in build_lma() method from Lawrence, without trying to understand it in detail, but I thought it was sufficiently generic to deal with any UFL forms?

Thanks a lot for any help, I'm lost as to what is going wrong here,

This is a side-effect of switching coffee optimisations on by default. The code to build the LMA matrix is slightly hacky. Basically it takes the kernel used to generate a local element tensor a matrix, and relies on being able to type pun to insert into a Dat of appropriate size.

When running with coffee optimisations on, the matrix assembly kernel is modified to pad the local element tensor to a multiple of the vector length (for aligned accesses).

But the code generation doesn't then do the right thing for inserting into the Dat (there needs to be a pack-unpack stage).

You can get what plausibly looks correct by turning coffee optimisations off (at least the number of 1s and -1s is correct). I note that there's no way to do this on a per assemble call.

It happens to work for order_vertical=1 because in that case the element tensor is /already/ a multiple of the vector length in size and hence does not get padded.

The right way to do this is probably to have an assemble LMA operator I suppose.

Lawrence

Attachments:

attachment.html (text/html — 3.9 KB)

Show replies by date

Lawrence Mitchell

5 Jan 5 Jan

12:40 p.m.

New subject: Issues with LMA of vertical derivative on extruded meshes

On 5 Jan 2015, at 12:37, Fabio Luporini <f.luporini12@imperial.ac.uk> wrote:

...

I recently (two-three weeks ago) fixed a bug concerning padding, packing and unpacking. Are you sure coffee is up-to-date? If so, I'll have a look at this in the next few days (will be back in London tomorrow) and be back to you once fixed.

Thanks!

I do not think that this is the problem. Note that we're using a kernel to assemble a /matrix/ to write into a /Dat/ that happens to have the right number of entries. It's a bit of an abuse of the pyop2 code gen, but it works for non-padded cases. Lawrence

Eike Mueller

1:02 p.m.

New subject: Issues with LMA of vertical derivative on extruded meshes

Hi Lawrence and Fabio, It works if disable the COFFEE optimisation, as suggested by Andrew on 22/12/2014. I posted this to the list (as a reply to Andrew’s email), but then I later realised that my email reporting the issue (even though I sent it *before* Andrew alerted us about the issue with COFFEE optimisation), arrived on the list *after* my reply to Andrew’s email. So at the moment I do indeed use parameters["coffee"]["O2"] = False and then the problem goes away. Apologies for the confusion! There must be a space-time anomaly causing causality violations somewhere between here and London... Thanks, Eike PS: Other emails I sent to the list also got delayed, sometimes by several hours, so that’s odd . -- Dr Eike Hermann Mueller Research Associate (PostDoc) Department of Mathematical Sciences University of Bath Bath BA2 7AY, United Kingdom +44 1225 38 5803 e.mueller@bath.ac.uk http://people.bath.ac.uk/em459/

...

On 5 Jan 2015, at 13:40, Lawrence Mitchell <lawrence.mitchell@imperial.ac.uk> wrote:

On 5 Jan 2015, at 12:37, Fabio Luporini <f.luporini12@imperial.ac.uk> wrote:

...
I recently (two-three weeks ago) fixed a bug concerning padding, packing and unpacking. Are you sure coffee is up-to-date? If so, I'll have a look at this in the next few days (will be back in London tomorrow) and be back to you once fixed.

Thanks!

I do not think that this is the problem. Note that we're using a kernel to assemble a /matrix/ to write into a /Dat/ that happens to have the right number of entries. It's a bit of an abuse of the pyop2 code gen, but it works for non-padded cases.

Lawrence _______________________________________________ firedrake mailing list firedrake@imperial.ac.uk https://mailman.ic.ac.uk/mailman/listinfo/firedrake

Lawrence Mitchell

1:46 p.m.

New subject: Issues with LMA of vertical derivative on extruded meshes

On 5 Jan 2015, at 13:02, Eike Mueller <e.mueller@bath.ac.uk> wrote:

...

Hi Lawrence and Fabio,

It works if disable the COFFEE optimisation, as suggested by Andrew on 22/12/2014. I posted this to the list (as a reply to Andrew’s email), but then I later realised that my email reporting the issue (even though I sent it *before* Andrew alerted us about the issue with COFFEE optimisation), arrived on the list *after* my reply to Andrew’s email.

So at the moment I do indeed use

parameters["coffee"]["O2"] = False

and then the problem goes away.

Apologies for the confusion! There must be a space-time anomaly causing causality violations somewhere between here and London...

So it's not a bug, per se, in coffee, but rather, as I explained an issue with optimisations that changes the kernel signature. Effectively what happens is something like: originally the kernel signature is: void foo(double A[2][1], ...); after padding it's: void foo(double A[2][2], ...); in the wrapper code generator, if we're assembling into a matrix (and we know padding has been applied we do): double buf[2][1]; double pad_buf[2][2]; foo(pad_buf, ...); buf[0][0] = pad_buf[0][0]; buf[1][0] = pad_buf[1][0]; MatSetValues(..., buf, ...); But you're assembling into a Dat and so something like this happens: double buf[2]; foo(buf, ...); dat[0..1] = buf[0..1] So the buffer that's passed to the kernel is the wrong size and even if it were the right size, you're copying the wrong values out. It would have to beL double buf[4]; foo(buf, ...); dat[0] = buf[0]; dat[1] = buf[2];

...

Thanks,

Eike

PS: Other emails I sent to the list also got delayed, sometimes by several hours, so that’s odd .

Sometimes you send from your gmail address, which is not subscribed to the list and so those mails wait until one of the moderators lets them through. Lawrence

Eike Mueller

6 Jan 6 Jan

8:52 a.m.

New subject: Issues with LMA of vertical derivative on extruded meshes

Hi Lawrence, thanks for the explanation, so my understanding is that in general PyOP2 is not aware of the COFFEE optimisation, and the wrapper code that is generated in my PyOP2 parloop can’t handle the padding. Since I call a PyOP2 parloop directly, I don’t have any control over the wrapper code and at the moment it sounds like the only thing I can do is switch off the COFFEE optimisation (which, I suppose, has implications everwhere in the code, so I really shouldn’t do this). Isn’t then there a problem with all PyOP2 parloops, and are you going to do something about this? Cheers, Eike PS: I will only send from my bath uni account in the future, sorry about that.

...

On 5 Jan 2015, at 14:46, Lawrence Mitchell <lawrence.mitchell@imperial.ac.uk> wrote:

On 5 Jan 2015, at 13:02, Eike Mueller <e.mueller@bath.ac.uk <mailto:e.mueller@bath.ac.uk>> wrote:

...
Hi Lawrence and Fabio,

It works if disable the COFFEE optimisation, as suggested by Andrew on 22/12/2014. I posted this to the list (as a reply to Andrew’s email), but then I later realised that my email reporting the issue (even though I sent it *before* Andrew alerted us about the issue with COFFEE optimisation), arrived on the list *after* my reply to Andrew’s email.

So at the moment I do indeed use

parameters["coffee"]["O2"] = False

and then the problem goes away.

Apologies for the confusion! There must be a space-time anomaly causing causality violations somewhere between here and London...

So it's not a bug, per se, in coffee, but rather, as I explained an issue with optimisations that changes the kernel signature.

Effectively what happens is something like:

originally the kernel signature is:

void foo(double A[2][1], ...);

after padding it's:

void foo(double A[2][2], ...);

in the wrapper code generator, if we're assembling into a matrix (and we know padding has been applied we do):

double buf[2][1]; double pad_buf[2][2];

foo(pad_buf, ...);

buf[0][0] = pad_buf[0][0]; buf[1][0] = pad_buf[1][0];

MatSetValues(..., buf, ...);

But you're assembling into a Dat and so something like this happens:

double buf[2];

foo(buf, ...);

dat[0..1] = buf[0..1]

So the buffer that's passed to the kernel is the wrong size and even if it were the right size, you're copying the wrong values out. It would have to beL

double buf[4]; foo(buf, ...); dat[0] = buf[0]; dat[1] = buf[2];

...
Thanks,

Eike

PS: Other emails I sent to the list also got delayed, sometimes by several hours, so that’s odd .

Sometimes you send from your gmail address, which is not subscribed to the list and so those mails wait until one of the moderators lets them through.

Lawrence _______________________________________________ firedrake mailing list firedrake@imperial.ac.uk <mailto:firedrake@imperial.ac.uk> https://mailman.ic.ac.uk/mailman/listinfo/firedrake <https://mailman.ic.ac.uk/mailman/listinfo/firedrake>

Lawrence Mitchell

9:47 a.m.

New subject: Issues with LMA of vertical derivative on extruded meshes

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 06/01/15 08:52, Eike Mueller wrote:

...

Hi Lawrence,

thanks for the explanation, so my understanding is that in general PyOP2 is not aware of the COFFEE optimisation, and the wrapper code that is generated in my PyOP2 parloop can’t handle the padding.

Yes, that's right. You can follow this issue https://github.com/coneoproject/COFFEE/issues/14 to see if anything will change. I hope so, since it's not a very nice model.

...

Since I call a PyOP2 parloop directly, I don’t have any control over the wrapper code and at the moment it sounds like the only thing I can do is switch off the COFFEE optimisation (which, I suppose, has implications everwhere in the code, so I really shouldn’t do this).

You can do the following in build_lma: def build_lma(...): parameters["coffee"]["O2"] = False existing_code parameters["coffee"]["O2"] = True That way the optimisations are only switched off for that one section.

...

Isn’t then there a problem with all PyOP2 parloops, and are you going to do something about this?

So the problem is that you're doing something that isn't explicitly supported by the PyOP2 data model: using a Kernel designed for assembling into a Mat to assemble into a Dat. It happens to work in some cases. I think it is probably useful to properly support a "build_lma_matrix" style abstraction, but I haven't thought properly about how to do it right. Lawrence -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with undefined - http://www.enigmail.net/ iQEcBAEBAgAGBQJUq68hAAoJECOc1kQ8PEYvxYsH/RyIE4fr3WKapkstsmWckEZX MzGN+C2Zl+oYp4VYDDqb8WtjNoVDl0CzqyAeywLDIUOCktFYD/HFLIcE8q3eSVIA sAAo/Lt6g0rqBaR6cFyqLWSC4DIcAJAYdEKFfHsOxRxJY1sSzOWMhdl6i35FjMpg nTZtUMWZd0gXeLXr44FW2aaHbGlJzgjFrlrLx21kXpt1B+ZM5jl2kDJ9N7gU+l8b GUQsDh/jEuOYKP9lKmLgGktIHTEBh7BTPK2jv36iSqZ2jJ7OExAszAympGPkyTd1 bYcaFXwKQ8IWjbvYfNdQLvOYgbxtGoQuFg5V34q4WOkJ7yb+QRdlyl0Jj/Som4U= =CFSf -----END PGP SIGNATURE-----

Eike Mueller

6:43 p.m.

New subject: Issues with LMA of vertical derivative on extruded meshes

Hi Lawrence, thanks, I adapted my code to make sure it only disables the COFFEE optimisation in the LMA assembly (I also had to change it in one or two other places). At the moment the banded matrix construction is part of the setup phase, so I don’t really care if it is a bit slower. I just subscribed to the github issue, so will look out for updates. Cheers, Eike -- Dr Eike Hermann Mueller Research Associate (PostDoc) Department of Mathematical Sciences University of Bath Bath BA2 7AY, United Kingdom +44 1225 38 5803 e.mueller@bath.ac.uk http://people.bath.ac.uk/em459/

...

On 6 Jan 2015, at 10:47, Lawrence Mitchell <lawrence.mitchell@imperial.ac.uk> wrote:

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1

On 06/01/15 08:52, Eike Mueller wrote:

...
Hi Lawrence,

thanks for the explanation, so my understanding is that in general PyOP2 is not aware of the COFFEE optimisation, and the wrapper code that is generated in my PyOP2 parloop can’t handle the padding.

Yes, that's right. You can follow this issue https://github.com/coneoproject/COFFEE/issues/14 to see if anything will change. I hope so, since it's not a very nice model.

...
Since I call a PyOP2 parloop directly, I don’t have any control over the wrapper code and at the moment it sounds like the only thing I can do is switch off the COFFEE optimisation (which, I suppose, has implications everwhere in the code, so I really shouldn’t do this).

You can do the following in build_lma:

def build_lma(...): parameters["coffee"]["O2"] = False existing_code parameters["coffee"]["O2"] = True

That way the optimisations are only switched off for that one section.

...
Isn’t then there a problem with all PyOP2 parloops, and are you going to do something about this?

So the problem is that you're doing something that isn't explicitly supported by the PyOP2 data model: using a Kernel designed for assembling into a Mat to assemble into a Dat. It happens to work in some cases.

I think it is probably useful to properly support a "build_lma_matrix" style abstraction, but I haven't thought properly about how to do it right.

Lawrence -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (GNU/Linux) Comment: Using GnuPG with undefined - http://www.enigmail.net/

iQEcBAEBAgAGBQJUq68hAAoJECOc1kQ8PEYvxYsH/RyIE4fr3WKapkstsmWckEZX MzGN+C2Zl+oYp4VYDDqb8WtjNoVDl0CzqyAeywLDIUOCktFYD/HFLIcE8q3eSVIA sAAo/Lt6g0rqBaR6cFyqLWSC4DIcAJAYdEKFfHsOxRxJY1sSzOWMhdl6i35FjMpg nTZtUMWZd0gXeLXr44FW2aaHbGlJzgjFrlrLx21kXpt1B+ZM5jl2kDJ9N7gU+l8b GUQsDh/jEuOYKP9lKmLgGktIHTEBh7BTPK2jv36iSqZ2jJ7OExAszAympGPkyTd1 bYcaFXwKQ8IWjbvYfNdQLvOYgbxtGoQuFg5V34q4WOkJ7yb+QRdlyl0Jj/Som4U= =CFSf -----END PGP SIGNATURE-----

_______________________________________________ firedrake mailing list firedrake@imperial.ac.uk https://mailman.ic.ac.uk/mailman/listinfo/firedrake

3951

Age (days ago)

3952

Last active (days ago)

List overview

Download

6 comments

3 participants

participants (3)

Eike Mueller
Fabio Luporini
Lawrence Mitchell