Dear all,

I need to run my Firedrake code in parallel, but some of the commands (like appending to an array) can't be executed in parallel. How can I force some lines to be executed by one core only?

Thanks in advance,
Floriane
Hi Floriane,

Can you provide a minimal example? Appending to an array isn't a Firedrake operation, so it sounds like the issue is more a question of how you need to parallelise the surrounding code, and that will depend on exactly what it is you are doing.

Regards,
David
Hi David,

This is a big piece of code where I solve nonlinear equations in a 2D domain with mixed systems involving vector functions.

I'd like to run the whole code in parallel except for some lines, such as saving txt files, to avoid overwriting them or saving several files. Is there a command to select a specific process? Andrew once mentioned using op2.MPI.comm.rank. Is that a solution? How can I use it?

The error when appending to the array is not such an issue, as I only need it for visualisation purposes. For instance, I want to save a 1D function h_1D

    mesh_1D = IntervalMesh(Nx, Lx)
    V_1D = FunctionSpace(mesh_1D, "CG", 1)
    h_1D = Function(V_1D)

on a 2D surface. So I define a 2D mesh as follows:

    mesh_2D = RectangleMesh(Nx, 1, Lx, 1.0, quadrilateral=True)
    V_2D = FunctionSpace(mesh_2D, "CG", 1)
    h_2D = Function(V_2D)

Then I save in Indx[i] the indices for which x_2D[i] = x_1D[i]:

    Indx = []
    for i in range(Nx+1):
        Indx.append([item for item in range(len(mesh_2D.coordinates.dat.data[:,0])) if mesh_2D.coordinates.dat.data[item,0] == mesh_1D.coordinates.dat.data[i]])

The idea is that I can then project h_1D onto the 2D mesh through h_2D:

    for i in range(Nx+1):
        h_2D.dat.data[Indx[i]] = h_1D.dat.data[i]

This works fine when I run the code on one core, but when running in parallel I get an index error:

    File "main.py", line 932, in <listcomp>
      Indx.append([item for item in range(len(mesh_2D.coordinates.dat.data[:,0])) if mesh_2D.coordinates.dat.data[item,0] == mesh_1D.coordinates.dat.data[j]])
    IndexError: index 125 is out of bounds for axis 0 with size 125

I can comment it out, but then I get other (Firedrake-related) errors that I don't get otherwise, for example:

    File "main.py", line 970, in <module>
      mesh_WM = CubeMesh(1,1,1,1.0)
    File "/home/ufaserv1_i/mmfg/firedrake/firedrake/src/firedrake/firedrake/utility_meshes.py", line 749, in CubeMesh
      comm=comm)
    File "/home/ufaserv1_i/mmfg/firedrake/firedrake/src/firedrake/firedrake/utility_meshes.py", line 725, in BoxMesh
      return mesh.Mesh(plex, reorder=reorder, distribute=distribute)
    File "<decorator-gen-259>", line 2, in Mesh
    File "/home/ufaserv1_i/mmfg/firedrake/firedrake/src/PyOP2/pyop2/profiling.py", line 60, in wrapper
      return f(*args, **kwargs)
    File "/home/ufaserv1_i/mmfg/firedrake/firedrake/src/firedrake/firedrake/mesh.py", line 1249, in Mesh
      topology = MeshTopology(plex, name=name, reorder=reorder, distribute=distribute)
    File "<decorator-gen-257>", line 2, in __init__
    File "/home/ufaserv1_i/mmfg/firedrake/firedrake/src/PyOP2/pyop2/profiling.py", line 60, in wrapper
      return f(*args, **kwargs)
    File "/home/ufaserv1_i/mmfg/firedrake/firedrake/src/firedrake/firedrake/mesh.py", line 377, in __init__
      raise RuntimeError("Mesh must have at least one cell on every process")
    RuntimeError: Mesh must have at least one cell on every process

I don't understand why running in parallel causes errors for the mesh definition?

Thank you,
Floriane
Hi Floriane,

On 05/09/17 12:43, Floriane Gidel [RPG] wrote:
Hi David,
This is a big piece of code where I solve nonlinear equations in a 2D domain with mixed systems involving vector functions.
I'd like to run the whole code in parallel except for some lines, such as saving txt files, to avoid overwriting them or saving several files. Is there a command to select a specific process? Andrew once mentioned using op2.MPI.comm.rank. Is that a solution? How can I use it?
The terminology that MPI uses is that you have a *communicator* comprised of some number of processes, each of which is indexed by its *rank*.

All Firedrake objects should have the communicator they are defined on accessible as the .comm attribute (if you find one that does not, please report a bug). When you build a mesh and don't provide a communicator, it is built on the default COMM_WORLD communicator. Objects built on top of the mesh inherit the mesh's communicator.

So if you have a mesh and only want to print something on rank 0, you can write:

    if mesh.comm.rank == 0:
        print("I am on rank 0")

An important thing to note: most operations that you do in Firedrake are *collective* over the communicator of the participating objects. That means that all ranks need to participate, otherwise you will get errors or your code might hang. So, for example, you can't do this:

    mesh = Mesh(...)
    V = FunctionSpace(mesh, ...)
    v = TestFunction(V)
    f = assemble(v*dx)

    if f.comm.rank == 0:
        # This line collectively accesses the data in f,
        # it therefore will hang.
        print(f.dat.data)

You can do:

    # Everyone accesses data, rank 0 prints something
    fdata = f.dat.data
    if f.comm.rank == 0:
        print(fdata)
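[Editor's note: the following sketch is not from the thread; it just illustrates the "write a txt file from one core only" case mentioned above. The mesh, the field and the file name are made up for illustration. The data access is done collectively on every rank, the per-rank arrays are gathered onto rank 0, and only rank 0 writes the file. Note that f.dat.data_ro holds only the rank-local portion of the dofs, so the gathered array is in per-rank order rather than any global numbering.]

    from firedrake import *
    import numpy as np

    mesh = UnitSquareMesh(8, 8)                      # hypothetical example mesh
    V = FunctionSpace(mesh, "CG", 1)
    f = Function(V).interpolate(SpatialCoordinate(mesh)[0])

    # Collective: every rank reads its local portion of the data.
    local_data = f.dat.data_ro

    # Gather the per-rank arrays onto rank 0, then only rank 0 writes.
    gathered = mesh.comm.gather(local_data, root=0)
    if mesh.comm.rank == 0:
        # Ordering is per-rank, not the global dof numbering.
        np.savetxt("h_values.txt", np.concatenate(gathered))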
The error when appending to the array is not such an issue, as I only need it for visualisation purposes.
For instance, I want to save a 1D function h_1D
mesh_1D = IntervalMesh(Nx,Lx)
V_1D = FunctionSpace(mesh_1D, "CG", 1)
h_1D = Function(V_1D)
on a 2D surface. So I define a 2D mesh as follows
mesh_2D = RectangleMesh(Nx,1,Lx,1.0,quadrilateral=True)
V_2D = FunctionSpace(mesh_2D, "CG", 1)
h_2D = Function(V_2D)
Then, I save in Indx[i] the indices for which x_2D[i] = x_1D[i]:
Indx = []
for i in range(Nx+1):
    Indx.append([item for item in range(len(mesh_2D.coordinates.dat.data[:,0])) if mesh_2D.coordinates.dat.data[item,0] == mesh_1D.coordinates.dat.data[i]])
The idea is that I can then project h_1D to the 2D mesh through h_2D:
for i in range(Nx+1):
    h_2D.dat.data[Indx[i]] = h_1D.dat.data[i]
This works fine when I run the code on one core, but when running in parallel I get an index error:
File "main.py", line 932, in <listcomp>
Indx.append([item for item in range(len(mesh_2D.coordinates.dat.data[:,0])) if mesh_2D.coordinates.dat.data[item,0] == mesh_1D.coordinates.dat.data[j]])
IndexError: index 125 is out of bounds for axis 0 with size 125
Both the 2D mesh and the 1D mesh are distributed amongst the MPI processes, so you have no guarantee that there are Nx+1 entries in the 1D coordinate array. You probably want:

    for i in range(len(mesh_1D.coordinates.dat.data)):
        Indx.append(...)

The same is true of writing. Once you have made the Indx array, I would just write:

    h_2D.dat.data[Indx, 0] = h_1D.dat.data_ro[:]
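[Editor's note: the sketch below is not from the thread; it just spells out a rank-local version of the index-building loop using the variable names from the messages above (with hypothetical values of Nx and Lx). It assumes that any 2D node sharing an x-coordinate with a 1D node happens to live on the same rank as that 1D node, which is not guaranteed in general since the two meshes are partitioned independently; for a post-processing step like this it may be simpler to build both meshes on COMM_SELF. It also uses np.isclose rather than exact floating-point comparison.]

    import numpy as np
    from firedrake import *

    Nx, Lx = 100, 1.0                                 # hypothetical sizes
    mesh_1D = IntervalMesh(Nx, Lx)
    V_1D = FunctionSpace(mesh_1D, "CG", 1)
    h_1D = Function(V_1D)

    mesh_2D = RectangleMesh(Nx, 1, Lx, 1.0, quadrilateral=True)
    V_2D = FunctionSpace(mesh_2D, "CG", 1)
    h_2D = Function(V_2D)

    # Flatten in case the 1D coordinates are stored with shape (n, 1).
    x_1D = mesh_1D.coordinates.dat.data_ro.reshape(-1)   # rank-local 1D node coordinates
    x_2D = mesh_2D.coordinates.dat.data_ro[:, 0]         # rank-local 2D x-coordinates

    # Loop over the entries this rank actually owns, not over range(Nx+1).
    Indx = []
    for i in range(len(x_1D)):
        Indx.append(np.where(np.isclose(x_2D, x_1D[i]))[0])

    # Copy the 1D values onto the matching 2D nodes (rank-local indices).
    for i in range(len(x_1D)):
        h_2D.dat.data[Indx[i]] = h_1D.dat.data_ro[i]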
I can comment it out, but then I get other (Firedrake-related) errors that I don't get otherwise, for example:
File "main.py", line 970, in <module>
mesh_WM = CubeMesh(1,1,1,1.0)
...
raise RuntimeError("Mesh must have at least one cell on every process")
RuntimeError: Mesh must have at least one cell on every process
I don't understand why running in parallel causes errors for the mesh definition?
You made a 1x1x1 cube mesh, which has 6 cells. If you tried to run on more than 6 processes (mpiexec -n 7, for example), then you would get this error, because every process must own at least one cell. Possibly even with fewer than 6 processes you might get this error, depending on the exact mesh decomposition.

Cheers,
Lawrence
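[Editor's note: one possible workaround, not suggested in the thread. If the tiny cube mesh is only used for rank-local bookkeeping, it can be built on COMM_SELF so that every rank gets its own serial copy and no distribution is attempted; the comm keyword is the same one visible in the traceback above. This is only a sketch, so check that a per-rank copy is really what the surrounding code needs. Otherwise, either refine the mesh or run on no more processes than there are cells.]

    from firedrake import *
    from mpi4py import MPI

    # Build the 6-cell cube on COMM_SELF: each rank gets its own serial copy,
    # so no rank is left without cells.
    mesh_WM = CubeMesh(1, 1, 1, 1.0, comm=MPI.COMM_SELF)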
Hi Lawrence,

Thank you so much for this very complete answer. That helps a lot! I'll try what you suggested and come back to you if necessary, but it is much clearer now.

Cheers,
Floriane
participants (3)
- Floriane Gidel [RPG]
- Ham, David A
- Lawrence Mitchell