Memory usage and computation time in the Taylor-Green problem
Hi all,

After obtaining exciting error vs. computational cost results from relatively heavy 2D computations, I have been setting up the 3D Taylor-Green vortex (TGV) problem using the Nektar++ incompressible solver. The Reynolds number is 1600, the same as in the TGV tutorial from the Nektar++ team. A 3D grid of 64^3 elements with NumModes=5 was used to obtain a 256^3 simulation (close to a DNS). I ran this problem on the Midlands+ Tier 2 machine using 4 nodes (112 cores, 512 GB RAM in total): http://www.hpc-midlands-plus.ac.uk/about/system-description/

The memory consumption during the run was ~450 GB and it took ~330 wall-clock minutes for 1000 time steps (with dt=1e-4). For reference, I ran a 256^3 TGV simulation of the same case in OpenFOAM using the same resources: the memory consumption was ~40 GB and it took ~20 minutes for 1000 time steps.

So for this case I observe high resource consumption, both in memory and in computation time. Based on my experience with Nektar++ on 2D simulations I was not expecting this, at least as far as computation time is concerned. Is this normal? Besides my set-up file, I also suspect my installation of Nektar++ on the aforementioned cluster. Could anyone please try running a few hundred time steps on their machine using the attached files (nektar_user_tgv.tar.gz)? FYI, I am partitioning on the fly. The calculation seems to scale well to 8 nodes in terms of wall-clock time, while memory consumption increases by ~10%.

Any input is highly appreciated.

Best regards,
Vishal

---
Vishal SAINI
Master of Research, University of Cambridge
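A quick back-of-the-envelope check of the resolution quoted above, using the numbers from the message (this assumes a tensor-product expansion, so NumModes=5 corresponds to polynomial order 4, i.e. four unique points per element per direction on a periodic mesh):

"""
# Sketch only: values taken from the message above.
elements_per_dir=64
num_modes=5                                                # polynomial order 4
points_per_dir=$(( elements_per_dir * (num_modes - 1) ))   # 64 * 4 = 256
dof_per_field=$(( points_per_dir ** 3 ))                   # 256^3 = 16,777,216
echo "effective resolution: ${points_per_dir}^3, ~${dof_per_field} DOF per field"
"""

This is consistent with the roughly 17M degrees of freedom per field mentioned later in the thread.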
Dear Vishal,

Sorry for not getting to your comments earlier. Can I first confirm that you have indeed compiled the code in parallel? What you describe sounds as if the code may have been compiled in serial and is running 112 copies of the case. To turn on parallelisation you need to enable NEKTAR_USE_MPI in the cmake step.

Best regards,
Spencer

Spencer Sherwin FREng, FRAeS
Head, Aerodynamics; Professor of Computational Fluid Mechanics
Department of Aeronautics, Imperial College London
South Kensington Campus, London, SW7 2AZ, UK
s.sherwin@imperial.ac.uk | +44 (0)20 7594 5052
http://www.imperial.ac.uk/people/s.sherwin/
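A minimal sketch of the configure step referred to above. NEKTAR_USE_MPI is the CMake option named in the reply; the directory layout and the install step are assumptions about a typical out-of-source build, not the poster's actual setup:

"""
cd nektar++/build                    # assumed out-of-source build directory
cmake -DNEKTAR_USE_MPI=ON ..         # enable the MPI option mentioned above
make -j 8 install
grep NEKTAR_USE_MPI CMakeCache.txt   # the cache should show NEKTAR_USE_MPI:BOOL=ON
"""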
Dear Spencer,

Thank you for your reply. Yes, I have compiled Nektar++ with MPI (although I get a warning on the cluster, "WARN Conflicting CPU frequencies detected, using: 2900.06"; the admin claims it is harmless and they are working on it). For reference, I am attaching the output log in compact and verbose mode (log.dat and log_verbose.dat), where the code does appear to partition the mesh.

The command run was:

"""
JobID: 124250
======
Time: Thu 15 Mar 12:09:00 GMT 2018
Running on master node: node0367
Current directory: /gpfs/home/lboro/ttvs3/work_Saini/cases_Nektar/tgv/Re1600_LES/N5/256cube
numtasks=112, numnodes=4, mpi_tasks_per_node=28 (OMP_NUM_THREADS=1)
Executing command:
==================
time mpirun -npernode 28 -np 112 IncNavierStokesSolver -v 2pi_domain_64cubeEle_mesh_copyPaste.xml tgv_conditions_lowTolerance.xml > log_124250
"""

It seems unlikely that 112 serial instances are running side by side: the case has 256^3 ≈ 17M degrees of freedom, and a full serial copy of it would most probably be larger than the 450 GB / 112 ≈ 4 GB observed per process.

Hope this is helpful.

Regards,
Vishal
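One way to sanity-check from the cluster side that the installed binary really is the parallel build (standard tools only; the exact library name reported by ldd depends on the MPI implementation in use):

"""
# An MPI-enabled binary normally links an MPI library (libmpi, libmpich, ...):
ldd $(which IncNavierStokesSolver) | grep -i mpi

# A short 2-rank run should then report a partitioned mesh in its verbose
# output rather than behaving like two identical serial runs:
mpirun -np 2 IncNavierStokesSolver -v 2pi_domain_64cubeEle_mesh_copyPaste.xml tgv_conditions_lowTolerance.xml
"""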
Hi Vishal,

Dave Moxey emailed me earlier to say he had tried it too, and there is some rather obvious memory mismanagement we need to sort out for the hex meshes. While we do this, have you tried running this case with the 2.5D solver as shown in the tutorial? It has similar performance to the 2D solver.

I will also have a look at Dave's suggestions and get back to you.

Cheers,
Spencer
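For context, the "2.5D solver" refers to the quasi-3D formulation used in the Nektar++ TGV tutorial: a 2D spectral/hp expansion in the plane combined with a Fourier expansion in the third, homogeneous direction. A minimal sketch of the session-file entries involved, assuming the usual HomModesZ/LZ parameters for the Fourier direction (the values below are placeholders, not those of the tutorial files):

"""
# Illustrative fragment of the CONDITIONS section; it would be merged into the
# existing session file rather than used stand-alone.
cat <<'EOF' > homogeneous_1d_fragment.xml
<SOLVERINFO>
    <I PROPERTY="HOMOGENEOUS" VALUE="1D" />
</SOLVERINFO>
<PARAMETERS>
    <P> HomModesZ = 64   </P>  <!-- Fourier modes in the homogeneous z-direction -->
    <P> LZ        = 2*PI </P>  <!-- periodic length of the domain in z -->
</PARAMETERS>
EOF
"""

This formulation only applies when the geometry is uniform in one direction, which is why the comment below about truly 3D geometries is relevant.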
Hi again Spencer,

I'm glad the issue has been identified. Regarding the case, I deliberately chose a truly 3D mesh because the ultimate aim of my project involves flow geometries (gas-turbine-combustor-like) that may not be representable with homogeneous modes. I'm looking forward to updates on this, as it will help me a lot.

Regards,
Vishal
participants (2)
- Sherwin, Spencer J
- Vishal Saini