Installing Nektar++ 5.2.0 on a cluster
Hi Parv,

Thank you for your kind response. I finally managed to install ver. 5.3 using the Intel compiler and OpenMPI 4.0.3. However, I am still getting MPI-related issues when running a medium-sized mesh with 760K cells (a smaller mesh runs successfully in parallel):

=======================================================================
        EquationType: UnsteadyNavierStokes
        Session Name: clusteredToGmsh
        Spatial Dim.: 3
  Max SEM Exp. Order: 4
      Num. Processes: 32
      Expansion Dim.: 3
     Projection Type: Continuous Galerkin
 Advect. advancement: explicit
Diffuse. advancement: implicit
           Time Step: 0.0001
        No. of Steps: 500000
 Checkpoints (steps): 10000
    Integration Type: IMEX
    Splitting Scheme: Velocity correction (strong press. form)
=======================================================================
Initial Conditions:
  - Field u: 1.0
  - Field v: 0.0
  - Field w: 0.0
  - Field p: 0.0
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned a non-zero exit code.
Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 0 on node cra1080 exited on signal 9 (Killed).
I suspected that it might be a problem with the Intel compiler, so I switched to GCC 9.3 for a fresh installation. But it seems that GCC is causing problems too, and I get the following error at some point:

../../SpatialDomains/libSpatialDomains.so.5.3.0: error: undefined reference to '__intel_sse2_strcpy'
../../SpatialDomains/libSpatialDomains.so.5.3.0: error: undefined reference to '_intel_fast_memmove'
../../SpatialDomains/libSpatialDomains.so.5.3.0: error: undefined reference to '__intel_sse2_strlen'
../../SpatialDomains/libSpatialDomains.so.5.3.0: error: undefined reference to '_intel_fast_memcpy'
../../SpatialDomains/libSpatialDomains.so.5.3.0: error: undefined reference to '_intel_fast_memset'
collect2: error: ld returned 1 exit status
make[2]: *** [library/Demos/SpatialDomains/CMakeFiles/PartitionAnalyse.dir/build.make:127: library/Demos/SpatialDomains/PartitionAnalyse] Error 1
make[1]: *** [CMakeFiles/Makefile2:3681: library/Demos/SpatialDomains/CMakeFiles/PartitionAnalyse.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
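The __intel_* symbols in a GCC link usually come from object files or ThirdParty libraries left over from an earlier Intel build. A minimal sketch of a clean rebuild from an empty build directory with the compilers pinned explicitly; the module names and source path below are placeholders, not the cluster's actual ones:

    # Start from a clean environment and an empty build directory so that
    # nothing compiled with icc/icpc is reused.
    module purge
    module load gcc/9.3.0 openmpi/4.0.3 cmake   # placeholder module names
    cd nektar-v5.3.0                            # placeholder source path
    rm -rf build && mkdir build && cd build
    cmake -DCMAKE_C_COMPILER=$(which gcc) \
          -DCMAKE_CXX_COMPILER=$(which g++) \
          -DNEKTAR_USE_MPI=ON \
          ..
    make -j 8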
I had discussed the mesh problem with James Slaughter before this, and he suggested that it might be due to a bug in the older version (5.0.3) I was working with. That is why I decided to move to the most recent version. In any case, I am still struggling with my parallel simulations!

Kind regards
syavash

On Thu, Mar 9, 2023 at 4:57 PM Khurana, Parv <p.khurana22@imperial.ac.uk> wrote:
Hi Syavash,
A few questions come to mind on seeing this:
1. What compilers are you using (GCC or Intel)?
2. Are you loading the modules that are compatible with the compiler you are using?
3. Do you have a version of OpenBLAS or MKL already loaded as one of the modules on your cluster?
As is often the case, the problem may be more involved than it first appears, so it would be great to see the modules and cmake commands you are using for your installation in order to debug this properly. Happy to hop on a call if needed!
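One way to gather what Parv asks for, sketched here with the build directory name assumed to be "build":

    # Record the loaded modules ("module list" prints to stderr on many systems).
    module list 2>&1 | tee modules.txt

    # From the build directory, dump the compiler, BLAS/LAPACK and MPI
    # settings that CMake actually cached during configuration.
    cd build
    grep -E 'COMPILER|BLAS|LAPACK|MPI' CMakeCache.txt | tee cmake-settings.txt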
Best
Parv
From: nektar-users-bounces@imperial.ac.uk <nektar-users-bounces@imperial.ac.uk> On Behalf Of Ehsan Asgari
Sent: 09 March 2023 09:54
To: nektar-users <nektar-users@imperial.ac.uk>
Subject: [Nektar-users] Installing Nektar++ 5.2.0 on a cluster
Hi Everyone,
I am trying to install the latest version of Nektar on a cluster. However, I get the following error at some point:
../../library/SpatialDomains/libSpatialDomains.so.5.3.0: error: undefined reference to '__intel_sse2_strlen'
../../library/SpatialDomains/libSpatialDomains.so.5.3.0: error: undefined reference to '_intel_fast_memcpy'
../../library/SpatialDomains/libSpatialDomains.so.5.3.0: error: undefined reference to '_intel_fast_memset'
collect2: error: ld returned 1 exit status
make[2]: *** [utilities/NekMesh/CMakeFiles/NekMesh.dir/build.make:106: utilities/NekMesh/NekMesh] Error 1
make[1]: *** [CMakeFiles/Makefile2:1647: utilities/NekMesh/CMakeFiles/NekMesh.dir/all] Error 2
make: *** [Makefile:141: all] Error 2
I had "NEKTAR_USE_SYSTEM_BLAS_LAPACK:BOOL=ON " and "THIRDPARTY_BUILD_BLAS_LAPACK:BOOL=ON" in the ccmake as per suggested in the user archives.
I appreciate your kind help.
Kind regards
syavash
Hi Syavash,

Can you run ldd on the IncNavierStokesSolver executable, put "module list" as an extra line in your submission script before the solver execution, run again, and send through the output? It looks like you have compiled at least SpatialDomains with an Intel-specific instruction set, and these symbols cannot be found at runtime.

Kind regards,
James

From: nektar-users-bounces@imperial.ac.uk <nektar-users-bounces@imperial.ac.uk> On Behalf Of Ehsan Asgari
Sent: Saturday, March 11, 2023 8:32 AM
To: nektar-users <nektar-users@imperial.ac.uk>
Subject: [Nektar-users] Installing Nektar++ 5.2.0 on a cluster
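A minimal sketch of the check James asks for, placed just before the solver launch in the submission script; the scheduler directives, process count and file names are placeholders for whatever the cluster and case actually use:

    #!/bin/bash
    #SBATCH --nodes=2          # placeholder scheduler header -- adapt as needed
    #SBATCH --ntasks=32

    # Record which modules the job actually sees and which shared libraries
    # the solver resolves at runtime (assumes the solver is on PATH).
    module list
    ldd $(which IncNavierStokesSolver)

    # Then launch the solver as before.
    mpirun -np 32 IncNavierStokesSolver clusteredToGmsh.xml   # placeholder session file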