Optimize solver settings for a half-billion-DoF mesh
Hello all,

I am running a 3DH1D turbulent channel flow at Re_tau = 550 (MKM590, larger domain) with a resolution of ~0.5 billion DoF; the streamwise direction is the homogeneous direction, treated with an FFT. Our goal is to find the solver settings that give the fastest time per step on a Cray XC40 machine. We have tried plenty of options, e.g. the direct solver and the iterative solver with the 1D FFT and different preconditioners, but our CPU time per time step is ~5 seconds, which is quite slow in our experience.

- We compared the "DirectMultiLevelStaticCond" and "DirectStaticCond" solvers and found that the first one is slightly faster than the second. Because the direct solver only allows a 1D decomposition in the homogeneous direction (--npz), we cannot use a large number of processors (e.g. at most 480 processors for 960 HomModesZ). Does a 2D decomposition exist for the direct solver? (See the run-command sketch after this message.)
- If we use FFTW together with the iterative solver, can we expect better performance than with the direct solver? What would be the optimal solver/preconditioner combination? Would the PETSc options help?
- In our experience with a full 3D DNS case (50 million mesh), around 0.2 s per time step can be expected, although this requires a large number of cores. Can we achieve something similar with 3DH1D cases?

Part of the xml file:

<EXPANSIONS>
    <E COMPOSITE="C[0]" NUMMODES="7" FIELDS="u,v,w" TYPE="MODIFIED" />
    <E COMPOSITE="C[0]" NUMMODES="7" FIELDS="p" TYPE="MODIFIED" />
</EXPANSIONS>

<CONDITIONS>
    <SOLVERINFO>
        <I PROPERTY="SolverType"                 VALUE="VelocityCorrectionScheme"/>
        <I PROPERTY="EQTYPE"                     VALUE="UnsteadyNavierStokes"/>
        <I PROPERTY="AdvectionForm"              VALUE="Convective"/>
        <I PROPERTY="Projection"                 VALUE="Galerkin"/>
        <I PROPERTY="TimeIntegrationMethod"      VALUE="IMEXOrder2"/>
        <I PROPERTY="HOMOGENEOUS"                VALUE="1D"/>
        <I PROPERTY="DEALIASING"                 VALUE="ON"/>
        <I PROPERTY="USEFFT"                     VALUE="FFTW"/>
        <I PROPERTY="SpectralVanishingViscosity" VALUE="True"/>
        <I PROPERTY="SPECTRALHPDEALIASING"       VALUE="True"/>
        <I PROPERTY="GlobalSysSoln"              VALUE="DirectMultiLevelStaticCond"/>
    </SOLVERINFO>

    <PARAMETERS>
        <P> TimeStep       = 10e-05             </P>
        <P> FinalTime      = 10.0               </P>
        <P> NumSteps       = FinalTime/TimeStep </P>
        <P> NumSteps       = 1000000            </P>
        <P> IO_CheckSteps  = 2000               </P>
        <P> IO_InfoSteps   = 5                  </P>
        <P> IO_CFLSteps    = IO_InfoSteps       </P>
        <P> Re             = 10000              </P>
        <P> Kinvis         = 1.0/544.0          </P>
        <P> HomModesZ      = 960                </P>
        <P> LZ             = 8.0*PI             </P>
        <P> SVVCutoffRatio = 0.7                </P>
        <P> SVVDiffCoeff   = 0.1                </P>
    </PARAMETERS>

Thank you!
Sandeep
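[For context, a minimal sketch of how the two decompositions discussed above would be requested at run time, assuming the standard IncNavierStokesSolver executable; the launcher (aprun on an XC40, mpirun elsewhere) and the session file name chan550.xml are placeholders, and the per-plane-pair limit of 480 ranks is taken from the message above.]

    # 1D decomposition: planes distributed in z only; with a serial-per-plane
    # direct solver this caps the run at HomModesZ/2 = 480 ranks
    aprun -n 480 IncNavierStokesSolver chan550.xml --npz 480

    # Mixed decomposition: 480 z-partitions times 4 partitions of the 2D plane
    # mesh = 1920 ranks; the in-plane systems must then be solved in parallel
    # (iterative or Xxt-type solvers)
    aprun -n 1920 IncNavierStokesSolver chan550.xml --npz 480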
Dear Sandeep,

I think you should be using a parallel direct solver over each partition, which I believe requires using XxtStaticCond (I do not recall whether you can use XxtMultiLevelStaticCond). You may only require the direct solve on the pressure variable, where you could also likely reduce the pressure expansion by one mode. You could then solve the velocity with an IterativeStaticCond solver, possibly using a block-diagonal preconditioner.

@Hui, Andrea: could you help Sandeep set this option up?

Cheers,
Spencer.
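[As a rough illustration of the per-field setup described above, a minimal sketch of a GLOBALSYSSOLNINFO block (placed inside <CONDITIONS>) that would replace the single GlobalSysSoln entry in SOLVERINFO; the preconditioner name "Block" and the iterative tolerance are assumptions to be checked against the Nektar++ user guide.]

<GLOBALSYSSOLNINFO>
    <!-- Parallel direct solve for the pressure Poisson system -->
    <V VAR="p">
        <I PROPERTY="GlobalSysSoln"            VALUE="XxtStaticCond" />
    </V>
    <!-- Iterative solve for the velocity Helmholtz systems -->
    <V VAR="u,v,w">
        <I PROPERTY="GlobalSysSoln"            VALUE="IterativeStaticCond" />
        <I PROPERTY="Preconditioner"           VALUE="Block" />
        <I PROPERTY="IterativeSolverTolerance" VALUE="1e-8" />
    </V>
</GLOBALSYSSOLNINFO>

Reducing the pressure by one mode, as suggested, would correspond to setting NUMMODES="6" for the p field in the <EXPANSIONS> block.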
Spencer Sherwin FREng, FRAeS
Head, Aerodynamics, Director of Research Computing Service,
Professor of Computational Fluid Mechanics,
Department of Aeronautics, s.sherwin@imperial.ac.uk
South Kensington Campus,    Phone: +44 (0)20 7594 5052
Imperial College London,    Fax:   +44 (0)20 7594 1974
London, SW7 2AZ, UK
http://www.imperial.ac.uk/people/s.sherwin/