Re: [Nektar-users] Parallel transposition not implemented yet for 3D-Homo-2D approach
Hi Douglas, thanks for the feedback. I was aware of --npz parallelization but was using a small number, not 1/2 or 1/4 of HomModesZ. Increasing npz really helped. I still have to try GlobalSysSoln.

Now I face a memory problem for another case: the simulation runs out of memory when starting from a checkpoint file. Here is some information about this case:
- The mesh is made of around 16000 quad elements with p=5, i.e., NUMMODES="6" TYPE="MODIFIED" in xy, and HomModesZ=1080 in the z direction.
- I'm trying to run this case on 60 compute nodes, each equipped with 24 processors and 105 GB of memory. In total that makes 1440 procs and 6300 GB of memory.
- Execution command: mpirun -np 1440 IncNavierStokesSolver --npz 360 config.xml

I was wondering whether the memory usage of the application is spread over the different cores during I/O, or whether only one core is used. If it is only one core, then I guess it crashes once it exceeds 105 GB. Would you have any suggestion/comment on this?

Thanks,
Asim

On 04/13/2016 12:12 AM, Serson, Douglas wrote:

Hi Asim,

Concerning your questions:

1- Are you using the command line argument --npz? This is very important for obtaining efficient parallel performance with the Fourier expansion, since it defines the number of partitions in the z-direction. If it is not set, only the xy plane will be partitioned and the parallelism will saturate quickly. I suggest initially setting npz to 1/2 or 1/4 of HomModesZ (note that nprocs must be a multiple of npz, since nprocs/npz is the number of partitions in the xy plane).

Also, depending on your particular case and the number of partitions you have in the xy plane, your simulation may benefit from using a direct solver for the linear systems. This can be activated by adding '-I GlobalSysSoln=XxtMultiLevelStaticCond' to the command line. This is usually more efficient for a small number of partitions, but considering the large size of your problem it might be worth trying it.

2- I am not sure what could be causing that. I suppose it would help if you could send the exact commands you are using to run FieldConvert.

Cheers,
Douglas
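For reference, a minimal launch sketch along the lines of Douglas's --npz and GlobalSysSoln suggestions above; the core counts mirror the case described at the top and everything else is a placeholder:

    #!/bin/bash
    # Sketch only: nprocs/npz is the number of xy-plane partitions,
    # so npz must divide nprocs exactly.
    nprocs=1440        # e.g. 60 nodes x 24 cores, as in the case above
    npz=360            # must divide nprocs (here 1440/360 = 4 xy-plane partitions)

    # Default (iterative) solver:
    mpirun -np $nprocs IncNavierStokesSolver --npz $npz config.xml

    # Optional: the direct solver Douglas mentions, via a SolverInfo override:
    # mpirun -np $nprocs IncNavierStokesSolver --npz $npz \
    #     -I GlobalSysSoln=XxtMultiLevelStaticCond config.xml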
From: Asim Onder <ceeao@nus.edu.sg>
Sent: 12 April 2016 06:42
To: Sherwin, Spencer J; Serson, Douglas
Cc: nektar-users; Moxey, David
Subject: Re: [Nektar-users] Parallel transposition not implemented yet for 3D-Homo-2D approach

Dear Spencer, Douglas, Nektar-users,

I'm now involved in testing a local petascale supercomputer, and for a quite limited time I can use several thousand processors for my DNS study. My test case is oscillating flow over a rippled bed. I built a dense unstructured grid with p=6 quadrilateral elements in x-y and Fourier expansions in the z direction. In total I have circa half a billion DOFs per variable. I have a few questions about this relatively large case:

1. I noticed that scaling becomes inefficient beyond around 500 procs, i.e. the parallel efficiency drops below 80%. I was wondering if you would have any general suggestions for tuning the configuration for better scaling.
2. Postprocessing vorticity and the Q criterion is not working for this case. At the end of the execution FieldConvert writes some small files without the field data. What could be the reason for this?

Thank you in advance for your suggestions.

Cheers,
Asim

On 03/21/2016 04:16 AM, Sherwin, Spencer J wrote:

Hi Asim,

To follow up on Douglas' comment, we are trying to get more organised and sort out a developer's guide. We are also holding a user meeting in June. If you were able to make this, we could also try to have a session on getting you going on the development side of things.

Cheers,
Spencer.

On 17 Mar 2016, at 14:58, Serson, Douglas <d.serson14@imperial.ac.uk> wrote:

Hi Asim,

I am glad that your simulation is now working. About your questions:

1. We have some work done on a filter for calculating Reynolds stresses as the simulation progresses, but it is not ready yet, and it would not provide all the statistics you want. Since you already have a lot of chk files, I suppose the best way would indeed be using a script to process all of them with FieldConvert.
2. Yes, this has recently been included in FieldConvert, using the new 'meanmode' module.
3. I just checked that, and apparently this is caused by a bug when using this module without FFTW. This should be fixed soon, but as an alternative the module should work if you switch FFTW on (just add <I PROPERTY="USEFFT" VALUE="FFTW"/> to your session file, if the code was compiled with FFTW support).
4. I think there is some work towards a developer guide, but I don't know how advanced the progress on that is. I am sure Spencer will be able to provide you with more information.

Cheers,
Douglas
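A sketch of the kind of post-processing script mentioned in point 1 above, assuming checkpoint files named config_*.chk alongside a session file config.xml (the names and the core count are placeholders):

    #!/bin/bash
    # Sketch only: apply a FieldConvert module to every checkpoint file in turn.
    session=config.xml                       # placeholder session-file name
    for chk in config_*.chk; do
        out=vorticity_${chk#config_}         # e.g. config_10.chk -> vorticity_10.chk
        mpirun -np 24 FieldConvert -m vorticity "$session" "$chk" "$out"
    done

The same loop structure applies to other modules (e.g. for the statistics terms), with one output file per checkpoint that can then be combined in a second pass.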
From: Asim Onder <ceeao@nus.edu.sg>
Sent: 17 March 2016 09:10
To: Serson, Douglas; Sherwin, Spencer J
Cc: nektar-users; Moxey, David
Subject: Re: [Nektar-users] Parallel transposition not implemented yet for 3D-Homo-2D approach

Hi Spencer, Douglas,

Thanks to your suggestions I managed to reach the turbulent regime for the oscillatory channel flow. I have now completed the DNS study for one case and built up a large database of checkpoint (*.chk) files. I would like to calculate turbulent statistics from this database, especially second-order terms, e.g. Reynolds stresses and turbulent dissipation, and third-order terms, e.g. turbulent diffusion terms. However, I am a little confused about how I could achieve this. I would appreciate some hints on the following:

1. The only way I could think of to calculate turbulent statistics is to write a simple bash script that iterates over the chk files and applies various existing/extended FieldConvert operations to each of them. This would require additional storage for the intermediate steps, and would therefore be a bit cumbersome. Would there be any simpler way of doing this directly in Nektar++?
2. I have one homogeneous direction, for which I used Fourier expansions. I would like to apply spatial averaging over this homogeneous direction. Does Nektar++ already contain such functionality?
3. I want to use the 'wss' FieldConvert module to calculate the wall shear stress. However, it returns a segmentation fault. Any ideas why that could be?
4. I was wondering if there is any introductory document on basic programming in Nektar++. The user guide does not contain information about programming. It would be nice to have some additional information on top of the Doxygen documentation.

Thank you very much in advance for your feedback.

Cheers,
Asim

On 02/15/2016 11:59 PM, Serson, Douglas wrote:

Hi Asim,

As Spencer mentioned, SVV can help in stabilizing your solution. You can find information on how to set it up in the user guide (pages 92-93), but basically all you need to do is use:

<I PROPERTY="SpectralVanishingViscosity" VALUE="True"/>

You can also tune it by setting the parameters SVVCutoffRatio and SVVDiffCoeff, but I would suggest starting with the default parameters. Also, you can use the parameter IO_CFLSteps to output the CFL number. This way you can check whether the time step you are using is appropriate.

Cheers,
Douglas

From: Sherwin, Spencer J
Sent: 14 February 2016 19:46
To: ceeao
Cc: nektar-users; Serson, Douglas; Moxey, David
Subject: Re: [Nektar-users] Parallel transposition not implemented yet for 3D-Homo-2D approach

Hi Asim,

Getting a flow through transition is very challenging, since there is a strong localisation of shear which can lead to aliasing issues, and these can then cause instabilities. Both Douglas and Dave have experienced this with recent simulations, so I am cc'ing them to make some suggestions. I would be inclined to use spectralhpdealiasing and SVV. Hopefully Douglas can send you an example of how to switch this on.

Cheers,
Spencer.

On 11 Feb 2016, at 10:32, ceeao <ceeao@nus.edu.sg> wrote:

Hi Spencer, Nektar-Users,

I followed the suggestion and coarsened the grid a bit. This way it worked impressively fast, but the flow is stable and remains laminar, as I didn't add any perturbations. I need to kick off the transition to get turbulence. If I add white noise, even of very low magnitude, the conjugate gradient solver blows up again. I also tried adding some sinusoidal perturbations to the boundary conditions, and again had trouble with CG. I don't really understand CG's extreme sensitivity to perturbations. Any suggestion is much appreciated. Thanks in advance.

Cheers,
Asim

On 02/08/2016 04:48 PM, Sherwin, Spencer J wrote:

Hi Asim,

How many parallel cores are you running on? Starting up these flows can sometimes be tricky, especially if you are immediately jumping to a high Reynolds number. Have you tried first starting the flow at a lower Reynolds number? Also, 100 x 200 is quite a few elements in the x-y plane; remember that the polynomial order adds more points on top of the mesh discretisation. I would perhaps recommend trying a smaller mesh first to see how that goes. Actually, I note there is a file called TurbChFl_3D1H.xml in the ~/Nektar/Solvers/IncNavierStokesSolver/Examples directory which might be worth looking at. I think this was a mesh used in Ale Bolis' thesis, which you can find under:
http://wwwf.imperial.ac.uk/ssherw/spectralhp/papers/PhDThesis/Bolis_Thesis.p...

Cheers,
Spencer.

On 1 Feb 2016, at 07:01, ceeao <ceeao@nus.edu.sg> wrote:

Hi Spencer,

Thank you for the quick reply and suggestion. I did indeed switch to the 3D homo 1D case, and this time I have problems with divergence of the linear solvers. I refined the grid in the channel flow example to 100x200x64 in the x-y-z directions and left everything else the same. When I employ the default global system solver "IterativeStaticCond" with this setup, I get divergence: "Exceeded maximum number of iterations (5000)". I checked the initial fields and mesh in ParaView, and everything seems to be normal. I also tried the "LowEnergyBlock" preconditioner, but apparently this one is valid only in fully 3D cases. My knowledge of iterative solvers for hp-FEM is minimal, so I was wondering if you could suggest a robust option that at least converges. My concern is getting some rough estimates of the speed of Nektar++ for my oscillating channel flow problem. If the speed is promising, I will switch to Nektar++ from OpenFOAM, as OpenFOAM is low-order and not really suitable for DNS. Thanks again in advance.

Cheers,
Asim
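For experimenting with the linear-solver setup, the '-I' SolverInfo override quoted earlier in the thread can be driven from the command line; the sketch below assumes a "Preconditioner" SolverInfo key, which should be checked against the user guide for your version, and uses placeholder core counts:

    #!/bin/bash
    # Sketch only: switch the global linear-system solver (and, possibly, the
    # preconditioner) from the command line instead of editing the session file.
    session=TurbChFl_3D1H.xml      # placeholder; any session file works the same way

    # Direct multi-level static condensation (the override quoted earlier in the thread):
    mpirun -np 16 IncNavierStokesSolver -I GlobalSysSoln=XxtMultiLevelStaticCond "$session"

    # Iterative static condensation with an explicit preconditioner choice
    # ("Preconditioner" key and value are assumptions; check the user guide):
    # mpirun -np 16 IncNavierStokesSolver -I GlobalSysSoln=IterativeStaticCond \
    #     -I Preconditioner=LowEnergyBlock "$session"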
On 01/31/2016 11:53 PM, Sherwin, Spencer J wrote:

Hi Asim,

I think your conclusion is correct. We did some early implementation work on the 2D homogeneous expansion but have not pulled it all the way through, since we did not have a full project on this topic. We have, however, kept the existing code running through our regression tests. For now I would suggest you try the 3D homo 1D approach for your runs, since you can use parallelisation in that code.

Cheers,
Spencer.

On 29 Jan 2016, at 04:00, ceeao <ceeao@nus.edu.sg> wrote:

Dear all,

I just installed the library and need to simulate DNS of a channel flow with an oscillating pressure gradient. As I have two homogeneous directions, I applied the standard Fourier discretization in these directions. It seems that this case is not parallelized yet, and I got the error in the subject line. I was wondering if I'm overlooking something. If not, are there perhaps any plans to include parallelization of the 2D FFTs in the future? Thank you in advance.

Best,
Asim Onder
Research Fellow
National University of Singapore

_______________________________________________
Nektar-users mailing list
Nektar-users@imperial.ac.uk
https://mailman.ic.ac.uk/mailman/listinfo/nektar-users

Spencer Sherwin
McLaren Racing/Royal Academy of Engineering Research Chair,
Professor of Computational Fluid Mechanics,
Department of Aeronautics, Imperial College London
South Kensington Campus, London SW7 2AZ
s.sherwin@imperial.ac.uk
+44 (0) 20 759 45052
From: Sherwin, Spencer J
Sent: 21 April 2016 19:34
To: Asim Onder
Cc: Serson, Douglas; nektar-users; Moxey, David
Subject: Re: [Nektar-users] Parallel transposition not implemented yet for 3D-Homo-2D approach

Hi Asim,

In fully 3D simulations we tend to pre-partition the mesh, and this can help with memory usage on a single core. To do this you can run the solver with the option --part-only=<number of partitions of the 2D planes>. Then, instead of running with file.xml, you give the solver the file_xml directory. However, I am not sure whether this is all working with the 2.5D code. Douglas, is this how you start any of your runs?

Cheers,
Spencer.
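A rough sketch of that pre-partitioning workflow, assuming the session file from earlier in the thread is called config.xml and that --part-only behaves as described above (the partition count is a placeholder):

    #!/bin/bash
    # Sketch only: pre-partition the 2D mesh once, then point the solver at the
    # generated config_xml directory instead of config.xml on subsequent runs.
    nparts=4        # placeholder: number of xy-plane partitions (nprocs/npz)
    IncNavierStokesSolver --part-only $nparts config.xml

    # Run from the pre-partitioned mesh:
    mpirun -np 1440 IncNavierStokesSolver --npz 360 config_xml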
On 04/22/2016 03:22 AM, Serson, Douglas wrote:

Hi Asim,

One thing I noticed about your setup is that HomModesZ / npz = 3. This should always be an even number, so you will need to change your parameters (for example, using npz = 180). I am surprised no error message with this information was displayed, but this will definitely make your simulation crash.

In terms of I/O, as Spencer said, you can pre-partition the mesh. However, I don't think this will make much difference, since your mesh is 2D and therefore does not use much memory anyway.

As for the checkpoint file, as far as I know each process only tries to load one file at a time. If your checkpoint was obtained from a simulation with many cores, each file will be relatively small, and you should not have any problems.

Cheers,
Douglas
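A quick pre-submission check of both constraints mentioned so far (the values are taken from the case described at the top of the thread; the script itself is only a sketch):

    #!/bin/bash
    # Sketch only: check both constraints mentioned in this thread before submitting:
    #   (1) nprocs must be a multiple of npz;
    #   (2) HomModesZ / npz must be even.
    nprocs=1440
    hommodesz=1080
    npz=180          # Douglas's suggested value: 1080/180 = 6, which is even
    if (( nprocs % npz != 0 )); then
        echo "nprocs ($nprocs) is not a multiple of npz ($npz)" >&2; exit 1
    fi
    if (( (hommodesz / npz) % 2 != 0 )); then
        echo "HomModesZ/npz = $((hommodesz / npz)) is not even" >&2; exit 1
    fi
    mpirun -np $nprocs IncNavierStokesSolver --npz $npz config.xml

With npz = 180 and 1440 processes this gives 8 xy-plane partitions and HomModesZ/npz = 6, which satisfies both conditions.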
Hi Douglas, Spencer,

Thanks for the suggestions, the problem is gone. I'm now a little concerned about the postprocessing of this relatively big case. For example, calculating the vorticity from a snapshot in a chk folder takes several hours if I use a command like this:

mpirun -np 720 FieldConvert -m vorticity config.xml config_10.chk vorticity_10.chk

Changing the number of procs didn't help much. If I try to process individual domains one by one with something like this:

FieldConvert --nprocs 72 --procid 1 -m vorticity config.xml config_10.chk vorticity_10.vtu

it still seems to take hours. Just for comparison: for this case, one time step of IncNavierStokesSolver takes around 5 seconds on 1440 procs, with an initialization time of around 5 minutes. I guess I'm doing something wrong. Would you have any suggestions on this? Thanks a lot in advance.

Cheers,
Asim
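If the per-partition form above is used, one way to at least run several partitions concurrently from the shell is sketched below; it assumes --nprocs/--procid behave as in the command above, and the per-partition output names and batch size are made up:

    #!/bin/bash
    # Sketch only: drive the per-partition FieldConvert calls a batch at a time
    # instead of one by one.
    nprocs=72        # total partitions, as in the --nprocs example above
    batch=12         # how many partitions to process concurrently (placeholder)
    for (( p = 0; p < nprocs; p++ )); do
        FieldConvert --nprocs $nprocs --procid $p -m vorticity \
            config.xml config_10.chk vorticity_10_p${p}.vtu &
        # throttle: wait after each batch of background jobs
        if (( (p + 1) % batch == 0 )); then wait; fi
    done
    wait

Whether this actually helps depends on where the time is being spent, so it is only worth trying as a diagnostic alongside the advice above.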
This is very important for obtaining an efficient parallel performance with the Fourier expansion, since it defines the number of partitions in the z-direction. If it is not set, only the xy plane will be partitioned and the parallelism will saturate quickly. I suggest initially setting npz to 1/2 or 1/4 of HomModesZ (note that nprocs must be a multiple of npz, since nprocs/npz is the number of partitions in the xy plane). Also, depending on your particular case and the number of partitions you have in the xy plane, your simulation may benefit from using a direct solver for the linear systems. This can be activated by adding '-I GlobalSysSoln=XxtMultiLevelStaticCond' to the command line. This is usually more efficient for a small number of partitions, but considering the large size of your problem it might be worth trying it. 2- I am not sure what could be causing that. I suppose it would help if you could send the exact commands you are using to run FieldConvert. Cheers, Douglas ________________________________ From: Asim Onder <mailto:ceeao@nus.edu.sg> <ceeao@nus.edu.sg><mailto:ceeao@nus.edu.sg> Sent: 12 April 2016 06:42 To: Sherwin, Spencer J; Serson, Douglas Cc: nektar-users; Moxey, David Subject: Re: [Nektar-users] Parallel transposition not implemented yet for 3D-Homo-2D approach Dear Spencer, Douglas, Nektar-users, I'm involved now in testing of a local petascale supercomputer, and for some quite limited time I can use several thousand processors for my DNS study. My test case is oscillating flow over a rippled bed. I build up a dense unstructured grid with p=6 quadrilateral elements in x-y, and Fourier expansions in z directions. In total I have circa half billion dofs per variable. I would have a few questions about this relatively large case: 1. I noticed that scaling gets inefficient after around 500 procs, let's say parallel efficiency goes below 80%. I was wondering if you would have any general suggestions to tune the configurations for a better scaling. 2. Postprocessing vorticity and Q criterion is not working for this case. At the of the execution Fieldconvert writes some small files without the field data. What could be the reason for this? Thanks you in advance for your suggestions. Cheers, Asim On 03/21/2016 04:16 AM, Sherwin, Spencer J wrote: Hi Asim, To follow-up on Douglas’ comment we are trying to get more organised to sort out a developers guide. We are also holding a user meeting in June. If you were able to make this we could also try and have a session on getting you going on the developmental side of things. Cheers, Spencer. On 17 Mar 2016, at 14:58, Serson, Douglas <<mailto:d.serson14@imperial.ac.uk>d.serson14@imperial.ac.uk<mailto:d.serson14@imperial.ac.uk>> wrote: Hi Asim, I am glad that your simulation is now working. About your questions: 1. We have some work done on a filter for calculating Reynolds stresses as the simulation progresses, but it is not ready yet, and it would not provide all the statistics you want. Since you already have a lot of chk files, I suppose the best way would indeed be using a script to process all of them with FieldConvert. 2. Yes, this has been recently included in FieldConvert, using the new 'meanmode' module. 3. I just checked that, and apparently this is caused by a bug when using this module without fftw. This should be fixed soon, but as an alternative this module should work if you switch fftw on (just add <I PROPERTY="USEFFT" VALUE="FFTW"/> to you session file, if the code was compiled with support to fftw). 4. 
I think there is some work towards a developer guide, but I don't how advanced is the progress on that. I am sure Spencer will be able to provide you with more information on that. Cheers, Douglas ________________________________________ From: Asim Onder <<mailto:ceeao@nus.edu.sg>ceeao@nus.edu.sg<mailto:ceeao@nus.edu.sg>> Sent: 17 March 2016 09:10 To: Serson, Douglas; Sherwin, Spencer J Cc: nektar-users; Moxey, David Subject: Re: [Nektar-users] Parallel transposition not implemented yet for 3D-Homo-2D approach Hi Spencer, Douglas, Thanks to your suggestions I managed to get the turbulent regime for the oscillatory channel flow. I have now completed the DNS study for one case, and built up a large database with checkpoint (*chk) files. I would like to calculate turbulent statistics using this database, especially for second order terms, e.g. Reynolds stresses and turbulent dissipation, and third order terms, e.g. turbulent diffusion terms. However, I am a little bit confused how I could achieve this. I would appreciate if you could give some hints about the following: 1. The only way I could think of to calculate turbulent statistics is to write a simple bash script to iterate over chk files, and apply various existing/extended FieldConvert operations on individual chk files. This would require some additional storage to store the intermediate steps, and therefore would be a bit cumbersome. Would it be any simpler way directly doing this directly in Nektar++? 2. I have one homogeneous direction, for which I used Fourier expansions. I would like to apply spatial averaging over this homogeneous direction. Does Nektar++ already contain such functionality? 3. I want to use 'wss' in Fieldconvert module to calculate wall shear stress. However, it returns segmentation fault. Any ideas why it could be? 4. I was wondering if there is any introductory document for basic programming in Nektar++. User guide does not contain information about programming. It would be nice to have some additional information to Doxygen documentation. Thank you very much in advance for your feedback. Cheers, Asim On 02/15/2016 11:59 PM, Serson, Douglas wrote: Hi Asim, As Spencer mentioned, svv can help in stabilizing your solution. You can find information on how to set it up in the user guide (pages 92-93), but basically all you need to do is use: <I PROPERTY="SpectralVanishingViscosity" VALUE="True"/> You can also tune it by setting the parameters SVVCutoffRatio and SVVDiffCoeff, but I would suggest starting with the default parameters. Also, you can use the parameter IO_CFLSteps to output the CFL number. This way you can check if the time step you are using is appropriate. Cheers, Douglas From: Sherwin, Spencer J Sent: 14 February 2016 19:46 To: ceeao Cc: nektar-users; Serson, Douglas; Moxey, David Subject: Re: [Nektar-users] Parallel transposition not implemented yet for 3D-Homo-2D approach Hi Asim, Getting a flow through transition is very challenging since there is a strong localisation of shear and this can lead to aliasing issues which can then cause instabilities. Both Douglas and Dave have experienced this with recent simulations so I am cc’ing them to make some suggestions. I would be inclined to be using spectralhpdealiasing and svv. Hopefully Douglas can send you an example of how to switch this on. Cheers, Spencer. On 11 Feb 2016, at 10:32, ceeao<<mailto:ceeao@nus.edu.sg>ceeao@nus.edu.sg<mailto:ceeao@nus.edu.sg>> wrote: Hi Spencer, Nektar-Users, I followed the suggestion and coarsened the grid a bit. 
On 11 Feb 2016, at 10:32, ceeao <ceeao@nus.edu.sg> wrote: Hi Spencer, Nektar-Users, I followed the suggestion and coarsened the grid a bit. This way it ran impressively fast, but the flow is stable and remains laminar, as I didn't add any perturbations. I need to kick off the transition to get turbulence. If I add white noise, even at very low magnitude, the conjugate gradient solver blows up again. I also tried adding some sinusoidal perturbations to the boundary conditions, and again had trouble with CG. I don't really understand CG's extreme sensitivity to perturbations. Any suggestion is much appreciated. Thanks in advance. Cheers, Asim

On 02/08/2016 04:48 PM, Sherwin, Spencer J wrote: Hi Asim, How many parallel cores are you running on? Starting up these flows can sometimes be tricky, especially if you immediately jump to a high Reynolds number. Have you tried first starting the flow at a lower Reynolds number? Also, 100 x 200 is quite a few elements in the x-y plane; remember that the polynomial order adds more points on top of the mesh discretisation. I would perhaps recommend trying a smaller mesh first to see how that goes. Actually, I note there is a file called TurbChFl_3D1H.xml in the ~/Nektar/Solvers/IncNavierStokesSolver/Examples directory which might be worth looking at. I think this was a mesh used in Ale Bolis’ thesis, which you can find under: http://wwwf.imperial.ac.uk/ssherw/spectralhp/papers/PhDThesis/Bolis_Thesis.pdf Cheers, Spencer.

On 1 Feb 2016, at 07:01, ceeao <ceeao@nus.edu.sg> wrote: Hi Spencer, Thank you for the quick reply and suggestion. I did switch to the 3D homo 1D case, and this time I have problems with divergence of the linear solvers. I refined the grid in the channel flow example to 100x200x64 in the x-y-z directions and left everything else the same. When I employ the default global system solver "IterativeStaticCond" with this setup, I get divergence: "Exceeded maximum number of iterations (5000)". I checked the initial fields and the mesh in Paraview, and everything seems to be normal. I also tried the "LowEnergyBlock" preconditioner, but apparently this one is valid only for fully 3D cases. My knowledge of iterative solvers for hp-FEM is minimal, so I was wondering if you could suggest a robust option that at least converges. My aim is to get some rough estimates of the speed of Nektar++ for my oscillating channel flow problem. If the speed is promising, I will switch to Nektar++ from OpenFOAM, as OpenFOAM is low-order and not really suitable for DNS. Thanks again in advance. Cheers, Asim
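As an aside, the global linear-system solver can also be switched from the command line, which makes this kind of experiment cheap to try. The core counts below are purely illustrative; the XXT option is the one Douglas suggests in his 13 April message earlier in the thread.

    # Default behaviour: the iterative solver (IterativeStaticCond) is used.
    mpirun -np 64 IncNavierStokesSolver --npz 32 session.xml

    # Same run, but with the XXT-based direct solver suggested earlier in the thread:
    mpirun -np 64 IncNavierStokesSolver --npz 32 -I GlobalSysSoln=XxtMultiLevelStaticCond session.xml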
On 01/31/2016 11:53 PM, Sherwin, Spencer J wrote: Hi Asim, I think your conclusion is correct. We did some early implementation of the 2D homogeneous expansion but have not pulled it all the way through, since we did not have a full project on this topic. We have, however, kept the existing code running through our regression tests. For now I would suggest you try the 3D homo 1D approach for your runs, since you can use parallelisation in that code. Cheers, Spencer.

On 29 Jan 2016, at 04:00, ceeao <ceeao@nus.edu.sg> wrote: Dear all, I just installed the library and need to simulate DNS of a channel flow with an oscillating pressure gradient. As I have two homogeneous directions, I applied the standard Fourier discretization in those directions. It seems this case is not parallelized yet, and I got the error in the subject line. I was wondering if I am overlooking something. If not, are there any plans to include parallelization of the 2D FFTs in the future? Thank you in advance. Best, Asim Onder, Research Fellow, National University of Singapore

_______________________________________________
Nektar-users mailing list
Nektar-users@imperial.ac.uk
https://mailman.ic.ac.uk/mailman/listinfo/nektar-users
Hi Asim, Douglas may have the most experience with this size of calculation. I have to admit it is a bit of a challenge currently. One suggestion is that you run FieldConvert with the -v option so we can see where it is spending most of the time. I have had problems in 3D with simply reading the xml file, and so we have done a bit of restructuring to help this; I do not know if this might still be a problem with the Homogeneous 1D code. If this is the case, then in the 3D code what we sometimes do is repartition the mesh using FieldConvert --part-only=16 config.xml out.fld This will produce a directory called config_xml with files called P0000000.xml, P0000001.xml, ... I then try to process one file at a time: FieldConvert config_xml/P0000000.xml config_10.chk out.vtu I wonder if this would help break up the work and hopefully speed up the processing? Cheers, Spencer.
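As a concrete sketch of that workflow (the partition count and file names simply follow the example above):

    # 1. Repartition the 2D mesh once; this writes config_xml/P0000000.xml, P0000001.xml, ...
    FieldConvert --part-only=16 config.xml out.fld

    # 2. Post-process the partitions one at a time, here computing vorticity for each piece.
    for f in config_xml/P*.xml; do
        FieldConvert -m vorticity "$f" config_10.chk "vor_$(basename "$f" .xml).vtu"
    done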
On 27 Apr 2016, at 08:36, Asim Onder <ceeao@nus.edu.sg> wrote: Hi Douglas, Spencer, Thanks for the suggestions, the problem is gone. I'm now a little concerned about the postprocessing of this relatively big case. For example, calculating the vorticity from a snapshot in a chk folder takes several hours if I use a command like this: mpirun -np 720 FieldConvert -m vorticity config.xml config_10.chk vorticity_10.chk Changing the number of procs didn't help much. If I try to process individual domains one by one with something like this: FieldConvert --nprocs 72 --procid 1 -m vorticity config.xml config_10.chk vorticity_10.vtu it still seems to take hours. Just for comparison: for this case, one time step of IncNavierStokesSolver takes around 5 seconds on 1440 procs, with an initialization time of around 5 minutes. I guess I'm doing something wrong. Would you have any suggestions on this? Thanks a lot in advance. Cheers, Asim

On 04/22/2016 03:22 AM, Serson, Douglas wrote: Hi Asim, One thing I noticed about your setup is that HomModesZ / npz = 3. This should always be an even number, so you will need to change your parameters (for example using npz = 180). I am surprised no error message with this information was displayed, but this will definitely make your simulation crash. In terms of IO, as Spencer said, you can pre-partition the mesh. However, I don't think this will make much difference, since your mesh is 2D and therefore does not use much memory anyway. As for the checkpoint file, as far as I know each process only tries to load one file at a time. If your checkpoint was obtained from a simulation with many cores, each file will be relatively small, and you should not have any problems. Cheers, Douglas

________________________________
From: Sherwin, Spencer J
Sent: 21 April 2016 19:34
To: Asim Onder
Cc: Serson, Douglas; nektar-users; Moxey, David
Subject: Re: [Nektar-users] Parallel transposition not implemented yet for 3D-Homo-2D approach

Hi Asim, In fully 3D simulations we tend to pre-partition the mesh, and this can help with memory usage on a single core. To do this you can run the solver with the option --part-only='no. of partitions of the 2D planes'. Then, instead of running with file.xml, you give the solver the file_xml directory. However, I am not sure whether this is all working with the 2.3D code. Douglas, is this how you start any of your runs? Cheers, Spencer.
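To make the two constraints concrete: HomModesZ/npz should give an even number of planes per partition (1080/360 = 3 is odd, while 1080/180 = 6 is fine), and the pre-partitioning would produce one piece per xy-partition, i.e. nprocs/npz pieces. A rough sketch along the lines Spencer describes, with the partition count illustrative and the directory name assumed to follow the FieldConvert example above:

    # Pre-partition the 2D mesh into nprocs/npz pieces (1440/180 = 8 in this illustration).
    IncNavierStokesSolver --part-only=8 config.xml

    # Run from the resulting config_xml directory; 1080/180 = 6 planes per rank, which is even.
    mpirun -np 1440 IncNavierStokesSolver --npz 180 config_xml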
Hi Spencer, I have partitioned my mesh into 48 pieces and applied FieldConvert -v as you suggested: FieldConvert -v -m vorticity config_xml/P0000000.xml config_10.chk vorPart_10.vtu The end of the output file looks like this:

    ...
    InputXml session reader CPU Time: 0.036654s
    InputXml mesh graph setup CPU Time: 0.0949287s
    InputXml setexpansion CPU Time: 77.2126s
    InputXml setexpansion CPU Time: 5.66e-07s
    Collection Implemenation for Quadrilateral ( 6 6 ) for ngeoms = 648
    BwdTrans: StdMat (0.000246074, 0.000233187, 6.70384e-05, 0.000117696)
    IProductWRTBase: StdMat (0.000299029, 0.000254921, 8.57054e-05, 0.000164536)
    IProductWRTDerivBase: StdMat (0.00147705, 0.000787602, 0.000234766, 0.000425167)
    PhysDeriv: SumFac (0.000471923, 0.000315652, 0.000244664, 0.000203107)
    InputXml set first exp CPU Time: 7453.92s
    InputXml CPU Time: 7531.26s
    Processing input fld file
    InputFld CPU Time: 211.413s
    ProcessVorticity: Calculating vorticity...
    OutputVtk: Writing file...
    Writing: "vorPart_12.vtu"
    Written file: vorPart_12.vtu
    Total CPU Time: 8059.78s

"InputXml set first exp" seems to be consuming most of the time. What would this correspond to? Thanks, Asim
Hi Asim, This is what I was afraid of. I do not know why your case is still taking so long. Can you send me the .xml file so I can have a look? Thanks, Spencer.

Spencer Sherwin
McLaren Racing/Royal Academy of Engineering Research Chair, Professor of Computational Fluid Mechanics, Department of Aeronautics, Imperial College London
South Kensington Campus, London SW7 2AZ
s.sherwin@imperial.ac.uk
+44 (0) 20 759 45052
I am surprised no error message with this information was displayed, but this will definitely make your simulation crash. In terms of IO, as Spencer said you can pre-partition the mesh. However, I don't think this will make much difference since your mesh is 2D, and therefore does not use much memory anyway. As for the checkpoint file, as far as I know each process only tries to load one file at a time. If your checkpoint was obtained from a simulation with many cores, each file will be relatively small, and you should not have any problems. Cheers, Douglas ________________________________ From: Sherwin, Spencer J Sent: 21 April 2016 19:34 To: Asim Onder Cc: Serson, Douglas; nektar-users; Moxey, David Subject: Re: [Nektar-users] Parallel transposition not implemented yet for 3D-Homo-2D approach Hi Asim, In fully 3D simulations we tend to pre-partition the mesh and this can help with memory usage on a single core. To do this you can run the solver with the option - - part-only=’no of partitions of 2D planes’ Then instead of running with a file.xml you give the solver file_xml directory. However I am not sure whether this is all working with the 2.3 D code. Douglas is this how you start any of your runs? Cheers, Spencer. On 20 Apr 2016, at 05:48, Asim Onder <ceeao@nus.edu.sg<mailto:ceeao@nus.edu.sg>> wrote: Hi Douglas, thanks for the feedback. I was aware of --npz parallelization but was using a small number, not 1/2 or 1/4 of HomModesZ. Increasing npz really helped. I still have to try GlobalSysSoln. Now I face a memory problem for another case. The simulation runs out of memory when starting from a checkpoint file. Here is a little bit information about this case: - Mesh is made of around 16000 quad elements with p=5, i.e., NUMMODES="6" TYPE="MODIFIED" in xy, and HomModesZ=1080 in z direction. - I'm trying to run this case on 60 computing nodes each equipped with 24 processors, and a memory of 105 gb. In total, it makes 1440 procs, and 6300gb memory. - Execution command: mpirun -np 1440 IncNavierStokesSolver --npz 360 config.xml I was wondering if the memory usage of the application is scaling on different cores during IO, or using only one core. If it is only one core, than if it exceeds 105gb, it crushes I guess. Would you have maybe any suggestion/comment on this? Thanks, Asim On 04/13/2016 12:12 AM, Serson, Douglas wrote: Hi Asim, Concerning your questions: 1- Are you using the command line argument --npz? This is very important for obtaining an efficient parallel performance with the Fourier expansion, since it defines the number of partitions in the z-direction. If it is not set, only the xy plane will be partitioned and the parallelism will saturate quickly. I suggest initially setting npz to 1/2 or 1/4 of HomModesZ (note that nprocs must be a multiple of npz, since nprocs/npz is the number of partitions in the xy plane). Also, depending on your particular case and the number of partitions you have in the xy plane, your simulation may benefit from using a direct solver for the linear systems. This can be activated by adding '-I GlobalSysSoln=XxtMultiLevelStaticCond' to the command line. This is usually more efficient for a small number of partitions, but considering the large size of your problem it might be worth trying it. 2- I am not sure what could be causing that. I suppose it would help if you could send the exact commands you are using to run FieldConvert. 
Cheers, Douglas ________________________________ From: Asim Onder <mailto:ceeao@nus.edu.sg> <ceeao@nus.edu.sg><mailto:ceeao@nus.edu.sg> Sent: 12 April 2016 06:42 To: Sherwin, Spencer J; Serson, Douglas Cc: nektar-users; Moxey, David Subject: Re: [Nektar-users] Parallel transposition not implemented yet for 3D-Homo-2D approach Dear Spencer, Douglas, Nektar-users, I'm involved now in testing of a local petascale supercomputer, and for some quite limited time I can use several thousand processors for my DNS study. My test case is oscillating flow over a rippled bed. I build up a dense unstructured grid with p=6 quadrilateral elements in x-y, and Fourier expansions in z directions. In total I have circa half billion dofs per variable. I would have a few questions about this relatively large case: 1. I noticed that scaling gets inefficient after around 500 procs, let's say parallel efficiency goes below 80%. I was wondering if you would have any general suggestions to tune the configurations for a better scaling. 2. Postprocessing vorticity and Q criterion is not working for this case. At the of the execution Fieldconvert writes some small files without the field data. What could be the reason for this? Thanks you in advance for your suggestions. Cheers, Asim On 03/21/2016 04:16 AM, Sherwin, Spencer J wrote: Hi Asim, To follow-up on Douglas’ comment we are trying to get more organised to sort out a developers guide. We are also holding a user meeting in June. If you were able to make this we could also try and have a session on getting you going on the developmental side of things. Cheers, Spencer. On 17 Mar 2016, at 14:58, Serson, Douglas <<mailto:d.serson14@imperial.ac.uk>d.serson14@imperial.ac.uk<mailto:d.serson14@imperial.ac.uk>> wrote: Hi Asim, I am glad that your simulation is now working. About your questions: 1. We have some work done on a filter for calculating Reynolds stresses as the simulation progresses, but it is not ready yet, and it would not provide all the statistics you want. Since you already have a lot of chk files, I suppose the best way would indeed be using a script to process all of them with FieldConvert. 2. Yes, this has been recently included in FieldConvert, using the new 'meanmode' module. 3. I just checked that, and apparently this is caused by a bug when using this module without fftw. This should be fixed soon, but as an alternative this module should work if you switch fftw on (just add <I PROPERTY="USEFFT" VALUE="FFTW"/> to you session file, if the code was compiled with support to fftw). 4. I think there is some work towards a developer guide, but I don't how advanced is the progress on that. I am sure Spencer will be able to provide you with more information on that. Cheers, Douglas ________________________________________ From: Asim Onder <<mailto:ceeao@nus.edu.sg>ceeao@nus.edu.sg<mailto:ceeao@nus.edu.sg>> Sent: 17 March 2016 09:10 To: Serson, Douglas; Sherwin, Spencer J Cc: nektar-users; Moxey, David Subject: Re: [Nektar-users] Parallel transposition not implemented yet for 3D-Homo-2D approach Hi Spencer, Douglas, Thanks to your suggestions I managed to get the turbulent regime for the oscillatory channel flow. I have now completed the DNS study for one case, and built up a large database with checkpoint (*chk) files. I would like to calculate turbulent statistics using this database, especially for second order terms, e.g. Reynolds stresses and turbulent dissipation, and third order terms, e.g. turbulent diffusion terms. 
However, I am a little bit confused about how I could achieve this. I would appreciate it if you could give some hints about the following: 1. The only way I could think of to calculate turbulent statistics is to write a simple bash script that iterates over the chk files and applies various existing/extended FieldConvert operations to the individual chk files. This would require some additional storage for the intermediate steps, and would therefore be a bit cumbersome. Would there be any simpler way of doing this directly in Nektar++? 2. I have one homogeneous direction, for which I used Fourier expansions. I would like to apply spatial averaging over this homogeneous direction. Does Nektar++ already contain such functionality? 3. I want to use the 'wss' module in FieldConvert to calculate the wall shear stress. However, it returns a segmentation fault. Any ideas why that could be? 4. I was wondering if there is any introductory document for basic programming in Nektar++. The user guide does not contain information about programming, and it would be nice to have some additional information beyond the Doxygen documentation. Thank you very much in advance for your feedback. Cheers, Asim

On 02/15/2016 11:59 PM, Serson, Douglas wrote: Hi Asim, As Spencer mentioned, SVV can help in stabilizing your solution. You can find information on how to set it up in the user guide (pages 92-93), but basically all you need to do is use: <I PROPERTY="SpectralVanishingViscosity" VALUE="True"/> You can also tune it by setting the parameters SVVCutoffRatio and SVVDiffCoeff, but I would suggest starting with the default parameters. Also, you can use the parameter IO_CFLSteps to output the CFL number. This way you can check if the time step you are using is appropriate. Cheers, Douglas

From: Sherwin, Spencer J Sent: 14 February 2016 19:46 To: ceeao Cc: nektar-users; Serson, Douglas; Moxey, David Subject: Re: [Nektar-users] Parallel transposition not implemented yet for 3D-Homo-2D approach

Hi Asim, Getting a flow through transition is very challenging, since there is a strong localisation of shear, and this can lead to aliasing issues which can then cause instabilities. Both Douglas and Dave have experienced this with recent simulations, so I am cc'ing them to make some suggestions. I would be inclined to use spectralhpdealiasing and SVV. Hopefully Douglas can send you an example of how to switch this on. Cheers, Spencer.

On 11 Feb 2016, at 10:32, ceeao <ceeao@nus.edu.sg> wrote: Hi Spencer, Nektar-Users, I followed the suggestion and coarsened the grid a bit. This way it worked impressively fast, but the flow is stable and remains laminar, as I didn't add any perturbations. I need to kick off the transition to get turbulence. If I add white noise, even of very low magnitude, the conjugate gradient solver blows up again. I also tried adding some sinusoidal perturbations to the boundary conditions, and again had trouble with CG. I don't really get CG's extreme sensitivity to perturbations. Any suggestion is much appreciated. Thanks in advance. Cheers, Asim

On 02/08/2016 04:48 PM, Sherwin, Spencer J wrote: Hi Asim, How many parallel cores are you running on? Sometimes starting up these flows can be tricky, especially if you are immediately jumping to a high Reynolds number. Have you tried first starting the flow at a lower Reynolds number? Also, 100 x 200 is quite a few elements in the x-y plane. Remember the polynomial order adds more points on top of the mesh discretisation.
I would perhaps recommend trying a smaller mesh to see how that goes first. Actually, I note there is a file called TurbChFl_3D1H.xml in the ~/Nektar/Solvers/IncNavierStokesSolver/Examples directory which might be worth looking at. I think this was a mesh used in Ale Bolis’ thesis, which you can find under: http://wwwf.imperial.ac.uk/ssherw/spectralhp/papers/PhDThesis/Bolis_Thesis.pdf Cheers, Spencer.

On 1 Feb 2016, at 07:01, ceeao <ceeao@nus.edu.sg> wrote: Hi Spencer, Thank you for the quick reply and suggestion. I have indeed switched to the 3D homo 1D case, and this time I have problems with the divergence of the linear solvers. I refined the grid in the channel flow example to 100x200x64 in the x-y-z directions, and left everything else the same. When I employ the default global system solver "IterativeStaticCond" with this setup, I get divergence: "Exceeded maximum number of iterations (5000)". I checked the initial fields and mesh in Paraview, and everything seems to be normal. I also tried the "LowEnergyBlock" preconditioner, and apparently this one is valid only in fully 3D cases. My knowledge of iterative solvers for hp-FEM is minimal, so I was wondering if you could suggest a robust option that at least converges. My concern is getting some rough estimates of the speed of Nektar++ for my oscillating channel flow problem. If the speed is promising, I will switch to Nektar++ from OpenFOAM, as OpenFOAM is low-order and not really suitable for DNS. Thanks again in advance. Cheers, Asim

On 01/31/2016 11:53 PM, Sherwin, Spencer J wrote: Hi Asim, I think your conclusion is correct. We did some early implementation in the 2D homogeneous expansion, but have not pulled it all the way through, since we did not have a full project on this topic. We have, however, kept the existing code running through our regression tests. For now I would perhaps suggest you try the 3D homo 1D approach for your runs, since you can use parallelisation in that code. Cheers, Spencer.

On 29 Jan 2016, at 04:00, ceeao <ceeao@nus.edu.sg> wrote: Dear all, I just installed the library, and need to simulate DNS of a channel flow with an oscillating pressure gradient. As I have two homogeneous directions, I applied standard Fourier discretization in these directions. It seems like this case is not parallelized yet, and I got the error in the subject. I was wondering if I'm overlooking something. If not, are there maybe any plans to include parallelization of 2D FFTs in the future? Thank you in advance. Best, Asim Onder Research Fellow National University of Singapore
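As a concrete illustration of the script-based approach discussed in the exchange above (iterating FieldConvert over the checkpoint files), a minimal bash sketch could look like the following; the session file name, checkpoint range and module are only assumptions and would need to be adapted to the actual case:

# Minimal sketch: apply a FieldConvert module to a range of checkpoint files.
# Assumes a session file config.xml and checkpoints config_0.chk ... config_200.chk.
for i in $(seq 0 200); do
    FieldConvert -m vorticity config.xml config_${i}.chk vorticity_${i}.fld
done

The intermediate .fld files would then still need a separate averaging step, which is the extra storage cost Asim mentions.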
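A minimal sketch of the pre-partitioning workflow Spencer outlines above, using the figures from Asim's run command (1440 ranks with --npz 360, so 1440/360 = 4 partitions of the xy plane); whether this works with the homogeneous code is, as Spencer says himself, not certain:

# Sketch only: pre-partition the 2D planes, then restart the run from the
# resulting directory instead of the original .xml file.
IncNavierStokesSolver --part-only=4 config.xml        # writes the config_xml directory
mpirun -np 1440 IncNavierStokesSolver --npz 360 config_xml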
Hi Spencer, please find the requested .xml file in the attachment. Cheers, Asim

On 05/03/2016 10:15 PM, Sherwin, Spencer J wrote: Hi Asim, This is what I was afraid of. I do not know why your case is still taking so long. Can you send me the .xml file to have a look at? Thanks, Spencer.

On 3 May 2016, at 10:17, Asim Onder <ceeao@nus.edu.sg> wrote: Hi Spencer, I have partitioned my mesh into 48 pieces, and ran FieldConvert -v as you suggested: FieldConvert -v -m vorticity config_xml/P0000000.xml config_10.chk vorPart_10.vtu The end of the output file looks like this:
......
InputXml session reader CPU Time: 0.036654s
InputXml mesh graph setup CPU Time: 0.0949287s
InputXml setexpansion CPU Time: 77.2126s
InputXml setexpansion CPU Time: 5.66e-07s
Collection Implemenation for Quadrilateral ( 6 6 ) for ngeoms = 648
BwdTrans: StdMat (0.000246074, 0.000233187, 6.70384e-05, 0.000117696)
IProductWRTBase: StdMat (0.000299029, 0.000254921, 8.57054e-05, 0.000164536)
IProductWRTDerivBase: StdMat (0.00147705, 0.000787602, 0.000234766, 0.000425167)
PhysDeriv: SumFac (0.000471923, 0.000315652, 0.000244664, 0.000203107)
InputXml set first exp CPU Time: 7453.92s
InputXml CPU Time: 7531.26s
Processing input fld file
InputFld CPU Time: 211.413s
ProcessVorticity: Calculating vorticity...
OutputVtk: Writing file...
Writing: "vorPart_12.vtu"
Written file: vorPart_12.vtu
Total CPU Time: 8059.78s
"InputXml set first exp" seems to be consuming the most time. What would this correspond to? Thanks, Asim

On 05/02/2016 06:12 PM, Sherwin, Spencer J wrote: Hi Asim, Douglas may have the most experience with this size of calculation. I have to admit it is a bit of a challenge currently. One suggestion is that you run with the -v option on FieldConvert so we can see where it is taking most of the time. I have had problems in 3D with simply reading the xml file, and so we had done a bit of restructuring to help this. I do not know if this might still be a problem with the Homogeneous 1D code. If this is the case, then in the 3D code what we sometimes do is repartition the mesh using FieldConvert --part-only=16 config.xml out.fld This will produce a directory called config_xml with files called P0000000.xml, P0000001.xml, ... I then try and process one file at a time: ./FieldConvert config_xml/P0000000.xml config_10.chk out.vtu I wonder if this would help break up the work and hopefully speed up the processing? Cheers, Spencer.

On 27 Apr 2016, at 08:36, Asim Onder <ceeao@nus.edu.sg> wrote: Hi Douglas, Spencer, Thanks for the suggestions, the problem is gone. I'm now a little concerned about the postprocessing of this relatively big case. For example, calculating vorticity from a snapshot in a chk folder takes several hours if I use a command like this: mpirun -np 720 FieldConvert -m vorticity config.xml config_10.chk vorticity_10.chk Changing the #procs didn't help too much. If I try to process individual domains one by one with something like this: FieldConvert --nprocs 72 --procid 1 -m vorticity config.xml config_10.chk vorticity_10.vtu it still seems to take hours. Just for a comparison: for this case, one time step of IncNavierStokesSolver takes around 5 seconds on 1440 procs, with an initialization time of around 5 mins. I guess I'm doing something wrong. Would you have any suggestions on this? Thanks a lot in advance. Cheers, Asim

On 04/22/2016 03:22 AM, Serson, Douglas wrote: Hi Asim, One thing I noticed about your setup is that HomModesZ / npz = 3.
This should always be an even number, so you will need to change your parameters (for example, using npz = 180).
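Spelling out the arithmetic behind this remark with the figures from Asim's case (HomModesZ = 1080 on 1440 ranks), as a small shell check with illustrative variable names:

HOMMODESZ=1080
NPROCS=1440
NPZ=180                                             # instead of 360, so the planes per rank come out even
echo "planes per z-rank:   $((HOMMODESZ / NPZ))"    # 1080 / 180 = 6 (even, as required)
echo "xy-plane partitions: $((NPROCS / NPZ))"       # 1440 / 180 = 8 (nprocs must be a multiple of npz)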
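The per-partition post-processing Spencer suggests above could be scripted roughly as follows; the partition count and checkpoint name follow his example, while the loop and output naming are only assumptions:

# Sketch only: pre-partition the session for FieldConvert, then process the
# partition files one at a time.
FieldConvert --part-only=16 config.xml out.fld      # writes config_xml/P0000000.xml, P0000001.xml, ...
for part in config_xml/P*.xml; do
    FieldConvert -m vorticity "$part" config_10.chk "vor_$(basename "$part" .xml).vtu"
done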
Hi Asim, Thank you for reporting this postprocessing issue. We did find an operation that was consuming an unreasonable amount of time. I have already fixed that, and eventually this fix will be available in the master branch. If you want to test it before then, it is in the branch fix/FC3DH1Defficiency. Cheers, Douglas
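For anyone wanting to try that fix before it is merged, checking out the branch would look roughly like this, assuming a standard git clone of the Nektar++ sources and an existing build directory (exact paths and build commands may differ):

# Sketch only: switch to the fix branch and rebuild FieldConvert.
cd nektar++
git fetch origin
git checkout fix/FC3DH1Defficiency
cd build && make FieldConvert       # or simply 'make' to rebuild everything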
Hi Douglas, I am also cc'ing Yan since I believe he was also having trouble post-processing the 3D1H cases. Thanks for your effort on this. Cheers, Spencer.
On 27 Apr 2016, at 08:36, Asim Onder <ceeao@nus.edu.sg> wrote: Hi Douglas, Spencer, Thanks for the suggestions, the problem is gone. I'm now a little concerned about the postprocessing of this relatively big case. For example, calculating vorticity from a snapshot in a chk folder takes several hours if I use a command like this:
mpirun -np 720 FieldConvert -m vorticity config.xml config_10.chk vorticity_10.chk
Changing the number of procs didn't help much. If I try to process individual domains one by one with something like this:
FieldConvert --nprocs 72 --procid 1 -m vorticity config.xml config_10.chk vorticity_10.vtu
it still seems to take hours. Just for comparison: for this case, one time step of IncNavierStokesSolver takes around 5 seconds on 1440 procs, with an initialization time of around 5 minutes. I guess I'm doing something wrong. Would you have any suggestions on this? Thanks a lot in advance. Cheers, Asim

On 04/22/2016 03:22 AM, Serson, Douglas wrote: Hi Asim, One thing I noticed about your setup is that HomModesZ / npz = 3. This should always be an even number, so you will need to change your parameters (for example using npz = 180). I am surprised no error message with this information was displayed, but this will definitely make your simulation crash. In terms of IO, as Spencer said, you can pre-partition the mesh. However, I don't think this will make much difference, since your mesh is 2D and therefore does not use much memory anyway. As for the checkpoint file, as far as I know each process only tries to load one file at a time. If your checkpoint was obtained from a simulation with many cores, each file will be relatively small, and you should not have any problems. Cheers, Douglas

From: Sherwin, Spencer J Sent: 21 April 2016 19:34 To: Asim Onder Cc: Serson, Douglas; nektar-users; Moxey, David Subject: Re: [Nektar-users] Parallel transposition not implemented yet for 3D-Homo-2D approach
Hi Asim, In fully 3D simulations we tend to pre-partition the mesh, and this can help with memory usage on a single core. To do this you can run the solver with the option --part-only='number of partitions of 2D planes'. Then, instead of running with a file.xml, you give the solver the file_xml directory. However, I am not sure whether this is all working with the 2.3 D code. Douglas, is this how you start any of your runs? Cheers, Spencer.
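Douglas's constraints on npz above can be checked before submitting a job. As a sketch with the numbers quoted in this thread (HomModesZ = 1080, 1440 ranks): nprocs must be a multiple of npz, and HomModesZ/npz should be even, so npz = 180 gives 1080/180 = 6 planes per z-partition and 1440/180 = 8 partitions in the xy plane. The job line would then look like:

mpirun -np 1440 IncNavierStokesSolver --npz 180 config.xml

The values are only illustrative; any npz satisfying both conditions works, whereas choices such as 360 or 120 (giving 3 or 9 planes per partition, which are odd) would hit exactly the problem Douglas describes.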
On 03/21/2016 04:16 AM, Sherwin, Spencer J wrote: [...] We are also holding a user meeting in June. If you were able to make this, we could also try to have a session on getting you going on the developmental side of things. Cheers, Spencer.

On 17 Mar 2016, at 14:58, Serson, Douglas <d.serson14@imperial.ac.uk> wrote: Hi Asim, I am glad that your simulation is now working. About your questions: 1. We have some work done on a filter for calculating Reynolds stresses as the simulation progresses, but it is not ready yet, and it would not provide all the statistics you want. Since you already have a lot of chk files, I suppose the best way would indeed be to use a script to process all of them with FieldConvert. 2. Yes, this has recently been included in FieldConvert, using the new 'meanmode' module. 3. I just checked that, and apparently this is caused by a bug when using this module without fftw. This should be fixed soon, but as an alternative this module should work if you switch fftw on (just add <I PROPERTY="USEFFT" VALUE="FFTW"/> to your session file, if the code was compiled with support for fftw). 4. I think there is some work towards a developer guide, but I don't know how advanced the progress on that is. I am sure Spencer will be able to provide you with more information on that. Cheers, Douglas
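Douglas's points 1 and 2 can be combined into a small driver script. This is only a sketch under assumed file names (fields/field_*.chk for the checkpoints, a meanFields/ output directory); the FieldConvert invocation follows the meanmode usage that appears later in this thread, and it presupposes FFTW is enabled as Douglas notes in point 3.

mkdir -p meanFields
for f in fields/field_*.chk; do
    n=$(basename "$f" .chk)                      # e.g. field_128
    mpirun -np 48 FieldConvert -v -m meanmode config.xml "$f" "meanFields/${n}_mean.fld"
done

Second- and third-order statistics would then be accumulated from the per-snapshot outputs in a separate pass, which is the extra intermediate storage Asim mentions below.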
From: Asim Onder <ceeao@nus.edu.sg> Sent: 17 March 2016 09:10 To: Serson, Douglas; Sherwin, Spencer J Cc: nektar-users; Moxey, David Subject: Re: [Nektar-users] Parallel transposition not implemented yet for 3D-Homo-2D approach
Hi Spencer, Douglas, Thanks to your suggestions I managed to reach the turbulent regime for the oscillatory channel flow. I have now completed the DNS study for one case and built up a large database of checkpoint (*.chk) files. I would like to calculate turbulent statistics using this database, especially second-order terms, e.g. Reynolds stresses and turbulent dissipation, and third-order terms, e.g. turbulent diffusion terms. However, I am a little confused about how I could achieve this. I would appreciate it if you could give some hints about the following: 1. The only way I could think of to calculate turbulent statistics is to write a simple bash script that iterates over the chk files and applies various existing/extended FieldConvert operations to individual chk files. This would require some additional storage for the intermediate steps and would therefore be a bit cumbersome. Would there be a simpler way of doing this directly in Nektar++? 2. I have one homogeneous direction, for which I used Fourier expansions. I would like to apply spatial averaging over this homogeneous direction. Does Nektar++ already contain such functionality? 3. I want to use the 'wss' module in FieldConvert to calculate wall shear stress. However, it returns a segmentation fault. Any ideas why that could be? 4. I was wondering if there is any introductory document for basic programming in Nektar++. The user guide does not contain information about programming. It would be nice to have some additional information alongside the Doxygen documentation. Thank you very much in advance for your feedback. Cheers, Asim

On 02/15/2016 11:59 PM, Serson, Douglas wrote: Hi Asim, As Spencer mentioned, SVV can help in stabilizing your solution. You can find information on how to set it up in the user guide (pages 92-93), but basically all you need to do is use: <I PROPERTY="SpectralVanishingViscosity" VALUE="True"/> You can also tune it by setting the parameters SVVCutoffRatio and SVVDiffCoeff, but I would suggest starting with the default parameters. Also, you can use the parameter IO_CFLSteps to output the CFL number. This way you can check if the time step you are using is appropriate. Cheers, Douglas

From: Sherwin, Spencer J Sent: 14 February 2016 19:46 To: ceeao Cc: nektar-users; Serson, Douglas; Moxey, David Subject: Re: [Nektar-users] Parallel transposition not implemented yet for 3D-Homo-2D approach
Hi Asim, Getting a flow through transition is very challenging, since there is a strong localisation of shear which can lead to aliasing issues, which can in turn cause instabilities. Both Douglas and Dave have experienced this with recent simulations, so I am cc'ing them to make some suggestions. I would be inclined to use spectralhpdealiasing and SVV. Hopefully Douglas can send you an example of how to switch this on. Cheers, Spencer.
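As a consolidated sketch of Douglas's and Spencer's advice above, the stabilisation options could sit in the session file roughly as below. The SpectralVanishingViscosity, SVVCutoffRatio, SVVDiffCoeff, IO_CFLSteps and USEFFT names come from this thread; the exact spelling of the dealiasing property and the numerical values are assumptions to be checked against the user guide, not a verified configuration.

<SOLVERINFO>
    <I PROPERTY="SpectralVanishingViscosity" VALUE="True"/>
    <!-- dealiasing switch Spencer refers to as "spectralhpdealiasing"; exact property name assumed -->
    <I PROPERTY="SPECTRALHPDEALIASING"       VALUE="True"/>
    <!-- use FFTW for the homogeneous direction, as also suggested for the meanmode module -->
    <I PROPERTY="USEFFT"                     VALUE="FFTW"/>
</SOLVERINFO>
<PARAMETERS>
    <P> SVVCutoffRatio = 0.75 </P>  <!-- illustrative value; the defaults are recommended above -->
    <P> SVVDiffCoeff   = 0.1  </P>  <!-- illustrative value -->
    <P> IO_CFLSteps    = 100  </P>  <!-- print the CFL number every 100 steps -->
</PARAMETERS>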
On 11 Feb 2016, at 10:32, ceeao <ceeao@nus.edu.sg> wrote: Hi Spencer, Nektar-Users, I followed the suggestion and coarsened the grid a bit. This way it worked impressively fast, but the flow is stable and remains laminar, as I didn't add any perturbations. I need to kick off the transition to obtain turbulence. If I add white noise, even at very low magnitude, the conjugate gradient solver blows up again. I also tried adding some sinusoidal perturbations to the boundary conditions, and again had trouble with CG. I don't really understand CG's extreme sensitivity to perturbations. Any suggestion is much appreciated. Thanks in advance. Cheers, Asim

On 02/08/2016 04:48 PM, Sherwin, Spencer J wrote: Hi Asim, How many parallel cores are you running on? Sometimes starting up these flows can be tricky, especially if you are immediately jumping to a high Reynolds number. Have you tried first starting the flow at a lower Reynolds number? Also, 100 x 200 is quite a few elements in the x-y plane. Remember the polynomial order adds more points on top of the mesh discretisation. I would perhaps recommend trying a smaller mesh to see how that goes first. Actually, I note there is a file called TurbChFl_3D1H.xml in the ~/Nektar/Solvers/IncNavierStokesSolver/Examples directory which might be worth looking at. I think this was a mesh used in Ale Bolis' thesis, which you can find under: http://wwwf.imperial.ac.uk/ssherw/spectralhp/papers/PhDThesis/Bolis_Thesis.pdf Cheers, Spencer.

On 1 Feb 2016, at 07:01, ceeao <ceeao@nus.edu.sg> wrote: Hi Spencer, Thank you for the quick reply and suggestion. I have indeed switched to the 3D homo 1D case, and this time I have problems with the divergence of the linear solvers. I refined the grid in the channel flow example to 100x200x64 in the x-y-z directions and left everything else the same. When I employ the default global system solver "IterativeStaticCond" with this setup, I get divergence: "Exceeded maximum number of iterations (5000)". I checked the initial fields and mesh in Paraview, and everything seems to be normal. I also tried the "LowEnergyBlock" preconditioner, and apparently this one is valid only in fully 3D cases. My knowledge of iterative solvers for hp-FEM is minimal. Therefore, I was wondering if you could suggest a robust option that at least converges. My concern is getting some rough estimates of the speed of Nektar++ for my oscillating channel flow problem. If the speed is promising, I will switch to Nektar++ from OpenFOAM, as OpenFOAM is low-order and not really suitable for DNS. Thanks again in advance. Cheers, Asim

On 01/31/2016 11:53 PM, Sherwin, Spencer J wrote: Hi Asim, I think your conclusion is correct. We did some early implementation of the 2D Homogeneous expansion but have not pulled it all the way through, since we did not have a full project on this topic. We have however kept the existing code running through our regression tests. For now I would perhaps suggest you try the 3D homo 1D approach for your runs, since you can use parallelisation in that code. Cheers, Spencer.

On 29 Jan 2016, at 04:00, ceeao <ceeao@nus.edu.sg> wrote: Dear all, I just installed the library and need to run a DNS of a channel flow with an oscillating pressure gradient. As I have two homogeneous directions, I applied the standard Fourier discretization in these directions. It seems like this case is not parallelized yet, and I got the error in the subject. I was wondering if I'm overlooking something.
If not, are there maybe any plans in the future to include parallelization of 2D FFT's? Thank you in advance. Best, Asim Onder, Research Fellow, National University of Singapore
From: Asim Onder <ceeao@nus.edu.sg> Sent: 27 May 2016 13:30:20 To: Serson, Douglas; Sherwin, Spencer J Cc: nektar-users; Moxey, David Subject: Re: [Nektar-users] Parallel transposition not implemented yet for 3D-Homo-2D approach
Hi Douglas, I now have some trouble with postprocessing the shear stress on a curved wall. I would appreciate it if you could provide some suggestions whenever you are able to look at this. (Just to recall my case: a 3DH1D DNS of a channel flow with a wavy bottom; around 16000 quad elements with NUMMODES="6" TYPE="MODIFIED" in xy, and HomModesZ=1080 in the z direction.) There are three issues:
1. First, I tried to calculate the mean shear stress from a mean mode which is extracted using FieldConvert, e.g.:
mpirun -np 48 FieldConvert -v -m meanmode config.xml fields/field_128.chk meanFields/mean_128.chk
FieldConvert -v -m wss:bnd=1:addnormals=1 config.xml meanFields/mean_128.chk wssMean_128.chk
Then I converted the result to vtu and visualized it. Normals to the surface are correctly calculated. However, the shear stresses are zero, which is of course not true.
2. I also need to calculate shear-stress fluctuations and their probability distribution function. To this end, I partitioned the mesh into 10 units and tried to extract the instantaneous shear stress on the wall:
FieldConvert -v -m wss:bnd=1:addnormals=1 config_xml/P0000001.xml fields/field_128.chk wallShearStress/wss_128_p01.chk
and received a segmentation fault:
ProcessWSS: Calculating wall shear stress...
/var/spool/PBS/mom_priv/jobs/1511560.wlm01.SC: line 56: 40234 Segmentation fault FieldConvert -v -m wss:bnd=1:addnormals=1 config_xml/P0000001.xml fields/field_128.chk wallShearStress/wss_128_p01.chk
I've put the log file for this one in the attachment.
3. Finally, I need to calculate the drag force on the wall. wss returns the shear stresses, along with the pressure and normal vectors. I can use this information and apply a simple midpoint rule for the integration. However, the normal vectors seem to be normalized, hence of unit length, so the area information is missing. I was wondering if there is any easy way to extract the area of the surface elements.
Thank you very much in advance for the feedback. Cheers, Asim
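Asim's per-partition attempt in issue 2 generalises to a loop over all the pieces produced by --part-only. This is only a sketch, with the file layout assumed from the commands above, and it presupposes that the wss segmentation fault Douglas addresses below has been resolved (e.g. via the fix/WssParallel branch he mentions):

for part in config_xml/P*.xml; do
    id=$(basename "$part" .xml)                  # e.g. P0000001
    FieldConvert -v -m wss:bnd=1:addnormals=1 "$part" fields/field_128.chk "wallShearStress/wss_128_${id}.fld"
done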
Hi Asim, About these issues: 1- I am not sure what could be happening. I tried these same steps with one of my cases and it works fine. Are the stresses exactly zero or just very small? 2- I found a bug in the wss module which is probably causing this. I think I was able to fix it (in branch fix/WssParallel), so this will probably be sorted out soon. 3- It is not possible to calculate the forces using FieldConvert. However, there is a filter (AeroForces) that does this as the simulation progresses. If you use your solution as the initial condition for a simulation with just one (or maybe even zero) time step, you should be able to use this filter to obtain the forces. Cheers, Douglas
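Douglas's suggestion in point 3 would look roughly like the session-file fragment below: restart from the existing field, run a single step, and let the filter integrate the pressure and viscous contributions over the wall boundary. The AeroForces name comes from his message; the parameter names, the boundary id and the restart-from-file syntax are assumptions to be checked against the user guide rather than a verified setup.

<FILTERS>
    <FILTER TYPE="AeroForces">
        <PARAM NAME="OutputFile">      DragLift </PARAM>   <!-- assumed parameter name -->
        <PARAM NAME="OutputFrequency"> 1        </PARAM>
        <PARAM NAME="Boundary">        B[1]     </PARAM>   <!-- wavy-wall boundary, as in bnd=1 above -->
    </FILTER>
</FILTERS>

<FUNCTION NAME="InitialConditions">
    <F FILE="fields/field_128.chk" />   <!-- restart from the snapshot of interest -->
</FUNCTION>

With NumSteps set to 1 (or 0, as Douglas suggests trying), the filter output would then give the integrated force without any manual surface quadrature.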
Cheers, Douglas ________________________________ From: Asim Onder <ceeao@nus.edu.sg><mailto:ceeao@nus.edu.sg> Sent: 04 May 2016 07:07:45 To: Sherwin, Spencer J Cc: Serson, Douglas; nektar-users; Moxey, David Subject: Re: [Nektar-users] Parallel transposition not implemented yet for 3D-Homo-2D approach Hi Spencer, please find the requested .xml file in attachment. Cheers, Asim On 05/03/2016 10:15 PM, Sherwin, Spencer J wrote: Hi Asim, The si what I was afraid of. I do not know why your case is still taking so long. Can you send me the .xml file to have a look at. Thanks, Spencer. On 3 May 2016, at 10:17, Asim Onder <<mailto:ceeao@nus.edu.sg>ceeao@nus.edu.sg<mailto:ceeao@nus.edu.sg>> wrote: Hi Spencer, I have partitioned my mesh into 48 pieces, and applied Fieldconvert -v as you have suggested: FieldConvert -v -m vorticity config_xml/P0000000.xml config_10.chk vorPart_10.vtu The end of the output file looks like this: ...... InputXml session reader CPU Time: 0.036654s InputXml mesh graph setup CPU Time: 0.0949287s InputXml setexpansion CPU Time: 77.2126s InputXml setexpansion CPU Time: 5.66e-07s Collection Implemenation for Quadrilateral ( 6 6 ) for ngeoms = 648 BwdTrans: StdMat (0.000246074, 0.000233187, 6.70384e-05, 0.000117696) IProductWRTBase: StdMat (0.000299029, 0.000254921, 8.57054e-05, 0.000164536) IProductWRTDerivBase: StdMat (0.00147705, 0.000787602, 0.000234766, 0.000425167) PhysDeriv: SumFac (0.000471923, 0.000315652, 0.000244664, 0.000203107) InputXml set first exp CPU Time: 7453.92s InputXml CPU Time: 7531.26s Processing input fld file InputFld CPU Time: 211.413s ProcessVorticity: Calculating vorticity... OutputVtk: Writing file... Writing: "vorPart_12.vtu" Written file: vorPart_12.vtu Total CPU Time: 8059.78s "InputXml set first exp" seems to be consuming the most time. What would this correspond? Thanks, Asim On 05/02/2016 06:12 PM, Sherwin, Spencer J wrote: Hi Asim, Douglas may have the most experience with this size calculation. I have to admit it is a bit of a challenge currently. One suggestion is that you run with the -v option on FieldConvert so we can see where it is taking most of the time. I have had problems in 3D with simply readying the xml file and so we had done a bit of restricting to help this. I do not know if this might still be problem with the Homogeneous 1D code. If this is the case then in the 3D code what we sometimes do is repartition the mesh using FieldConvert - - part-only=16 config.xml out.fld This will produce a directory called config_xml with files called P0000000.xml P0000001.xml I then try and process one file at a time ./FieldConvert config_xml/P000000.xml config_10.chk out.vtu I wonder if this would help break up the work and hopefully speed up the processing? Cheers, Spencer. On 27 Apr 2016, at 08:36, Asim Onder <ceeao@nus.edu.sg<mailto:ceeao@nus.edu.sg>> wrote: Hi Douglas, Spencer, Thanks for the suggestions, the problem is gone. I'm now a little concerned about the postprocessing of this relatively big case. For example, calculating vorticity from a snapshot in a chk folder takes several hours if I use a command like this: mpirun -np 720 FieldConvert -m vorticity config.xml config_10.chk vorticity_10.chk Changing the #procs didn't help too much. If I try to process individual domains one by one with something like this: FieldConvert --nprocs 72 --procid 1 -m vorticity config.xml config_10.chk vorticity_10.vtu It still seem to take hours. 
Just for a comparison: for this case, one time step of IncNavierStokesSolver takes around 5 seconds on 1440 procs with an initialization time of around 5mins. I guess I'm doing something wrong. Would you have any suggestions on this? Thank a lot in advance. Cheers, Asim On 04/22/2016 03:22 AM, Serson, Douglas wrote: Hi Asim, One thing I noticed about your setup is that HomModesZ / npz = 3. This should always be an even number, so you will need to change your parameters (for example using npz = 180). I am surprised no error message with this information was displayed, but this will definitely make your simulation crash. In terms of IO, as Spencer said you can pre-partition the mesh. However, I don't think this will make much difference since your mesh is 2D, and therefore does not use much memory anyway. As for the checkpoint file, as far as I know each process only tries to load one file at a time. If your checkpoint was obtained from a simulation with many cores, each file will be relatively small, and you should not have any problems. Cheers, Douglas ________________________________ From: Sherwin, Spencer J Sent: 21 April 2016 19:34 To: Asim Onder Cc: Serson, Douglas; nektar-users; Moxey, David Subject: Re: [Nektar-users] Parallel transposition not implemented yet for 3D-Homo-2D approach Hi Asim, In fully 3D simulations we tend to pre-partition the mesh and this can help with memory usage on a single core. To do this you can run the solver with the option - - part-only=’no of partitions of 2D planes’ Then instead of running with a file.xml you give the solver file_xml directory. However I am not sure whether this is all working with the 2.3 D code. Douglas is this how you start any of your runs? Cheers, Spencer. On 20 Apr 2016, at 05:48, Asim Onder <<mailto:ceeao@nus.edu.sg>ceeao@nus.edu.sg<mailto:ceeao@nus.edu.sg>> wrote: Hi Douglas, thanks for the feedback. I was aware of --npz parallelization but was using a small number, not 1/2 or 1/4 of HomModesZ. Increasing npz really helped. I still have to try GlobalSysSoln. Now I face a memory problem for another case. The simulation runs out of memory when starting from a checkpoint file. Here is a little bit information about this case: - Mesh is made of around 16000 quad elements with p=5, i.e., NUMMODES="6" TYPE="MODIFIED" in xy, and HomModesZ=1080 in z direction. - I'm trying to run this case on 60 computing nodes each equipped with 24 processors, and a memory of 105 gb. In total, it makes 1440 procs, and 6300gb memory. - Execution command: mpirun -np 1440 IncNavierStokesSolver --npz 360 config.xml I was wondering if the memory usage of the application is scaling on different cores during IO, or using only one core. If it is only one core, than if it exceeds 105gb, it crushes I guess. Would you have maybe any suggestion/comment on this? Thanks, Asim On 04/13/2016 12:12 AM, Serson, Douglas wrote: Hi Asim, Concerning your questions: 1- Are you using the command line argument --npz? This is very important for obtaining an efficient parallel performance with the Fourier expansion, since it defines the number of partitions in the z-direction. If it is not set, only the xy plane will be partitioned and the parallelism will saturate quickly. I suggest initially setting npz to 1/2 or 1/4 of HomModesZ (note that nprocs must be a multiple of npz, since nprocs/npz is the number of partitions in the xy plane). 
Also, depending on your particular case and the number of partitions you have in the xy plane, your simulation may benefit from using a direct solver for the linear systems. This can be activated by adding '-I GlobalSysSoln=XxtMultiLevelStaticCond' to the command line. This is usually more efficient for a small number of partitions, but considering the large size of your problem it might be worth trying it. 2- I am not sure what could be causing that. I suppose it would help if you could send the exact commands you are using to run FieldConvert. Cheers, Douglas ________________________________ From: Asim Onder <mailto:ceeao@nus.edu.sg> <ceeao@nus.edu.sg><mailto:ceeao@nus.edu.sg> Sent: 12 April 2016 06:42 To: Sherwin, Spencer J; Serson, Douglas Cc: nektar-users; Moxey, David Subject: Re: [Nektar-users] Parallel transposition not implemented yet for 3D-Homo-2D approach Dear Spencer, Douglas, Nektar-users, I'm involved now in testing of a local petascale supercomputer, and for some quite limited time I can use several thousand processors for my DNS study. My test case is oscillating flow over a rippled bed. I build up a dense unstructured grid with p=6 quadrilateral elements in x-y, and Fourier expansions in z directions. In total I have circa half billion dofs per variable. I would have a few questions about this relatively large case: 1. I noticed that scaling gets inefficient after around 500 procs, let's say parallel efficiency goes below 80%. I was wondering if you would have any general suggestions to tune the configurations for a better scaling. 2. Postprocessing vorticity and Q criterion is not working for this case. At the of the execution Fieldconvert writes some small files without the field data. What could be the reason for this? Thanks you in advance for your suggestions. Cheers, Asim On 03/21/2016 04:16 AM, Sherwin, Spencer J wrote: Hi Asim, To follow-up on Douglas’ comment we are trying to get more organised to sort out a developers guide. We are also holding a user meeting in June. If you were able to make this we could also try and have a session on getting you going on the developmental side of things. Cheers, Spencer. On 17 Mar 2016, at 14:58, Serson, Douglas <<mailto:d.serson14@imperial.ac.uk>d.serson14@imperial.ac.uk<mailto:d.serson14@imperial.ac.uk>> wrote: Hi Asim, I am glad that your simulation is now working. About your questions: 1. We have some work done on a filter for calculating Reynolds stresses as the simulation progresses, but it is not ready yet, and it would not provide all the statistics you want. Since you already have a lot of chk files, I suppose the best way would indeed be using a script to process all of them with FieldConvert. 2. Yes, this has been recently included in FieldConvert, using the new 'meanmode' module. 3. I just checked that, and apparently this is caused by a bug when using this module without fftw. This should be fixed soon, but as an alternative this module should work if you switch fftw on (just add <I PROPERTY="USEFFT" VALUE="FFTW"/> to you session file, if the code was compiled with support to fftw). 4. I think there is some work towards a developer guide, but I don't how advanced is the progress on that. I am sure Spencer will be able to provide you with more information on that. 
From: Asim Onder <ceeao@nus.edu.sg> Sent: 17 March 2016 09:10 To: Serson, Douglas; Sherwin, Spencer J Cc: nektar-users; Moxey, David Subject: Re: [Nektar-users] Parallel transposition not implemented yet for 3D-Homo-2D approach
Hi Spencer, Douglas, Thanks to your suggestions I managed to reach the turbulent regime for the oscillatory channel flow. I have now completed the DNS study for one case, and built up a large database of checkpoint (*.chk) files. I would like to calculate turbulent statistics using this database, especially second-order terms, e.g. Reynolds stresses and turbulent dissipation, and third-order terms, e.g. turbulent diffusion terms. However, I am a little confused about how I could achieve this. I would appreciate it if you could give some hints on the following: 1. The only way I could think of to calculate turbulent statistics is to write a simple bash script that iterates over the chk files and applies various existing/extended FieldConvert operations to each of them. This would require some additional storage for the intermediate steps, and would therefore be a bit cumbersome. Would there be any simpler way of doing this directly in Nektar++? 2. I have one homogeneous direction, for which I used Fourier expansions. I would like to apply spatial averaging over this homogeneous direction. Does Nektar++ already contain such functionality? 3. I want to use the 'wss' module in FieldConvert to calculate wall shear stress. However, it returns a segmentation fault. Any ideas why that could be? 4. I was wondering if there is any introductory document for basic programming in Nektar++. The user guide does not contain information about programming. It would be nice to have some additional information alongside the Doxygen documentation. Thank you very much in advance for your feedback. Cheers, Asim

On 02/15/2016 11:59 PM, Serson, Douglas wrote:
Hi Asim, As Spencer mentioned, SVV can help in stabilizing your solution. You can find information on how to set it up in the user guide (pages 92-93), but basically all you need to do is use: <I PROPERTY="SpectralVanishingViscosity" VALUE="True"/> You can also tune it by setting the parameters SVVCutoffRatio and SVVDiffCoeff, but I would suggest starting with the default parameters. Also, you can use the parameter IO_CFLSteps to output the CFL number. This way you can check if the time step you are using is appropriate. Cheers, Douglas

From: Sherwin, Spencer J Sent: 14 February 2016 19:46 To: ceeao Cc: nektar-users; Serson, Douglas; Moxey, David Subject: Re: [Nektar-users] Parallel transposition not implemented yet for 3D-Homo-2D approach
Hi Asim, Getting a flow through transition is very challenging, since there is a strong localisation of shear and this can lead to aliasing issues which can then cause instabilities. Both Douglas and Dave have experienced this with recent simulations, so I am cc'ing them to make some suggestions. I would be inclined to use spectralhpdealiasing and SVV. Hopefully Douglas can send you an example of how to switch this on. Cheers, Spencer.

On 11 Feb 2016, at 10:32, ceeao <ceeao@nus.edu.sg> wrote:
Hi Spencer, Nektar-Users, I followed the suggestion and coarsened the grid a bit. This way it worked impressively fast, but the flow is stable and remains laminar, as I didn't add any perturbations. I need to kick the transition to get turbulence.
If I add white noise, even at a very low magnitude, the conjugate gradient solver blows up again. I also tried adding some sinusoidal perturbations to the boundary conditions, and again had trouble with CG. I don't really understand CG's extreme sensitivity to perturbations. Any suggestion is much appreciated. Thanks in advance. Cheers, Asim

On 02/08/2016 04:48 PM, Sherwin, Spencer J wrote:
Hi Asim, How many parallel cores are you running on? Sometimes starting up these flows can be tricky, especially if you are immediately jumping to a high Reynolds number. Have you tried first starting the flow at a lower Reynolds number? Also, 100 x 200 is quite a few elements in the x-y plane. Remember the polynomial order adds more points on top of the mesh discretisation. I would perhaps recommend trying a smaller mesh to see how that goes first. Actually, I note there is a file called TurbChFl_3D1H.xml in the ~/Nektar/Solvers/IncNavierStokesSolver/Examples directory which might be worth looking at. I think this was a mesh used in Ale Bolis' thesis, which you can find under: http://wwwf.imperial.ac.uk/ssherw/spectralhp/papers/PhDThesis/Bolis_Thesis.pdf Cheers, Spencer.

On 1 Feb 2016, at 07:01, ceeao <ceeao@nus.edu.sg> wrote:
Hi Spencer, Thank you for the quick reply and suggestion. I indeed switched to the 3D homo 1D case, and this time I have problems with the divergence of the linear solvers. I refined the grid in the channel flow example to 100x200x64 in the x-y-z directions, and left everything else the same. When I employ the default global system solver "IterativeStaticCond" with this setup, I get divergence: "Exceeded maximum number of iterations (5000)". I checked the initial fields and mesh in Paraview, and everything seems to be normal. I also tried the "LowEnergyBlock" preconditioner, but apparently this one is valid only in fully 3D cases. My knowledge of iterative solvers for hp-FEM is minimal. Therefore, I was wondering if you could suggest a robust option that at least converges. My concern is getting some rough estimates of the speed of Nektar++ for my oscillating channel flow problem. If the speed is promising, I will switch to Nektar++ from OpenFOAM, as OpenFOAM is low-order and not really suitable for DNS. Thanks again in advance. Cheers, Asim

On 01/31/2016 11:53 PM, Sherwin, Spencer J wrote:
Hi Asim, I think your conclusion is correct. We did some early implementation of the 2D Homogeneous expansion but have not pulled it all the way through, since we did not have a full project on this topic. We have however kept the existing code running through our regression tests. For now I would perhaps suggest you try the 3D homo 1D approach for your runs, since you can use parallelisation in that code. Cheers, Spencer.

On 29 Jan 2016, at 04:00, ceeao <ceeao@nus.edu.sg> wrote:
Dear all, I just installed the library, and need to simulate DNS of a channel flow with an oscillating pressure gradient. As I have two homogeneous directions, I applied a standard Fourier discretization in these directions. It seems like this case is not parallelized yet, and I got the error in the subject. I was wondering if I'm overlooking something. If not, are there maybe any plans to include parallelization of 2D FFTs in the future? Thank you in advance.
Best, Asim Onder, Research Fellow, National University of Singapore

Spencer Sherwin, McLaren Racing/Royal Academy of Engineering Research Chair, Professor of Computational Fluid Mechanics, Department of Aeronautics, Imperial College London, South Kensington Campus, London SW7 2AZ, s.sherwin@imperial.ac.uk, +44 (0) 20 759 45052
Hi Douglas, Thanks for the fix, and for the tip about the AeroForces filter. Regarding the issues: 1. The shear stress is exactly zero everywhere. 2. After the fix, the segmentation fault is indeed gone. Unfortunately, this one also returned a zero shear-stress field for my case. Another problem I noticed is that one of the surface normals (norm_y) is -1 everywhere, which is not true. Just to provide more information, I've attached snapshots of the surface normals along with the xml file of the partition. Thanks, Asim

On 05/28/2016 12:14 AM, Serson, Douglas wrote:
Hi Asim, About these issues: 1- I am not sure what could be happening. I tried these same steps with one of my cases and it works fine. Are the stresses exactly zero or just very small? 2- I found a bug in the wss module which is probably causing this. I think I was able to fix it (in branch fix/WssParallel), so this will probably be sorted out soon. 3- It is not possible to calculate the forces using FieldConvert. However, there is a filter (AeroForces) that does this as the simulation progresses. If you use your solution as the initial condition for a simulation with just one (or maybe even zero) time steps, you should be able to use this filter to obtain the forces. Cheers, Douglas
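A sketch of the one-step restart that answer 3 above suggests (all names here are assumptions; the thread does not give these commands): config_forces.xml would be a copy of config.xml with NumSteps reduced to 1, the InitialConditions function pointing at the converged field, and an AeroForces filter on the wall boundary added to the FILTERS block. The run itself is then just

  # 48/12 = 4 xy partitions and HomModesZ/npz = 1080/12 = 90 planes per z-partition (even),
  # so the partitioning constraints discussed earlier in the thread are still respected
  mpirun -np 48 IncNavierStokesSolver --npz 12 config_forces.xml

after which the filter should write the integrated forces for that field.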
From: Asim Onder <ceeao@nus.edu.sg> Sent: 27 May 2016 13:30:20 To: Serson, Douglas; Sherwin, Spencer J Cc: nektar-users; Moxey, David Subject: Re: [Nektar-users] Parallel transposition not implemented yet for 3D-Homo-2D approach
Hi Douglas, I now have some trouble with postprocessing the shear stress on a curved wall. I would appreciate it if you could provide some suggestions whenever you are able to look at this. (Just to recall my case: 3DH1D DNS of a channel flow with a wavy bottom; around 16000 quad elements with NUMMODES="6" TYPE="MODIFIED" in xy, and HomModesZ=1080 in the z direction.) There are three issues: 1. First, I tried to calculate the mean shear stress from a mean mode extracted using FieldConvert, e.g.: mpirun -np 48 FieldConvert -v -m meanmode config.xml fields/field_128.chk meanFields/mean_128.chk FieldConvert -v -m wss:bnd=1:addnormals=1 config.xml meanFields/mean_128.chk wssMean_128.chk Then I converted the result to vtu and visualized it. The normals to the surface are correctly calculated. However, the shear stresses are zero, which is of course not true. 2. I also need to calculate shear-stress fluctuations and their probability distribution function. To this end, I partitioned the mesh into 10 units, and tried to extract the instantaneous shear stress on the wall: FieldConvert -v -m wss:bnd=1:addnormals=1 config_xml/P0000001.xml fields/field_128.chk wallShearStress/wss_128_p01.chk and received a segmentation fault: ProcessWSS: Calculating wall shear stress... /var/spool/PBS/mom_priv/jobs/1511560.wlm01.SC: line 56: 40234 Segmentation fault FieldConvert -v -m wss:bnd=1:addnormals=1 config_xml/P0000001.xml fields/field_128.chk wallShearStress/wss_128_p01.chk I've put the log file for this one in the attachment. 3. Finally, I need to calculate the drag force on the wall. wss returns the shear stresses, along with the pressure and normal vectors. I can use this information and apply a simple midpoint rule for integration. However, the normal vectors seem to be normalised to unit length, so the area information is missing. I was wondering if there is any easy way to extract the area of the surface elements. Thank you very much in advance for the feedback. Cheers, Asim

On 05/06/2016 07:37 PM, Serson, Douglas wrote:
Hi Asim, Thank you for reporting this postprocessing issue. We did find an operation that was consuming an unreasonable amount of time. I have already fixed that, and eventually this fix will be available in the master branch. If you want to test it before then, it is in the branch fix/FC3DH1Defficiency. Cheers, Douglas

From: Asim Onder <ceeao@nus.edu.sg> Sent: 04 May 2016 07:07:45 To: Sherwin, Spencer J Cc: Serson, Douglas; nektar-users; Moxey, David Subject: Re: [Nektar-users] Parallel transposition not implemented yet for 3D-Homo-2D approach
Hi Spencer, please find the requested .xml file in the attachment. Cheers, Asim

On 05/03/2016 10:15 PM, Sherwin, Spencer J wrote:
Hi Asim, This is what I was afraid of. I do not know why your case is still taking so long. Can you send me the .xml file to have a look at? Thanks, Spencer.

On 3 May 2016, at 10:17, Asim Onder <ceeao@nus.edu.sg> wrote:
Hi Spencer, I have partitioned my mesh into 48 pieces and applied FieldConvert -v as you suggested: FieldConvert -v -m vorticity config_xml/P0000000.xml config_10.chk vorPart_10.vtu The end of the output file looks like this:
......
InputXml session reader CPU Time: 0.036654s
InputXml mesh graph setup CPU Time: 0.0949287s
InputXml setexpansion CPU Time: 77.2126s
InputXml setexpansion CPU Time: 5.66e-07s
Collection Implemenation for Quadrilateral ( 6 6 ) for ngeoms = 648
BwdTrans: StdMat (0.000246074, 0.000233187, 6.70384e-05, 0.000117696)
IProductWRTBase: StdMat (0.000299029, 0.000254921, 8.57054e-05, 0.000164536)
IProductWRTDerivBase: StdMat (0.00147705, 0.000787602, 0.000234766, 0.000425167)
PhysDeriv: SumFac (0.000471923, 0.000315652, 0.000244664, 0.000203107)
InputXml set first exp CPU Time: 7453.92s
InputXml CPU Time: 7531.26s
Processing input fld file
InputFld CPU Time: 211.413s
ProcessVorticity: Calculating vorticity...
OutputVtk: Writing file...
Writing: "vorPart_12.vtu"
Written file: vorPart_12.vtu
Total CPU Time: 8059.78s
"InputXml set first exp" seems to be consuming most of the time. What would this correspond to? Thanks, Asim

On 05/02/2016 06:12 PM, Sherwin, Spencer J wrote:
Hi Asim, Douglas may have the most experience with this size of calculation. I have to admit it is a bit of a challenge currently. One suggestion is that you run with the -v option on FieldConvert so we can see where it is taking most of the time. I have had problems in 3D with simply reading the xml file, and so we had done a bit of restructuring to help this. I do not know if this might still be a problem with the Homogeneous 1D code. If this is the case, then in the 3D code what we sometimes do is repartition the mesh using FieldConvert --part-only=16 config.xml out.fld This will produce a directory called config_xml with files called P0000000.xml, P0000001.xml, ... I then try to process one file at a time: ./FieldConvert config_xml/P0000000.xml config_10.chk out.vtu I wonder if this would help break up the work and hopefully speed up the processing? Cheers, Spencer.
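A sketch of how Spencer's one-partition-at-a-time suggestion could be scripted (the 48 pieces match Asim's earlier message; the output names are assumptions):

  # process each pre-partitioned plane separately; every run is serial and independent,
  # so the iterations can also be farmed out as separate cluster jobs
  for i in $(seq 0 47); do
      part=$(printf "P%07d" $i)
      FieldConvert -m vorticity config_xml/${part}.xml config_10.chk vor_10_${part}.vtu
  done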
On 27 Apr 2016, at 08:36, Asim Onder <ceeao@nus.edu.sg> wrote:
Hi Douglas, Spencer, Thanks for the suggestions, the problem is gone. I'm now a little concerned about the postprocessing of this relatively big case. For example, calculating vorticity from a snapshot in a chk folder takes several hours if I use a command like this: mpirun -np 720 FieldConvert -m vorticity config.xml config_10.chk vorticity_10.chk Changing the number of procs didn't help much. If I try to process the individual domains one by one with something like this: FieldConvert --nprocs 72 --procid 1 -m vorticity config.xml config_10.chk vorticity_10.vtu it still seems to take hours.
Hi Asim, I was able to obtain the correct Norm_y using your file. Since I don't have a .fld for your case, I used the extract module instead of wss, but both of them generate the normals in the same way, so the result should be the same. The complete process I used is:
1 - Obtain a .xml file for the boundary: NekMesh -m extract:surf=1 P0000003.xml P0000003_bnd.xml (note that here surf is the composite, not the boundary region; in your case, it happens that both are the same).
2 - Use FieldConvert with the wss module: FieldConvert -m wss:bnd=1:addnormals P0000003.xml P0000003.fld P0000003_bnd.fld
3 - Convert to vtu: FieldConvert P0000003_bnd.xml P0000003_bnd_b1.fld P0000003_bnd.vtu
Is this how you are processing your results? Cheers, Douglas
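For the 10-way partitioned case discussed earlier in the thread, the three steps above could be scripted over all partition files in the same way (a sketch; the partition count, file names, and the assumption that the wall is composite/boundary 1 in every partition file are mine, not from the thread):

  for i in $(seq 0 9); do
      part=$(printf "P%07d" $i)
      # 1. boundary mesh for this partition (surf=1 assumed to be the wall composite)
      NekMesh -m extract:surf=1 config_xml/${part}.xml ${part}_bnd.xml
      # 2. wall shear stress and normals on boundary 1, using the instantaneous field
      FieldConvert -m wss:bnd=1:addnormals config_xml/${part}.xml fields/field_128.chk wss_128_${part}.fld
      # 3. the wss module appends _b1 to its output, as in step 3 above
      FieldConvert ${part}_bnd.xml wss_128_${part}_b1.fld wss_128_${part}.vtu
  done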
To this end, I partitioned the mesh into 10 units, and tried to extract instantaneous shear stress on the wall: FieldConvert -v -m wss:bnd=1:addnormals=1 config_xml/P0000001.xml fields/field_128.chk wallShearStress/wss_128_p01.chk and received a segmentation fault: ProcessWSS: Calculating wall shear stress... /var/spool/PBS/mom_priv/jobs/1511560.wlm01.SC: line 56: 40234 Segmentation fault FieldConvert -v -m wss:bnd=1:addnormals=1 config_xml/P0000001.xml fields/field_128.chk wallShearStress/wss_128_p01.chk I've put the log file for this one in attachment. 3. Finally, I need to calculate the drag force on the wall. wss returns shear stresses, along with pressure and normal vectors. I can use this information, and apply simple midpoint rule for integration. However, normal vectors seem to be normalized, hence of unity length, therefore, area information is missing. I was wondering if there is any easy way to extract the area of surface elements. Thanks you very much in advance for the feedback. Cheers, Asim On 05/06/2016 07:37 PM, Serson, Douglas wrote: Hi Asim, Thank you for reporting this postprocessing issue. We did find an operation that was consuming an unreasonable amount of time. I already fixed that, and eventually this fix will be available in the master branch. If you want to test it before then, it is in the branch fix/FC3DH1Defficiency. Cheers, Douglas ________________________________ From: Asim Onder <ceeao@nus.edu.sg><mailto:ceeao@nus.edu.sg> Sent: 04 May 2016 07:07:45 To: Sherwin, Spencer J Cc: Serson, Douglas; nektar-users; Moxey, David Subject: Re: [Nektar-users] Parallel transposition not implemented yet for 3D-Homo-2D approach Hi Spencer, please find the requested .xml file in attachment. Cheers, Asim On 05/03/2016 10:15 PM, Sherwin, Spencer J wrote: Hi Asim, The si what I was afraid of. I do not know why your case is still taking so long. Can you send me the .xml file to have a look at. Thanks, Spencer. On 3 May 2016, at 10:17, Asim Onder <ceeao@nus.edu.sg<mailto:ceeao@nus.edu.sg>> wrote: Hi Spencer, I have partitioned my mesh into 48 pieces, and applied Fieldconvert -v as you have suggested: FieldConvert -v -m vorticity config_xml/P0000000.xml config_10.chk vorPart_10.vtu The end of the output file looks like this: ...... InputXml session reader CPU Time: 0.036654s InputXml mesh graph setup CPU Time: 0.0949287s InputXml setexpansion CPU Time: 77.2126s InputXml setexpansion CPU Time: 5.66e-07s Collection Implemenation for Quadrilateral ( 6 6 ) for ngeoms = 648 BwdTrans: StdMat (0.000246074, 0.000233187, 6.70384e-05, 0.000117696) IProductWRTBase: StdMat (0.000299029, 0.000254921, 8.57054e-05, 0.000164536) IProductWRTDerivBase: StdMat (0.00147705, 0.000787602, 0.000234766, 0.000425167) PhysDeriv: SumFac (0.000471923, 0.000315652, 0.000244664, 0.000203107) InputXml set first exp CPU Time: 7453.92s InputXml CPU Time: 7531.26s Processing input fld file InputFld CPU Time: 211.413s ProcessVorticity: Calculating vorticity... OutputVtk: Writing file... Writing: "vorPart_12.vtu" Written file: vorPart_12.vtu Total CPU Time: 8059.78s "InputXml set first exp" seems to be consuming the most time. What would this correspond? Thanks, Asim On 05/02/2016 06:12 PM, Sherwin, Spencer J wrote: Hi Asim, Douglas may have the most experience with this size calculation. I have to admit it is a bit of a challenge currently. One suggestion is that you run with the -v option on FieldConvert so we can see where it is taking most of the time. 
I have had problems in 3D with simply readying the xml file and so we had done a bit of restricting to help this. I do not know if this might still be problem with the Homogeneous 1D code. If this is the case then in the 3D code what we sometimes do is repartition the mesh using FieldConvert - - part-only=16 config.xml out.fld This will produce a directory called config_xml with files called P0000000.xml P0000001.xml I then try and process one file at a time ./FieldConvert config_xml/P000000.xml config_10.chk out.vtu I wonder if this would help break up the work and hopefully speed up the processing? Cheers, Spencer. On 27 Apr 2016, at 08:36, Asim Onder <<mailto:ceeao@nus.edu.sg>ceeao@nus.edu.sg<mailto:ceeao@nus.edu.sg>> wrote: Hi Douglas, Spencer, Thanks for the suggestions, the problem is gone. I'm now a little concerned about the postprocessing of this relatively big case. For example, calculating vorticity from a snapshot in a chk folder takes several hours if I use a command like this: mpirun -np 720 FieldConvert -m vorticity config.xml config_10.chk vorticity_10.chk Changing the #procs didn't help too much. If I try to process individual domains one by one with something like this: FieldConvert --nprocs 72 --procid 1 -m vorticity config.xml config_10.chk vorticity_10.vtu It still seem to take hours. Just for a comparison: for this case, one time step of IncNavierStokesSolver takes around 5 seconds on 1440 procs with an initialization time of around 5mins. I guess I'm doing something wrong. Would you have any suggestions on this? Thank a lot in advance. Cheers, Asim On 04/22/2016 03:22 AM, Serson, Douglas wrote: Hi Asim, One thing I noticed about your setup is that HomModesZ / npz = 3. This should always be an even number, so you will need to change your parameters (for example using npz = 180). I am surprised no error message with this information was displayed, but this will definitely make your simulation crash. In terms of IO, as Spencer said you can pre-partition the mesh. However, I don't think this will make much difference since your mesh is 2D, and therefore does not use much memory anyway. As for the checkpoint file, as far as I know each process only tries to load one file at a time. If your checkpoint was obtained from a simulation with many cores, each file will be relatively small, and you should not have any problems. Cheers, Douglas ________________________________ From: Sherwin, Spencer J Sent: 21 April 2016 19:34 To: Asim Onder Cc: Serson, Douglas; nektar-users; Moxey, David Subject: Re: [Nektar-users] Parallel transposition not implemented yet for 3D-Homo-2D approach Hi Asim, In fully 3D simulations we tend to pre-partition the mesh and this can help with memory usage on a single core. To do this you can run the solver with the option - - part-only=’no of partitions of 2D planes’ Then instead of running with a file.xml you give the solver file_xml directory. However I am not sure whether this is all working with the 2.3 D code. Douglas is this how you start any of your runs? Cheers, Spencer. On 20 Apr 2016, at 05:48, Asim Onder <<mailto:ceeao@nus.edu.sg>ceeao@nus.edu.sg<mailto:ceeao@nus.edu.sg>> wrote: Hi Douglas, thanks for the feedback. I was aware of --npz parallelization but was using a small number, not 1/2 or 1/4 of HomModesZ. Increasing npz really helped. I still have to try GlobalSysSoln. Now I face a memory problem for another case. The simulation runs out of memory when starting from a checkpoint file. 
Here is a little bit information about this case: - Mesh is made of around 16000 quad elements with p=5, i.e., NUMMODES="6" TYPE="MODIFIED" in xy, and HomModesZ=1080 in z direction. - I'm trying to run this case on 60 computing nodes each equipped with 24 processors, and a memory of 105 gb. In total, it makes 1440 procs, and 6300gb memory. - Execution command: mpirun -np 1440 IncNavierStokesSolver --npz 360 config.xml I was wondering if the memory usage of the application is scaling on different cores during IO, or using only one core. If it is only one core, than if it exceeds 105gb, it crushes I guess. Would you have maybe any suggestion/comment on this? Thanks, Asim On 04/13/2016 12:12 AM, Serson, Douglas wrote: Hi Asim, Concerning your questions: 1- Are you using the command line argument --npz? This is very important for obtaining an efficient parallel performance with the Fourier expansion, since it defines the number of partitions in the z-direction. If it is not set, only the xy plane will be partitioned and the parallelism will saturate quickly. I suggest initially setting npz to 1/2 or 1/4 of HomModesZ (note that nprocs must be a multiple of npz, since nprocs/npz is the number of partitions in the xy plane). Also, depending on your particular case and the number of partitions you have in the xy plane, your simulation may benefit from using a direct solver for the linear systems. This can be activated by adding '-I GlobalSysSoln=XxtMultiLevelStaticCond' to the command line. This is usually more efficient for a small number of partitions, but considering the large size of your problem it might be worth trying it. 2- I am not sure what could be causing that. I suppose it would help if you could send the exact commands you are using to run FieldConvert. Cheers, Douglas ________________________________ From: Asim Onder <mailto:ceeao@nus.edu.sg> <ceeao@nus.edu.sg><mailto:ceeao@nus.edu.sg> Sent: 12 April 2016 06:42 To: Sherwin, Spencer J; Serson, Douglas Cc: nektar-users; Moxey, David Subject: Re: [Nektar-users] Parallel transposition not implemented yet for 3D-Homo-2D approach Dear Spencer, Douglas, Nektar-users, I'm involved now in testing of a local petascale supercomputer, and for some quite limited time I can use several thousand processors for my DNS study. My test case is oscillating flow over a rippled bed. I build up a dense unstructured grid with p=6 quadrilateral elements in x-y, and Fourier expansions in z directions. In total I have circa half billion dofs per variable. I would have a few questions about this relatively large case: 1. I noticed that scaling gets inefficient after around 500 procs, let's say parallel efficiency goes below 80%. I was wondering if you would have any general suggestions to tune the configurations for a better scaling. 2. Postprocessing vorticity and Q criterion is not working for this case. At the of the execution Fieldconvert writes some small files without the field data. What could be the reason for this? Thanks you in advance for your suggestions. Cheers, Asim On 03/21/2016 04:16 AM, Sherwin, Spencer J wrote: Hi Asim, To follow-up on Douglas’ comment we are trying to get more organised to sort out a developers guide. We are also holding a user meeting in June. If you were able to make this we could also try and have a session on getting you going on the developmental side of things. Cheers, Spencer. 
On 17 Mar 2016, at 14:58, Serson, Douglas <<mailto:d.serson14@imperial.ac.uk>d.serson14@imperial.ac.uk<mailto:d.serson14@imperial.ac.uk>> wrote: Hi Asim, I am glad that your simulation is now working. About your questions: 1. We have some work done on a filter for calculating Reynolds stresses as the simulation progresses, but it is not ready yet, and it would not provide all the statistics you want. Since you already have a lot of chk files, I suppose the best way would indeed be using a script to process all of them with FieldConvert. 2. Yes, this has been recently included in FieldConvert, using the new 'meanmode' module. 3. I just checked that, and apparently this is caused by a bug when using this module without fftw. This should be fixed soon, but as an alternative this module should work if you switch fftw on (just add <I PROPERTY="USEFFT" VALUE="FFTW"/> to you session file, if the code was compiled with support to fftw). 4. I think there is some work towards a developer guide, but I don't how advanced is the progress on that. I am sure Spencer will be able to provide you with more information on that. Cheers, Douglas ________________________________________ From: Asim Onder <<mailto:ceeao@nus.edu.sg>ceeao@nus.edu.sg<mailto:ceeao@nus.edu.sg>> Sent: 17 March 2016 09:10 To: Serson, Douglas; Sherwin, Spencer J Cc: nektar-users; Moxey, David Subject: Re: [Nektar-users] Parallel transposition not implemented yet for 3D-Homo-2D approach Hi Spencer, Douglas, Thanks to your suggestions I managed to get the turbulent regime for the oscillatory channel flow. I have now completed the DNS study for one case, and built up a large database with checkpoint (*chk) files. I would like to calculate turbulent statistics using this database, especially for second order terms, e.g. Reynolds stresses and turbulent dissipation, and third order terms, e.g. turbulent diffusion terms. However, I am a little bit confused how I could achieve this. I would appreciate if you could give some hints about the following: 1. The only way I could think of to calculate turbulent statistics is to write a simple bash script to iterate over chk files, and apply various existing/extended FieldConvert operations on individual chk files. This would require some additional storage to store the intermediate steps, and therefore would be a bit cumbersome. Would it be any simpler way directly doing this directly in Nektar++? 2. I have one homogeneous direction, for which I used Fourier expansions. I would like to apply spatial averaging over this homogeneous direction. Does Nektar++ already contain such functionality? 3. I want to use 'wss' in Fieldconvert module to calculate wall shear stress. However, it returns segmentation fault. Any ideas why it could be? 4. I was wondering if there is any introductory document for basic programming in Nektar++. User guide does not contain information about programming. It would be nice to have some additional information to Doxygen documentation. Thank you very much in advance for your feedback. Cheers, Asim On 02/15/2016 11:59 PM, Serson, Douglas wrote: Hi Asim, As Spencer mentioned, svv can help in stabilizing your solution. You can find information on how to set it up in the user guide (pages 92-93), but basically all you need to do is use: <I PROPERTY="SpectralVanishingViscosity" VALUE="True"/> You can also tune it by setting the parameters SVVCutoffRatio and SVVDiffCoeff, but I would suggest starting with the default parameters. 
Also, you can use the parameter IO_CFLSteps to output the CFL number. This way you can check if the time step you are using is appropriate. Cheers, Douglas From: Sherwin, Spencer J Sent: 14 February 2016 19:46 To: ceeao Cc: nektar-users; Serson, Douglas; Moxey, David Subject: Re: [Nektar-users] Parallel transposition not implemented yet for 3D-Homo-2D approach Hi Asim, Getting a flow through transition is very challenging since there is a strong localisation of shear and this can lead to aliasing issues which can then cause instabilities. Both Douglas and Dave have experienced this with recent simulations so I am cc’ing them to make some suggestions. I would be inclined to be using spectralhpdealiasing and svv. Hopefully Douglas can send you an example of how to switch this on. Cheers, Spencer. On 11 Feb 2016, at 10:32, ceeao<<mailto:ceeao@nus.edu.sg>ceeao@nus.edu.sg<mailto:ceeao@nus.edu.sg>> wrote: Hi Spencer, Nektar-Users, I followed the suggestion and coarsened the grid a bit. This way it worked impressively fast, but the flow is stable and remains laminar, as I didn't add any perturbations. I need to kick the transition to have turbulence. If I add white noise, even very low magnitude, conjugate gradient solver blows up again. I also tried adding some sinusoidal perturbations to boundary conditions, and again had troubles with CG. I don't really get CG's extreme sensitivity to perturbations. Any suggestion is much appreciated. Thanks in advance. Cheers, Asim On 02/08/2016 04:48 PM, Sherwin, Spencer J wrote: HI Asim, How many parallel cores are you running on. Sometime starting up these flows can be tricky especially if you are immediately jumping to a high Reynolds number. Have you tried first starting the flow at a Lower Reynolds number? Also 100 x 200 is quite a few elements in the x-y plane. Remember the polynomial order adds in more points on top of the mesh discretisation. I would perhaps recommend trying a smaller mesh to see how that goes first. Actually I note there is a file called TurbChFl_3D1H.xml in the ~/Nektar/Solvers/IncNavierStokesSolver/Examples directory which might be worth looking at. I think this was a mesh used in Ale Bolis’ thesis which you can find under: <http://wwwf.imperial.ac.uk/ssherw/spectralhp/papers/PhDThesis/Bolis_Thesis.pdf>http://wwwf.imperial.ac.uk/ssherw/spectralhp/papers/PhDThesis/Bolis_Thesis.pdf Cheers, Spencer. On 1 Feb 2016, at 07:01, ceeao<mailto:ceeao@nus.edu.sg><ceeao@nus.edu.sg><mailto:ceeao@nus.edu.sg> wrote: Hi Spencer, Thank you for the quick reply and suggestion. I switched indeed to 3D homo 1D case and this time I have problems with the divergence of linear solvers. I refined the grid in the channel flow example to 100x200x64 in x-y-z directions, and left everything else the same. When I employ the default global system solver "IterativeStaticCond" with this setup, I get divergence: "Exceeded maximum number of iterations (5000)". I checked the initial fields and mesh in Paraview, everything seems to be normal. I also tried the "LowEnergyBlock" preconditioner, and apparently this one is valid only in sheer 3D cases. My knowledge in iterative solvers for hp-Fem is minimal. Therefore, I was wondering if you could suggest maybe a robust option that at least converge. My concern is getting some rough estimates for the speed of Nektar++ in my oscillating channel flow problem. If the speed will be promising, I will switch to Nektar++ from OpenFOAM, as OpenFOAM is low-order and not really suitable for DNS. Thanks again in advance. 
Cheers,
Asim

On 01/31/2016 11:53 PM, Sherwin, Spencer J wrote:

Hi Asim,

I think your conclusion is correct. We did some early implementation work on the 2D homogeneous expansion but have not pulled it all the way through, since we did not have a full project on this topic. We have, however, kept the existing code running through our regression tests. For now I would suggest you try the 3D homo 1D approach for your runs, since you can use parallelisation in that code.

Cheers,
Spencer.

On 29 Jan 2016, at 04:00, ceeao <ceeao@nus.edu.sg> wrote:

Dear all,

I just installed the library and need to simulate DNS of a channel flow with an oscillating pressure gradient. As I have two homogeneous directions, I applied a standard Fourier discretization in those directions. It seems this case is not parallelized yet, and I got the error in the subject line. I was wondering if I'm overlooking something. If not, are there any plans to include parallelization of the 2D FFTs in the future? Thank you in advance.

Best,
Asim Onder
Research Fellow
National University of Singapore
________________________________
On 05/31/2016 08:28 PM, Serson, Douglas wrote:

Hi Asim,

When the chk file is imported, the possibility of having a different partitioning is taken into account, so you shouldn't have to worry about that. Could you run the following commands and send me the output from the last one?

NekMesh -m extract:surf=1 P0000003.xml P0000003_bnd.xml
FieldConvert -m wss:bnd=1:addnormals=1 P0000003.xml field_128.chk P0000003_bnd.chk
FieldConvert -e P0000003_bnd.xml P0000003_bnd_b1.chk test.fld

This will show the norms of each field, including the wss and the normals. It might help us figure out in which step of the process things are going wrong.

Cheers,
Douglas

________________________________
From: Asim Onder <ceeao@nus.edu.sg>
Sent: 30 May 2016 16:59:16
To: Serson, Douglas
Cc: Sherwin, Spencer J; nektar-users; Moxey, David
Subject: Re: [Nektar-users] Parallel transposition not implemented yet for 3D-Homo-2D approach

Hi Douglas,

The only difference I could see is that I don't have a P0000003.fld file corresponding to the partition P0000003.xml. Instead, I use the full field in a chk directory that was partitioned into 2160 units during the simulation run (P0000000.fld ... P0002159.fld):

FieldConvert -m wss:bnd=1:addnormals=1 P0000003.xml field_128.chk P0000003_bnd.chk

Is this maybe the source of the trouble? FieldConvert --part-only=10 config.xml field_128.chk partitions the mesh into 10 pieces, but the old fld files in the chk directory, which were created during the simulation, remain unchanged. How would I adjust the field data to a new partitioning of the mesh?

Thanks,
Asim

On May 30, 2016, at 10:51 PM, Serson, Douglas <d.serson14@imperial.ac.uk> wrote:

Hi Asim,

I was able to obtain the correct Norm_y using your file. Since I don't have a .fld for your case, I used the extract module instead of wss, but both of them generate the normals in the same way, so the result should be the same. The complete process I used is:

1 - Obtain an .xml file for the boundary:
NekMesh -m extract:surf=1 P0000003.xml P0000003_bnd.xml
(note that here surf is the composite, not the boundary region; in your case it happens that both are the same)

2 - Use FieldConvert with the wss module:
FieldConvert -m wss:bnd=1:addnormals P0000003.xml P0000003.fld P0000003_bnd.fld

3 - Convert to vtu:
FieldConvert P0000003_bnd.xml P0000003_bnd_b1.fld P0000003_bnd.vtu

Is this how you are processing your results?

Cheers,
Douglas

________________________________
From: Asim Onder <ceeao@nus.edu.sg>
Sent: 30 May 2016 14:12:03
To: Serson, Douglas; Sherwin, Spencer J
Cc: nektar-users; Moxey, David
Subject: Re: [Nektar-users] Parallel transposition not implemented yet for 3D-Homo-2D approach

Hi Douglas,

Thanks for the fix, and for the tip about the AeroForces filter. Regarding the issues:

1. The shear stress is exactly zero everywhere.

2. After the fix, the segmentation fault is indeed gone. Unfortunately, this run also returned a zero shear-stress field for my case. Another problem I noticed is that one of the surface normals (norm_y) is -1 everywhere, which is not true. Just to provide more information, I've attached snapshots of the surface normals along with the xml file of the partition.

Thanks,
Asim

On 05/28/2016 12:14 AM, Serson, Douglas wrote:

Hi Asim,

About these issues:

1- I am not sure what could be happening. I tried these same steps with one of my cases and it works fine. Are the stresses exactly zero or just very small?

2- I found a bug in the wss module which is probably causing this.
I think I was able to fix it (in branch fix/WssParallel), so this will probably be sorted out soon.

3- It is not possible to calculate the forces using FieldConvert. However, there is a filter (AeroForces) that does this as the simulation progresses. If you use your solution as the initial condition for a simulation with just one (or maybe even zero) time step, you should be able to use this filter to obtain the forces.

Cheers,
Douglas

________________________________
From: Asim Onder <ceeao@nus.edu.sg>
Sent: 27 May 2016 13:30:20
To: Serson, Douglas; Sherwin, Spencer J
Cc: nektar-users; Moxey, David
Subject: Re: [Nektar-users] Parallel transposition not implemented yet for 3D-Homo-2D approach

Hi Douglas,

I now have some trouble with postprocessing the shear stress on a curved wall. I would appreciate it if you could provide some suggestions whenever you are able to look at this. (Just to recall my case: a 3DH1D DNS of a channel flow with a wavy bottom, around 16000 quad elements with NUMMODES="6" TYPE="MODIFIED" in xy, and HomModesZ=1080 in the z direction.) There are three issues:

1. First, I tried to calculate the mean shear stress from a mean mode extracted using FieldConvert, e.g.:

mpirun -np 48 FieldConvert -v -m meanmode config.xml fields/field_128.chk meanFields/mean_128.chk
FieldConvert -v -m wss:bnd=1:addnormals=1 config.xml meanFields/mean_128.chk wssMean_128.chk

Then I converted the result to vtu and visualized it. The normals to the surface are correctly calculated; however, the shear stresses are zero, which is of course not true.

2. I also need to calculate shear-stress fluctuations and their probability distribution function. To this end, I partitioned the mesh into 10 units and tried to extract the instantaneous shear stress on the wall:

FieldConvert -v -m wss:bnd=1:addnormals=1 config_xml/P0000001.xml fields/field_128.chk wallShearStress/wss_128_p01.chk

and received a segmentation fault:

ProcessWSS: Calculating wall shear stress...
/var/spool/PBS/mom_priv/jobs/1511560.wlm01.SC: line 56: 40234 Segmentation fault FieldConvert -v -m wss:bnd=1:addnormals=1 config_xml/P0000001.xml fields/field_128.chk wallShearStress/wss_128_p01.chk

I've attached the log file for this one.

3. Finally, I need to calculate the drag force on the wall. wss returns the shear stresses along with the pressure and normal vectors, so I can use this information and apply a simple midpoint rule for the integration. However, the normal vectors seem to be normalized, hence of unit length, so the area information is missing. I was wondering if there is an easy way to extract the areas of the surface elements.

Thank you very much in advance for the feedback.

Cheers,
Asim

On 05/06/2016 07:37 PM, Serson, Douglas wrote:

Hi Asim,

Thank you for reporting this postprocessing issue. We did find an operation that was consuming an unreasonable amount of time. I have already fixed that, and eventually the fix will be available in the master branch. If you want to test it before then, it is in the branch fix/FC3DH1Defficiency.

Cheers,
Douglas

________________________________
From: Asim Onder <ceeao@nus.edu.sg>
Sent: 04 May 2016 07:07:45
To: Sherwin, Spencer J
Cc: Serson, Douglas; nektar-users; Moxey, David
Subject: Re: [Nektar-users] Parallel transposition not implemented yet for 3D-Homo-2D approach

Hi Spencer,

Please find the requested .xml file in the attachment.

Cheers,
Asim

On 05/03/2016 10:15 PM, Sherwin, Spencer J wrote:

Hi Asim,

This is what I was afraid of.
I do not know why your case is still taking so long. Can you send me the .xml file to have a look at?

Thanks,
Spencer.

On 3 May 2016, at 10:17, Asim Onder <ceeao@nus.edu.sg> wrote:

Hi Spencer,

I have partitioned my mesh into 48 pieces and applied FieldConvert -v as you suggested:

FieldConvert -v -m vorticity config_xml/P0000000.xml config_10.chk vorPart_10.vtu

The end of the output file looks like this:

......
InputXml session reader CPU Time: 0.036654s
InputXml mesh graph setup CPU Time: 0.0949287s
InputXml setexpansion CPU Time: 77.2126s
InputXml setexpansion CPU Time: 5.66e-07s
Collection Implemenation for Quadrilateral ( 6 6 ) for ngeoms = 648
BwdTrans: StdMat (0.000246074, 0.000233187, 6.70384e-05, 0.000117696)
IProductWRTBase: StdMat (0.000299029, 0.000254921, 8.57054e-05, 0.000164536)
IProductWRTDerivBase: StdMat (0.00147705, 0.000787602, 0.000234766, 0.000425167)
PhysDeriv: SumFac (0.000471923, 0.000315652, 0.000244664, 0.000203107)
InputXml set first exp CPU Time: 7453.92s
InputXml CPU Time: 7531.26s
Processing input fld file
InputFld CPU Time: 211.413s
ProcessVorticity: Calculating vorticity...
OutputVtk: Writing file...
Writing: "vorPart_12.vtu"
Written file: vorPart_12.vtu
Total CPU Time: 8059.78s

"InputXml set first exp" seems to be consuming most of the time. What would this correspond to?

Thanks,
Asim

On 05/02/2016 06:12 PM, Sherwin, Spencer J wrote:

Hi Asim,

Douglas may have the most experience with a calculation of this size. I have to admit it is a bit of a challenge currently. One suggestion is that you run FieldConvert with the -v option so we can see where it is spending most of the time. I have had problems in 3D with simply reading the xml file, and so we did a bit of restructuring to help with this. I do not know if this might still be a problem with the homogeneous 1D code. If this is the case, then in the 3D code what we sometimes do is repartition the mesh using

FieldConvert --part-only=16 config.xml out.fld

This will produce a directory called config_xml with files called P0000000.xml, P0000001.xml, etc. I then try to process one file at a time:

./FieldConvert config_xml/P000000.xml config_10.chk out.vtu

I wonder if this would help break up the work and hopefully speed up the processing?

Cheers,
Spencer.

On 27 Apr 2016, at 08:36, Asim Onder <ceeao@nus.edu.sg> wrote:

Hi Douglas, Spencer,

Thanks for the suggestions, the problem is gone. I'm now a little concerned about the postprocessing of this relatively big case. For example, calculating vorticity from a snapshot in a chk folder takes several hours if I use a command like this:

mpirun -np 720 FieldConvert -m vorticity config.xml config_10.chk vorticity_10.chk

Changing the number of procs didn't help much. If I try to process individual domains one by one with something like this:

FieldConvert --nprocs 72 --procid 1 -m vorticity config.xml config_10.chk vorticity_10.vtu

it still seems to take hours. Just for comparison: for this case, one time step of IncNavierStokesSolver takes around 5 seconds on 1440 procs, with an initialization time of around 5 minutes. I guess I'm doing something wrong. Would you have any suggestions on this? Thanks a lot in advance.

Cheers,
Asim

On 04/22/2016 03:22 AM, Serson, Douglas wrote:

Hi Asim,

One thing I noticed about your setup is that HomModesZ / npz = 3. This should always be an even number, so you will need to change your parameters (for example using npz = 180).
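As a quick check of this constraint (a throwaway shell snippet, assuming HomModesZ = 1080 as in your setup; remember that nprocs must also be a multiple of npz):

for npz in 180 270 360 540; do
    echo "npz=$npz -> planes per rank = $((1080 / npz))"
done
# npz=360 gives 3 planes per rank (odd), while 180, 270 and 540 give 6, 4 and 2 (even).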
I am surprised no error message with this information was displayed, but this will definitely make your simulation crash.

In terms of IO, as Spencer said, you can pre-partition the mesh. However, I don't think this will make much difference, since your mesh is 2D and therefore does not use much memory anyway. As for the checkpoint file, as far as I know each process only tries to load one file at a time. If your checkpoint was obtained from a simulation with many cores, each file will be relatively small, and you should not have any problems.

Cheers,
Douglas

________________________________
From: Sherwin, Spencer J
Sent: 21 April 2016 19:34
To: Asim Onder
Cc: Serson, Douglas; nektar-users; Moxey, David
Subject: Re: [Nektar-users] Parallel transposition not implemented yet for 3D-Homo-2D approach

Hi Asim,

In fully 3D simulations we tend to pre-partition the mesh, and this can help with memory usage on a single core. To do this you can run the solver with the option --part-only=<number of partitions of the 2D planes>, and then, instead of running with file.xml, you give the solver the file_xml directory. However, I am not sure whether this is all working with the 2.3D code. Douglas, is this how you start any of your runs?

Cheers,
Spencer.
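For concreteness, the sequence described above might look like the following sketch, assuming npz = 180 so that the 1440 ranks give 1440/180 = 8 xy-plane partitions (the option spelling should be checked against the user guide, and, as noted above, it is not certain the homogeneous code accepts the partition directory in the same way):

# pre-partition the 2D planes into 8 pieces; this writes a config_xml directory
IncNavierStokesSolver --part-only=8 config.xml

# then run from the pre-partitioned directory instead of the single xml file
mpirun -np 1440 IncNavierStokesSolver --npz 180 config_xml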
________________________________

Hi Douglas,

Sorry, I noticed that I was making a mistake somewhere else. All the results are fine now. The problem now is the very long execution time. The execution of

FieldConvert -m wss:bnd=1:addnormals=1 P0000003.xml field_128.chk P0000003_bnd.chk

takes more than six hours. I have attached the log file for this run. If I try, for example, the vorticity module on this partition, it takes around half an hour to process. So wss is considerably slower. What could be the reason for this?

Cheers,
Asim
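For reference, once this is fast enough, the per-checkpoint loop I have in mind for the wall shear stress is along these lines (only a sketch; the checkpoint range and output paths are placeholders):

# extract the boundary mesh for this partition once (needed later to convert the boundary output)
NekMesh -m extract:surf=1 P0000003.xml P0000003_bnd.xml

# then apply the wss module to each checkpoint in turn
for i in $(seq 0 128); do
    FieldConvert -m wss:bnd=1:addnormals=1 P0000003.xml \
        fields/field_${i}.chk wallShearStress/wss_${i}_p03.chk
done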
Another problem I noticed is that one of the surface normals (norm_y) is -1 everywhere, which is not true. Just to provide more information, I've attached snapshots of surface normals along with xml file of the partition. Thanks, Asim On 05/28/2016 12:14 AM, Serson, Douglas wrote: Hi Asim, About these issues: 1- I am not sure what could be happening. I tried these same steps with one of my cases and it works fine. Are the stresses exactly zero or just very small? 2- I found a bug in the wss module which is probably causing this. I think I was able to fix it (in branch fix/WssParallel), so this will probably be sorted out soon. 3- It is not possible to calculate the forces using FieldConvert. However, there is a filter (AeroForces) that does this as the simulation progresses. If you use your solution as the initial condition for a simulation with just one (or maybe even zero) time step, you should be able to use this filter to obtain the forces. Cheers, Douglas ________________________________ From: Asim Onder <mailto:ceeao@nus.edu.sg> <ceeao@nus.edu.sg><mailto:ceeao@nus.edu.sg> Sent: 27 May 2016 13:30:20 To: Serson, Douglas; Sherwin, Spencer J Cc: nektar-users; Moxey, David Subject: Re: [Nektar-users] Parallel transposition not implemented yet for 3D-Homo-2D approach Hi Douglas, I now have some troubles with postprocessing shear stress on a curved wall. Appreciate if you could provide some suggestions whenever you would be able to look at this. (just to remind my case: 3DH1D DNS of a channel flow with a wavy bottom. Around 16000 quad elements with NUMMODES="6" TYPE="MODIFIED" in xy, and HomModesZ=1080 in z direction.) There are three issues: 1. First, I tried to calculate the mean shear stress from a mean mode which is extracted using Fieldconvert, e.g.,: mpirun -np 48 FieldConvert -v -m meanmode config.xml fields/field_128.chk meanFields/mean_128.chk FieldConvert -v -m wss:bnd=1:addnormals=1 config.xml meanFields/mean_128.chk wssMean_128.chk Then, I converted the result to vtu and vizualized it. Normals to the surface are correctly calculated. However, shear stresses are zero, which is of course not true. 2. I also need to calculate shear-stress fluctuations and their probability distribution function. To this end, I partitioned the mesh into 10 units, and tried to extract instantaneous shear stress on the wall: FieldConvert -v -m wss:bnd=1:addnormals=1 config_xml/P0000001.xml fields/field_128.chk wallShearStress/wss_128_p01.chk and received a segmentation fault: ProcessWSS: Calculating wall shear stress... /var/spool/PBS/mom_priv/jobs/1511560.wlm01.SC: line 56: 40234 Segmentation fault FieldConvert -v -m wss:bnd=1:addnormals=1 config_xml/P0000001.xml fields/field_128.chk wallShearStress/wss_128_p01.chk I've put the log file for this one in attachment. 3. Finally, I need to calculate the drag force on the wall. wss returns shear stresses, along with pressure and normal vectors. I can use this information, and apply simple midpoint rule for integration. However, normal vectors seem to be normalized, hence of unity length, therefore, area information is missing. I was wondering if there is any easy way to extract the area of surface elements. Thanks you very much in advance for the feedback. Cheers, Asim On 05/06/2016 07:37 PM, Serson, Douglas wrote: Hi Asim, Thank you for reporting this postprocessing issue. We did find an operation that was consuming an unreasonable amount of time. I already fixed that, and eventually this fix will be available in the master branch. 
If you want to test it before then, it is in the branch fix/FC3DH1Defficiency. Cheers, Douglas ________________________________ From: Asim Onder <mailto:ceeao@nus.edu.sg> <ceeao@nus.edu.sg><mailto:ceeao@nus.edu.sg> Sent: 04 May 2016 07:07:45 To: Sherwin, Spencer J Cc: Serson, Douglas; nektar-users; Moxey, David Subject: Re: [Nektar-users] Parallel transposition not implemented yet for 3D-Homo-2D approach Hi Spencer, please find the requested .xml file in attachment. Cheers, Asim On 05/03/2016 10:15 PM, Sherwin, Spencer J wrote: Hi Asim, The si what I was afraid of. I do not know why your case is still taking so long. Can you send me the .xml file to have a look at. Thanks, Spencer. On 3 May 2016, at 10:17, Asim Onder <<mailto:ceeao@nus.edu.sg>ceeao@nus.edu.sg<mailto:ceeao@nus.edu.sg>> wrote: Hi Spencer, I have partitioned my mesh into 48 pieces, and applied Fieldconvert -v as you have suggested: FieldConvert -v -m vorticity config_xml/P0000000.xml config_10.chk vorPart_10.vtu The end of the output file looks like this: ...... InputXml session reader CPU Time: 0.036654s InputXml mesh graph setup CPU Time: 0.0949287s InputXml setexpansion CPU Time: 77.2126s InputXml setexpansion CPU Time: 5.66e-07s Collection Implemenation for Quadrilateral ( 6 6 ) for ngeoms = 648 BwdTrans: StdMat (0.000246074, 0.000233187, 6.70384e-05, 0.000117696) IProductWRTBase: StdMat (0.000299029, 0.000254921, 8.57054e-05, 0.000164536) IProductWRTDerivBase: StdMat (0.00147705, 0.000787602, 0.000234766, 0.000425167) PhysDeriv: SumFac (0.000471923, 0.000315652, 0.000244664, 0.000203107) InputXml set first exp CPU Time: 7453.92s InputXml CPU Time: 7531.26s Processing input fld file InputFld CPU Time: 211.413s ProcessVorticity: Calculating vorticity... OutputVtk: Writing file... Writing: "vorPart_12.vtu" Written file: vorPart_12.vtu Total CPU Time: 8059.78s "InputXml set first exp" seems to be consuming the most time. What would this correspond? Thanks, Asim On 05/02/2016 06:12 PM, Sherwin, Spencer J wrote: Hi Asim, Douglas may have the most experience with this size calculation. I have to admit it is a bit of a challenge currently. One suggestion is that you run with the -v option on FieldConvert so we can see where it is taking most of the time. I have had problems in 3D with simply readying the xml file and so we had done a bit of restricting to help this. I do not know if this might still be problem with the Homogeneous 1D code. If this is the case then in the 3D code what we sometimes do is repartition the mesh using FieldConvert - - part-only=16 config.xml out.fld This will produce a directory called config_xml with files called P0000000.xml P0000001.xml I then try and process one file at a time ./FieldConvert config_xml/P000000.xml config_10.chk out.vtu I wonder if this would help break up the work and hopefully speed up the processing? Cheers, Spencer. On 27 Apr 2016, at 08:36, Asim Onder <<mailto:ceeao@nus.edu.sg>ceeao@nus.edu.sg<mailto:ceeao@nus.edu.sg>> wrote: Hi Douglas, Spencer, Thanks for the suggestions, the problem is gone. I'm now a little concerned about the postprocessing of this relatively big case. For example, calculating vorticity from a snapshot in a chk folder takes several hours if I use a command like this: mpirun -np 720 FieldConvert -m vorticity config.xml config_10.chk vorticity_10.chk Changing the #procs didn't help too much. 
If I try to process individual domains one by one with something like this: FieldConvert --nprocs 72 --procid 1 -m vorticity config.xml config_10.chk vorticity_10.vtu It still seem to take hours. Just for a comparison: for this case, one time step of IncNavierStokesSolver takes around 5 seconds on 1440 procs with an initialization time of around 5mins. I guess I'm doing something wrong. Would you have any suggestions on this? Thank a lot in advance. Cheers, Asim On 04/22/2016 03:22 AM, Serson, Douglas wrote: Hi Asim, One thing I noticed about your setup is that HomModesZ / npz = 3. This should always be an even number, so you will need to change your parameters (for example using npz = 180). I am surprised no error message with this information was displayed, but this will definitely make your simulation crash. In terms of IO, as Spencer said you can pre-partition the mesh. However, I don't think this will make much difference since your mesh is 2D, and therefore does not use much memory anyway. As for the checkpoint file, as far as I know each process only tries to load one file at a time. If your checkpoint was obtained from a simulation with many cores, each file will be relatively small, and you should not have any problems. Cheers, Douglas ________________________________ From: Sherwin, Spencer J Sent: 21 April 2016 19:34 To: Asim Onder Cc: Serson, Douglas; nektar-users; Moxey, David Subject: Re: [Nektar-users] Parallel transposition not implemented yet for 3D-Homo-2D approach Hi Asim, In fully 3D simulations we tend to pre-partition the mesh and this can help with memory usage on a single core. To do this you can run the solver with the option - - part-only=’no of partitions of 2D planes’ Then instead of running with a file.xml you give the solver file_xml directory. However I am not sure whether this is all working with the 2.3 D code. Douglas is this how you start any of your runs? Cheers, Spencer. On 20 Apr 2016, at 05:48, Asim Onder <<mailto:ceeao@nus.edu.sg>ceeao@nus.edu.sg<mailto:ceeao@nus.edu.sg>> wrote: Hi Douglas, thanks for the feedback. I was aware of --npz parallelization but was using a small number, not 1/2 or 1/4 of HomModesZ. Increasing npz really helped. I still have to try GlobalSysSoln. Now I face a memory problem for another case. The simulation runs out of memory when starting from a checkpoint file. Here is a little bit information about this case: - Mesh is made of around 16000 quad elements with p=5, i.e., NUMMODES="6" TYPE="MODIFIED" in xy, and HomModesZ=1080 in z direction. - I'm trying to run this case on 60 computing nodes each equipped with 24 processors, and a memory of 105 gb. In total, it makes 1440 procs, and 6300gb memory. - Execution command: mpirun -np 1440 IncNavierStokesSolver --npz 360 config.xml I was wondering if the memory usage of the application is scaling on different cores during IO, or using only one core. If it is only one core, than if it exceeds 105gb, it crushes I guess. Would you have maybe any suggestion/comment on this? Thanks, Asim On 04/13/2016 12:12 AM, Serson, Douglas wrote: Hi Asim, Concerning your questions: 1- Are you using the command line argument --npz? This is very important for obtaining an efficient parallel performance with the Fourier expansion, since it defines the number of partitions in the z-direction. If it is not set, only the xy plane will be partitioned and the parallelism will saturate quickly. 
I suggest initially setting npz to 1/2 or 1/4 of HomModesZ (note that nprocs must be a multiple of npz, since nprocs/npz is the number of partitions in the xy plane). Also, depending on your particular case and the number of partitions you have in the xy plane, your simulation may benefit from using a direct solver for the linear systems. This can be activated by adding '-I GlobalSysSoln=XxtMultiLevelStaticCond' to the command line. This is usually more efficient for a small number of partitions, but considering the large size of your problem it might be worth trying it. 2- I am not sure what could be causing that. I suppose it would help if you could send the exact commands you are using to run FieldConvert. Cheers, Douglas ________________________________ From: Asim Onder <mailto:ceeao@nus.edu.sg> <ceeao@nus.edu.sg><mailto:ceeao@nus.edu.sg> Sent: 12 April 2016 06:42 To: Sherwin, Spencer J; Serson, Douglas Cc: nektar-users; Moxey, David Subject: Re: [Nektar-users] Parallel transposition not implemented yet for 3D-Homo-2D approach Dear Spencer, Douglas, Nektar-users, I'm involved now in testing of a local petascale supercomputer, and for some quite limited time I can use several thousand processors for my DNS study. My test case is oscillating flow over a rippled bed. I build up a dense unstructured grid with p=6 quadrilateral elements in x-y, and Fourier expansions in z directions. In total I have circa half billion dofs per variable. I would have a few questions about this relatively large case: 1. I noticed that scaling gets inefficient after around 500 procs, let's say parallel efficiency goes below 80%. I was wondering if you would have any general suggestions to tune the configurations for a better scaling. 2. Postprocessing vorticity and Q criterion is not working for this case. At the of the execution Fieldconvert writes some small files without the field data. What could be the reason for this? Thanks you in advance for your suggestions. Cheers, Asim On 03/21/2016 04:16 AM, Sherwin, Spencer J wrote: Hi Asim, To follow-up on Douglas’ comment we are trying to get more organised to sort out a developers guide. We are also holding a user meeting in June. If you were able to make this we could also try and have a session on getting you going on the developmental side of things. Cheers, Spencer. On 17 Mar 2016, at 14:58, Serson, Douglas <<mailto:d.serson14@imperial.ac.uk>d.serson14@imperial.ac.uk<mailto:d.serson14@imperial.ac.uk>> wrote: Hi Asim, I am glad that your simulation is now working. About your questions: 1. We have some work done on a filter for calculating Reynolds stresses as the simulation progresses, but it is not ready yet, and it would not provide all the statistics you want. Since you already have a lot of chk files, I suppose the best way would indeed be using a script to process all of them with FieldConvert. 2. Yes, this has been recently included in FieldConvert, using the new 'meanmode' module. 3. I just checked that, and apparently this is caused by a bug when using this module without fftw. This should be fixed soon, but as an alternative this module should work if you switch fftw on (just add <I PROPERTY="USEFFT" VALUE="FFTW"/> to you session file, if the code was compiled with support to fftw). 4. I think there is some work towards a developer guide, but I don't how advanced is the progress on that. I am sure Spencer will be able to provide you with more information on that. 
Cheers, Douglas ________________________________________ From: Asim Onder <<mailto:ceeao@nus.edu.sg>ceeao@nus.edu.sg<mailto:ceeao@nus.edu.sg>> Sent: 17 March 2016 09:10 To: Serson, Douglas; Sherwin, Spencer J Cc: nektar-users; Moxey, David Subject: Re: [Nektar-users] Parallel transposition not implemented yet for 3D-Homo-2D approach Hi Spencer, Douglas, Thanks to your suggestions I managed to get the turbulent regime for the oscillatory channel flow. I have now completed the DNS study for one case, and built up a large database with checkpoint (*chk) files. I would like to calculate turbulent statistics using this database, especially for second order terms, e.g. Reynolds stresses and turbulent dissipation, and third order terms, e.g. turbulent diffusion terms. However, I am a little bit confused how I could achieve this. I would appreciate if you could give some hints about the following: 1. The only way I could think of to calculate turbulent statistics is to write a simple bash script to iterate over chk files, and apply various existing/extended FieldConvert operations on individual chk files. This would require some additional storage to store the intermediate steps, and therefore would be a bit cumbersome. Would it be any simpler way directly doing this directly in Nektar++? 2. I have one homogeneous direction, for which I used Fourier expansions. I would like to apply spatial averaging over this homogeneous direction. Does Nektar++ already contain such functionality? 3. I want to use 'wss' in Fieldconvert module to calculate wall shear stress. However, it returns segmentation fault. Any ideas why it could be? 4. I was wondering if there is any introductory document for basic programming in Nektar++. User guide does not contain information about programming. It would be nice to have some additional information to Doxygen documentation. Thank you very much in advance for your feedback. Cheers, Asim On 02/15/2016 11:59 PM, Serson, Douglas wrote: Hi Asim, As Spencer mentioned, svv can help in stabilizing your solution. You can find information on how to set it up in the user guide (pages 92-93), but basically all you need to do is use: <I PROPERTY="SpectralVanishingViscosity" VALUE="True"/> You can also tune it by setting the parameters SVVCutoffRatio and SVVDiffCoeff, but I would suggest starting with the default parameters. Also, you can use the parameter IO_CFLSteps to output the CFL number. This way you can check if the time step you are using is appropriate. Cheers, Douglas From: Sherwin, Spencer J Sent: 14 February 2016 19:46 To: ceeao Cc: nektar-users; Serson, Douglas; Moxey, David Subject: Re: [Nektar-users] Parallel transposition not implemented yet for 3D-Homo-2D approach Hi Asim, Getting a flow through transition is very challenging since there is a strong localisation of shear and this can lead to aliasing issues which can then cause instabilities. Both Douglas and Dave have experienced this with recent simulations so I am cc’ing them to make some suggestions. I would be inclined to be using spectralhpdealiasing and svv. Hopefully Douglas can send you an example of how to switch this on. Cheers, Spencer. On 11 Feb 2016, at 10:32, ceeao<<mailto:ceeao@nus.edu.sg>ceeao@nus.edu.sg<mailto:ceeao@nus.edu.sg>> wrote: Hi Spencer, Nektar-Users, I followed the suggestion and coarsened the grid a bit. This way it worked impressively fast, but the flow is stable and remains laminar, as I didn't add any perturbations. I need to kick the transition to have turbulence. 
On 11 Feb 2016, at 10:32, ceeao <ceeao@nus.edu.sg> wrote:

Hi Spencer, Nektar-Users,

I followed the suggestion and coarsened the grid a bit. This way it worked impressively fast, but the flow is stable and remains laminar, as I didn't add any perturbations. I need to kick off the transition to get turbulence. If I add white noise, even of very low magnitude, the conjugate gradient solver blows up again. I also tried adding some sinusoidal perturbations to the boundary conditions, and again had trouble with CG. I don't really understand CG's extreme sensitivity to perturbations. Any suggestion is much appreciated. Thanks in advance.

Cheers,
Asim

On 02/08/2016 04:48 PM, Sherwin, Spencer J wrote:

Hi Asim,

How many parallel cores are you running on? Sometimes starting up these flows can be tricky, especially if you are immediately jumping to a high Reynolds number. Have you tried first starting the flow at a lower Reynolds number? Also, 100 x 200 is quite a few elements in the x-y plane. Remember that the polynomial order adds more points on top of the mesh discretisation. I would perhaps recommend trying a smaller mesh first to see how that goes. Actually, I note there is a file called TurbChFl_3D1H.xml in the ~/Nektar/Solvers/IncNavierStokesSolver/Examples directory which might be worth looking at. I think this was a mesh used in Ale Bolis' thesis, which you can find under:
http://wwwf.imperial.ac.uk/ssherw/spectralhp/papers/PhDThesis/Bolis_Thesis.pdf

Cheers,
Spencer.

On 1 Feb 2016, at 07:01, ceeao <ceeao@nus.edu.sg> wrote:

Hi Spencer,

Thank you for the quick reply and suggestion. I have indeed switched to the 3D homo 1D case, and this time I have problems with divergence of the linear solvers. I refined the grid in the channel flow example to 100x200x64 in the x-y-z directions and left everything else the same. When I employ the default global system solver "IterativeStaticCond" with this setup, I get divergence: "Exceeded maximum number of iterations (5000)". I checked the initial fields and mesh in Paraview, and everything seems normal. I also tried the "LowEnergyBlock" preconditioner, but apparently it is valid only in purely 3D cases. My knowledge of iterative solvers for hp-FEM is minimal, so I was wondering if you could suggest a robust option that at least converges. My immediate goal is to get some rough estimates of the speed of Nektar++ for my oscillating channel flow problem. If the speed is promising, I will switch to Nektar++ from OpenFOAM, as OpenFOAM is low-order and not really suitable for DNS. Thanks again in advance.

Cheers,
Asim

On 01/31/2016 11:53 PM, Sherwin, Spencer J wrote:

Hi Asim,

I think your conclusion is correct. We did some early implementation of the 2D homogeneous expansion but have not pulled it all the way through, since we did not have a full project on this topic. We have, however, kept the existing code running through our regression tests. For now I would suggest you try the 3D homo 1D approach for your runs, since you can use parallelisation in that code.

Cheers,
Spencer.

On 29 Jan 2016, at 04:00, ceeao <ceeao@nus.edu.sg> wrote:

Dear all,

I just installed the library and need to simulate DNS of a channel flow with an oscillating pressure gradient. As I have two homogeneous directions, I applied a standard Fourier discretisation in these directions. It seems that this case is not parallelised yet, and I got the error in the subject line. I was wondering if I'm overlooking something. If not, are there any plans to include parallelisation of 2D FFTs in the future? Thank you in advance.

Best,
Asim Onder
Research Fellow
National University of Singapore
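Tying together the solver options mentioned in this thread, a minimal sketch of switching the global system solver from the command line; the solver name is taken from the messages above, while the session file name and process counts are placeholders:

    # Try the XXT-based direct solver instead of the default IterativeStaticCond
    mpirun -np 32 IncNavierStokesSolver --npz 8 -I GlobalSysSoln=XxtMultiLevelStaticCond session.xml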
_______________________________________________
Nektar-users mailing list
Nektar-users@imperial.ac.uk
https://mailman.ic.ac.uk/mailman/listinfo/nektar-users

Spencer Sherwin
McLaren Racing/Royal Academy of Engineering Research Chair,
Professor of Computational Fluid Mechanics,
Department of Aeronautics, Imperial College London
South Kensington Campus
London SW7 2AZ
s.sherwin@imperial.ac.uk
+44 (0) 20 759 45052
participants (3)
- Asim Onder
- Serson, Douglas
- Sherwin, Spencer J