FWIW, Here is the output from ex13 on 32K processes (8K Fugaku nodes/sockets, 4 MPI/node, which seems recommended) with 128^3 vertex mesh (64^3 Q2 3D Laplacian). Almost an hour. Attached is solver scaling. 0 SNES Function norm 3.658334849208e+00 Linear solve converged due to CONVERGED_RTOL iterations 22 1 SNES Function norm 1.609000373074e-12 Nonlinear solve converged due to CONVERGED_ITS iterations 1 Linear solve converged due to CONVERGED_RTOL iterations 22 Linear solve converged due to CONVERGED_RTOL iterations 22 Linear solve converged due to CONVERGED_RTOL iterations 22 Linear solve converged due to CONVERGED_RTOL iterations 22 Linear solve converged due to CONVERGED_RTOL iterations 22 Linear solve converged due to CONVERGED_RTOL iterations 22 Linear solve converged due to CONVERGED_RTOL iterations 22 Linear solve converged due to CONVERGED_RTOL iterations 22 Linear solve converged due to CONVERGED_RTOL iterations 22 Linear solve converged due to CONVERGED_RTOL iterations 22 ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- ../ex13 on a named i07-4008c with 32768 processors, by a04199 Fri Feb 12 23:27:13 2021 Using Petsc Development GIT revision: v3.14.4-579-g4cb72fa GIT Date: 2021-02-05 15:19:40 +0000 Max Max/Min Avg Total Time (sec): 3.373e+03 1.000 3.373e+03 Objects: 1.055e+05 14.797 7.144e+03 Flop: 5.376e+10 1.176 4.885e+10 1.601e+15 Flop/sec: 1.594e+07 1.176 1.448e+07 4.745e+11 MPI Messages: 6.048e+05 30.010 8.833e+04 2.894e+09 MPI Message Lengths: 1.127e+09 4.132 6.660e+03 1.928e+13 MPI Reductions: 1.824e+03 1.000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flop and VecAXPY() for complex vectors of length N --> 8N flop Summary of Stages: ----- Time ------ ----- Flop ------ --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total Count %Total Avg %Total Count %Total 0: Main Stage: 3.2903e+03 97.5% 2.4753e+14 15.5% 3.538e+08 12.2% 1.779e+04 32.7% 9.870e+02 54.1% 1: PCSetUp: 4.3062e+01 1.3% 1.8160e+13 1.1% 1.902e+07 0.7% 3.714e+04 3.7% 1.590e+02 8.7% 2: KSP Solve only: 3.9685e+01 1.2% 1.3349e+15 83.4% 2.522e+09 87.1% 4.868e+03 63.7% 6.700e+02 36.7% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flop: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent AvgLen: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). %T - percent time in this phase %F - percent flop in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flop --- Global --- --- Stage ---- Total Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage PetscBarrier 5 1.0 1.9907e+00 2.2 0.00e+00 0.0 3.8e+06 7.7e+01 2.0e+01 0 0 0 0 1 0 0 1 0 2 0 BuildTwoSided 62 1.0 7.3272e+0214.1 0.00e+00 0.0 6.7e+06 8.0e+00 0.0e+00 5 0 0 0 0 5 0 2 0 0 0 BuildTwoSidedF 59 1.0 3.1132e+01 7.4 0.00e+00 0.0 4.8e+06 2.5e+05 0.0e+00 0 0 0 6 0 0 0 1 19 0 0 SNESSolve 1 1.0 1.7468e+02 1.0 7.83e+09 1.3 3.4e+08 1.3e+04 8.8e+02 5 13 12 23 48 5 85 96 70 89 1205779 SNESSetUp 1 1.0 2.4195e+01 1.0 0.00e+00 0.0 3.7e+06 3.7e+05 1.3e+01 1 0 0 7 1 1 0 1 22 1 0 SNESFunctionEval 3 1.0 1.1359e+01 1.2 1.17e+09 1.0 1.6e+06 1.4e+04 2.0e+00 0 2 0 0 0 0 15 0 0 0 3344744 SNESJacobianEval 2 1.0 1.6829e+02 1.0 1.52e+09 1.0 1.1e+06 8.3e+05 0.0e+00 5 3 0 5 0 5 20 0 14 0 293588 DMCreateMat 1 1.0 2.4107e+01 1.0 0.00e+00 0.0 3.7e+06 3.7e+05 1.3e+01 1 0 0 7 1 1 0 1 22 1 0 Mesh Partition 1 1.0 5.0133e+02 1.0 0.00e+00 0.0 1.3e+05 2.7e+02 6.0e+00 15 0 0 0 0 15 0 0 0 1 0 Mesh Migration 1 1.0 1.5494e+03 1.0 0.00e+00 0.0 7.3e+05 1.9e+02 2.4e+01 45 0 0 0 1 46 0 0 0 2 0 DMPlexPartSelf 1 1.0 1.1498e+002367.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 DMPlexPartLblInv 1 1.0 3.6698e+00 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 3.0e+00 0 0 0 0 0 0 0 0 0 0 0 DMPlexPartLblSF 1 1.0 2.8522e-01 1.7 0.00e+00 0.0 4.9e+04 1.5e+02 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 DMPlexPartStrtSF 1 1.0 4.9474e+023520.8 0.00e+00 0.0 3.3e+04 4.3e+02 0.0e+00 14 0 0 0 0 15 0 0 0 0 0 DMPlexPointSF 1 1.0 9.8750e+021264.8 0.00e+00 0.0 6.6e+04 5.4e+02 0.0e+00 28 0 0 0 0 29 0 0 0 0 0 DMPlexInterp 84 1.0 4.3219e-0158.6 0.00e+00 0.0 0.0e+00 0.0e+00 5.0e+00 0 0 0 0 0 0 0 0 0 1 0 DMPlexDistribute 1 1.0 3.0000e+03 1.5 0.00e+00 0.0 9.3e+05 2.3e+02 3.0e+01 88 0 0 0 2 90 0 0 0 3 0 DMPlexDistCones 1 1.0 1.0688e+03 2.6 0.00e+00 0.0 1.8e+05 3.1e+02 1.0e+00 31 0 0 0 0 31 0 0 0 0 0 DMPlexDistLabels 1 1.0 2.9172e+02 1.0 0.00e+00 0.0 3.1e+05 1.9e+02 2.1e+01 9 0 0 0 1 9 0 0 0 2 0 DMPlexDistField 1 1.0 1.8688e+02 1.2 0.00e+00 0.0 2.1e+05 9.3e+01 1.0e+00 5 0 0 0 0 5 0 0 0 0 0 DMPlexStratify 118 1.0 6.2852e+023280.9 0.00e+00 0.0 0.0e+00 0.0e+00 1.6e+01 1 0 0 0 1 1 0 0 0 2 0 DMPlexSymmetrize 118 1.0 6.7634e-02 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 DMPlexPrealloc 1 1.0 2.3741e+01 1.0 0.00e+00 0.0 3.7e+06 3.7e+05 1.1e+01 1 0 0 7 1 1 0 1 22 1 0 DMPlexResidualFE 3 1.0 1.0634e+01 1.2 1.16e+09 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 15 0 0 0 3569848 DMPlexJacobianFE 2 1.0 1.6809e+02 1.0 1.51e+09 1.0 6.5e+05 1.4e+06 0.0e+00 5 3 0 5 0 5 20 0 14 0 293801 SFSetGraph 87 1.0 2.7673e-03 3.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFSetUp 62 1.0 7.3283e+0213.6 0.00e+00 0.0 2.0e+07 2.7e+04 0.0e+00 5 0 1 3 0 5 0 6 9 0 0 SFBcastOpBegin 107 1.0 1.5770e+00452.5 0.00e+00 0.0 2.1e+07 1.8e+04 0.0e+00 0 0 1 2 0 0 0 6 6 0 0 SFBcastOpEnd 107 1.0 2.9430e+03 4.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 80 0 0 0 0 82 0 0 0 0 0 SFReduceBegin 12 1.0 2.4825e-01172.8 0.00e+00 0.0 2.4e+06 2.0e+05 0.0e+00 0 0 0 2 0 0 0 1 8 0 0 SFReduceEnd 12 1.0 3.8286e+014865.8 3.74e+04 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 31 SFFetchOpBegin 2 1.0 2.4497e-0390.2 0.00e+00 0.0 4.3e+05 3.5e+05 0.0e+00 0 0 0 1 0 0 0 0 2 0 0 SFFetchOpEnd 2 1.0 6.1349e-0210.9 0.00e+00 0.0 4.3e+05 3.5e+05 0.0e+00 0 0 0 1 0 0 0 0 2 0 0 SFCreateEmbed 3 1.0 3.6800e+013261.5 0.00e+00 0.0 4.7e+05 1.7e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFDistSection 9 1.0 4.4325e+02 1.5 0.00e+00 0.0 2.8e+06 1.1e+04 9.0e+00 11 0 0 0 0 11 0 1 1 1 0 SFSectionSF 11 1.0 2.3898e+02 4.7 0.00e+00 0.0 9.2e+05 1.7e+05 0.0e+00 5 0 0 1 0 5 0 0 2 0 0 SFRemoteOff 2 1.0 3.2868e-0143.1 0.00e+00 0.0 8.7e+05 8.2e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFPack 1023 1.0 2.5215e-0176.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFUnpack 1025 1.0 5.1600e-0216.8 5.62e+0521.3 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 54693 MatMult 1549525.4 3.4810e+00 1.3 4.35e+09 1.1 2.2e+08 6.1e+03 0.0e+00 0 8 8 7 0 0 54 62 21 0 38319208 MatMultAdd 132 1.0 6.9168e-01 3.0 7.97e+07 1.2 2.8e+07 4.6e+02 0.0e+00 0 0 1 0 0 0 1 8 0 0 3478717 MatMultTranspose 132 1.0 5.9967e-01 1.6 8.00e+07 1.2 3.0e+07 4.5e+02 0.0e+00 0 0 1 0 0 0 1 9 0 0 4015214 MatSolve 22 0.0 6.8431e-04 0.0 7.41e+05 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1082 MatLUFactorSym 1 1.0 5.9569e-0433.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatLUFactorNum 1 1.0 1.6236e-03773.2 1.46e+06 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 897 MatConvert 6 1.0 1.4290e-01 1.2 0.00e+00 0.0 3.0e+06 3.7e+03 0.0e+00 0 0 0 0 0 0 0 1 0 0 0 MatScale 18 1.0 3.7962e-01 1.3 4.11e+07 1.2 2.0e+06 5.5e+03 0.0e+00 0 0 0 0 0 0 0 1 0 0 3253392 MatResidual 132 1.0 6.8256e-01 1.4 8.27e+08 1.2 4.4e+07 5.5e+03 0.0e+00 0 2 2 1 0 0 10 13 4 0 36282014 MatAssemblyBegin 244 1.0 3.1181e+01 6.6 0.00e+00 0.0 4.8e+06 2.5e+05 0.0e+00 0 0 0 6 0 0 0 1 19 0 0 MatAssemblyEnd 244 1.0 6.3232e+00 1.9 3.17e+06 6.9 0.0e+00 0.0e+00 1.4e+02 0 0 0 0 8 0 0 0 0 15 7655 MatGetRowIJ 1 0.0 2.5780e-05 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatCreateSubMat 10 1.0 1.5162e+00 1.0 0.00e+00 0.0 1.6e+05 3.4e+05 1.3e+02 0 0 0 0 7 0 0 0 1 13 0 MatGetOrdering 1 0.0 1.0899e-04 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatCoarsen 6 1.0 3.5837e-01 1.3 0.00e+00 0.0 1.6e+07 1.2e+04 3.9e+01 0 0 1 1 2 0 0 5 3 4 0 MatZeroEntries 8 1.0 5.3730e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAXPY 6 1.0 2.6245e-01 1.1 2.66e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 33035 MatTranspose 12 1.0 3.0731e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatMatMultSym 18 1.0 2.1398e+00 1.4 0.00e+00 0.0 6.1e+06 5.5e+03 4.8e+01 0 0 0 0 3 0 0 2 1 5 0 MatMatMultNum 6 1.0 1.1243e+00 1.0 3.76e+07 1.2 2.0e+06 5.5e+03 0.0e+00 0 0 0 0 0 0 0 1 0 0 1001203 MatPtAPSymbolic 6 1.0 1.7280e+01 1.0 0.00e+00 0.0 1.2e+07 3.2e+04 4.2e+01 1 0 0 2 2 1 0 3 6 4 0 MatPtAPNumeric 6 1.0 1.8047e+01 1.0 1.49e+09 5.1 2.8e+06 1.1e+05 2.4e+01 1 1 0 2 1 1 5 1 5 2 663675 MatTrnMatMultSym 1 1.0 3.0221e+01 1.0 0.00e+00 0.0 2.4e+06 5.8e+05 1.1e+01 1 0 0 7 1 1 0 1 22 1 0 MatGetLocalMat 19 1.0 1.3904e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetBrAoCol 18 1.0 1.9926e-01 5.0 0.00e+00 0.0 1.4e+07 2.3e+04 0.0e+00 0 0 0 2 0 0 0 4 5 0 0 MatGetSymTrans 2 1.0 1.8996e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecTDot 176 1.0 7.0632e-01 4.5 3.48e+07 1.0 0.0e+00 0.0e+00 1.8e+02 0 0 0 0 10 0 0 0 0 18 1608728 VecNorm 60 1.0 1.4074e+0012.2 1.58e+07 1.0 0.0e+00 0.0e+00 6.0e+01 0 0 0 0 3 0 0 0 0 6 366467 VecCopy 422 1.0 5.1259e-02 3.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 653 1.0 2.3974e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 165 1.0 6.5622e-03 1.3 3.42e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 170485467 VecAYPX 861 1.0 7.8529e-02 1.2 6.21e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 25785252 VecAXPBYCZ 264 1.0 4.1343e-02 1.5 5.85e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 46135592 VecAssemblyBegin 21 1.0 2.3463e-01 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAssemblyEnd 21 1.0 1.4457e-04 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecPointwiseMult 600 1.0 5.7510e-02 1.2 2.66e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 15075754 VecScatterBegin 902 1.0 5.1188e-01 1.2 0.00e+00 0.0 2.9e+08 5.3e+03 0.0e+00 0 0 10 8 0 0 0 82 25 0 0 VecScatterEnd 902 1.0 1.2143e+00 3.2 5.50e+0537.9 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1347 VecSetRandom 6 1.0 2.6354e-02 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 DualSpaceSetUp 7 1.0 5.3467e-0112.0 4.26e+03 1.0 0.0e+00 0.0e+00 1.3e+01 0 0 0 0 1 0 0 0 0 1 261 FESetUp 7 1.0 1.7541e-01128.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSetUp 15 1.0 2.7470e-01 1.1 2.04e+08 1.2 1.0e+07 5.5e+03 1.3e+02 0 0 0 0 7 0 2 3 1 13 22477233 KSPSolve 1 1.0 4.3257e+00 1.0 4.33e+09 1.1 2.5e+08 4.8e+03 6.6e+01 0 8 9 6 4 0 54 72 20 7 30855976 PCGAMGGraph_AGG 6 1.0 5.0969e+00 1.0 3.76e+07 1.2 5.1e+06 4.4e+03 4.8e+01 0 0 0 0 3 0 0 1 0 5 220852 PCGAMGCoarse_AGG 6 1.0 3.1121e+01 1.0 0.00e+00 0.0 2.5e+07 6.9e+04 5.5e+01 1 0 1 9 3 1 0 7 27 6 0 PCGAMGProl_AGG 6 1.0 5.8196e-01 1.0 0.00e+00 0.0 6.6e+06 9.3e+03 7.2e+01 0 0 0 0 4 0 0 2 1 7 0 PCGAMGPOpt_AGG 6 1.0 3.2414e+00 1.0 2.42e+08 1.2 2.1e+07 5.3e+03 1.6e+02 0 0 1 1 9 0 3 6 2 17 2256493 GAMG: createProl 6 1.0 4.0042e+01 1.0 2.80e+08 1.2 5.8e+07 3.3e+04 3.4e+02 1 1 2 10 19 1 3 16 31 34 210778 Graph 12 1.0 5.0926e+00 1.0 3.76e+07 1.2 5.1e+06 4.4e+03 4.8e+01 0 0 0 0 3 0 0 1 0 5 221038 MIS/Agg 6 1.0 3.5850e-01 1.3 0.00e+00 0.0 1.6e+07 1.2e+04 3.9e+01 0 0 1 1 2 0 0 5 3 4 0 SA: col data 6 1.0 3.0509e-01 1.0 0.00e+00 0.0 5.4e+06 9.2e+03 2.4e+01 0 0 0 0 1 0 0 2 1 2 0 SA: frmProl0 6 1.0 2.3467e-01 1.1 0.00e+00 0.0 1.3e+06 9.5e+03 2.4e+01 0 0 0 0 1 0 0 0 0 2 0 SA: smooth 6 1.0 2.7855e+00 1.0 4.14e+07 1.2 8.1e+06 5.5e+03 6.3e+01 0 0 0 0 3 0 1 2 1 6 446491 GAMG: partLevel 6 1.0 3.7266e+01 1.0 1.49e+09 5.1 1.5e+07 4.9e+04 3.2e+02 1 1 1 4 17 1 5 4 12 32 321395 repartition 5 1.0 2.0343e+00 1.1 0.00e+00 0.0 4.0e+05 1.4e+05 2.5e+02 0 0 0 0 14 0 0 0 1 25 0 Invert-Sort 5 1.0 1.5021e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 3.0e+01 0 0 0 0 2 0 0 0 0 3 0 Move A 5 1.0 1.1548e+00 1.0 0.00e+00 0.0 1.6e+05 3.4e+05 7.0e+01 0 0 0 0 4 0 0 0 1 7 0 Move P 5 1.0 4.2799e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 7.5e+01 0 0 0 0 4 0 0 0 0 8 0 PCGAMG Squ l00 1 1.0 3.0221e+01 1.0 0.00e+00 0.0 2.4e+06 5.8e+05 1.1e+01 1 0 0 7 1 1 0 1 22 1 0 PCGAMG Gal l00 1 1.0 8.7411e+00 1.0 2.93e+08 1.1 5.4e+06 4.5e+04 1.2e+01 0 1 0 1 1 0 4 2 4 1 1092355 PCGAMG Opt l00 1 1.0 1.9734e+00 1.0 3.36e+07 1.1 3.2e+06 1.2e+04 9.0e+00 0 0 0 0 0 0 0 1 1 1 555327 PCGAMG Gal l01 1 1.0 1.0153e+00 1.0 3.50e+07 1.4 5.9e+06 3.9e+04 1.2e+01 0 0 0 1 1 0 0 2 4 1 1079887 PCGAMG Opt l01 1 1.0 7.4812e-02 1.0 5.35e+05 1.2 3.2e+06 1.1e+03 9.0e+00 0 0 0 0 0 0 0 1 0 1 232542 PCGAMG Gal l02 1 1.0 1.8063e+00 1.0 7.43e+07 0.0 3.0e+06 5.9e+04 1.2e+01 0 0 0 1 1 0 0 1 3 1 593392 PCGAMG Opt l02 1 1.0 1.1580e-01 1.1 6.93e+05 0.0 1.6e+06 1.3e+03 9.0e+00 0 0 0 0 0 0 0 0 0 1 93213 PCGAMG Gal l03 1 1.0 6.1075e+00 1.0 2.72e+08 0.0 2.6e+05 9.2e+04 1.1e+01 0 0 0 0 1 0 0 0 0 1 36155 PCGAMG Opt l03 1 1.0 8.0836e-02 1.0 1.55e+06 0.0 1.4e+05 1.4e+03 8.0e+00 0 0 0 0 0 0 0 0 0 1 18229 PCGAMG Gal l04 1 1.0 1.6203e+01 1.0 9.44e+08 0.0 1.4e+04 3.0e+05 1.1e+01 0 0 0 0 1 0 0 0 0 1 2366 PCGAMG Opt l04 1 1.0 1.2663e-01 1.0 2.01e+06 0.0 6.9e+03 2.2e+03 8.0e+00 0 0 0 0 0 0 0 0 0 1 817 PCGAMG Gal l05 1 1.0 1.4800e+00 1.0 3.16e+08 0.0 9.0e+01 1.6e+05 1.1e+01 0 0 0 0 1 0 0 0 0 1 796 PCGAMG Opt l05 1 1.0 8.1763e-02 1.1 2.50e+06 0.0 4.8e+01 4.6e+03 8.0e+00 0 0 0 0 0 0 0 0 0 1 114 PCSetUp 2 1.0 7.7969e+01 1.0 1.97e+09 2.8 8.3e+07 3.3e+04 8.1e+02 2 2 3 14 44 2 11 23 43 82 341051 PCSetUpOnBlocks 22 1.0 2.4609e-0317.2 1.46e+06 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 592 PCApply 22 1.0 3.6455e+00 1.1 3.57e+09 1.2 2.4e+08 4.3e+03 0.0e+00 0 7 8 5 0 0 43 67 16 0 29434967 --- Event Stage 1: PCSetUp BuildTwoSided 4 1.0 1.5980e-01 2.7 0.00e+00 0.0 2.1e+05 8.0e+00 0.0e+00 0 0 0 0 0 0 0 1 0 0 0 BuildTwoSidedF 6 1.0 1.3169e+01 5.5 0.00e+00 0.0 1.9e+06 1.9e+05 0.0e+00 0 0 0 2 0 28 0 10 51 0 0 SFSetGraph 5 1.0 4.9640e-0519.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFSetUp 4 1.0 1.6038e-01 2.3 0.00e+00 0.0 6.4e+05 9.1e+02 0.0e+00 0 0 0 0 0 0 0 3 0 0 0 SFPack 30 1.0 3.3376e-04 4.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFUnpack 30 1.0 1.2101e-05 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatMult 30 1.0 1.5544e-01 1.5 1.87e+08 1.2 1.0e+07 5.5e+03 0.0e+00 0 0 0 0 0 0 31 53 8 0 35930640 MatAssemblyBegin 43 1.0 1.3201e+01 4.7 0.00e+00 0.0 1.9e+06 1.9e+05 0.0e+00 0 0 0 2 0 28 0 10 51 0 0 MatAssemblyEnd 43 1.0 1.1159e+01 1.0 2.77e+07705.7 0.0e+00 0.0e+00 2.0e+01 0 0 0 0 1 26 0 0 0 13 1036 MatZeroEntries 6 1.0 4.7315e-0410.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatTranspose 12 1.0 2.5142e-02 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatMatMultSym 10 1.0 5.8783e-0117.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatPtAPSymbolic 5 1.0 1.4489e+01 1.0 0.00e+00 0.0 6.2e+06 3.6e+04 3.5e+01 0 0 0 1 2 34 0 32 31 22 0 MatPtAPNumeric 6 1.0 2.8457e+01 1.0 1.50e+09 5.1 2.7e+06 1.6e+05 2.0e+01 1 1 0 2 1 66 66 14 61 13 421190 MatGetLocalMat 6 1.0 9.8574e-03 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetBrAoCol 6 1.0 3.7669e-01 2.3 0.00e+00 0.0 5.1e+06 3.8e+04 0.0e+00 0 0 0 1 0 0 0 27 28 0 0 VecTDot 66 1.0 6.5271e-02 4.1 5.85e+06 1.0 0.0e+00 0.0e+00 6.6e+01 0 0 0 0 4 0 1 0 0 42 2922260 VecNorm 36 1.0 1.1226e-02 3.2 3.19e+06 1.0 0.0e+00 0.0e+00 3.6e+01 0 0 0 0 2 0 1 0 0 23 9268067 VecCopy 12 1.0 1.2805e-03 3.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 11 1.0 6.6620e-05 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 60 1.0 1.0763e-03 1.5 5.32e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 161104914 VecAYPX 24 1.0 2.0581e-03 1.3 2.13e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 33701038 VecPointwiseMult 36 1.0 3.5709e-03 1.3 1.60e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 14567861 VecScatterBegin 30 1.0 2.9079e-03 7.8 0.00e+00 0.0 1.0e+07 5.5e+03 0.0e+00 0 0 0 0 0 0 0 53 8 0 0 VecScatterEnd 30 1.0 3.7015e-0263.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSetUp 7 1.0 2.3165e-01 1.0 2.04e+08 1.2 1.0e+07 5.5e+03 1.0e+02 0 0 0 0 6 1 34 53 8 64 26654598 PCGAMG Gal l00 1 1.0 4.7415e+00 1.0 2.94e+08 1.1 1.8e+06 7.8e+04 0.0e+00 0 1 0 1 0 11 53 9 20 0 2015623 PCGAMG Gal l01 1 1.0 1.2103e+00 1.0 3.50e+07 1.4 4.8e+06 6.2e+04 1.2e+01 0 0 0 2 1 3 6 25 41 8 905938 PCGAMG Gal l02 1 1.0 3.4334e+00 1.0 7.41e+07 0.0 2.2e+06 8.7e+04 1.2e+01 0 0 0 1 1 8 6 11 27 8 312184 PCGAMG Gal l03 1 1.0 9.6062e+00 1.0 2.71e+08 0.0 1.9e+05 1.3e+05 1.1e+01 0 0 0 0 1 22 1 1 4 7 22987 PCGAMG Gal l04 1 1.0 2.2482e+01 1.0 9.43e+08 0.0 8.7e+03 4.8e+05 1.1e+01 1 0 0 0 1 52 0 0 1 7 1705 PCGAMG Gal l05 1 1.0 1.5961e+00 1.1 3.16e+08 0.0 6.8e+01 2.2e+05 1.1e+01 0 0 0 0 1 4 0 0 0 7 738 PCSetUp 1 1.0 4.3191e+01 1.0 1.70e+09 3.6 1.9e+07 3.7e+04 1.6e+02 1 1 1 4 9 100100100100100 420463 --- Event Stage 2: KSP Solve only SFPack 8140 1.0 7.4247e-02 4.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFUnpack 8140 1.0 1.2905e-02 5.2 5.50e+0637.9 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1267207 MatMult 5500 1.0 2.9994e+01 1.2 3.98e+10 1.1 2.0e+09 6.1e+03 0.0e+00 1 76 68 62 0 70 92 78 98 0 40747181 MatMultAdd 1320 1.0 6.2192e+00 2.7 7.97e+08 1.2 2.8e+08 4.6e+02 0.0e+00 0 2 10 1 0 14 2 11 1 0 3868976 MatMultTranspose 1320 1.0 4.0304e+00 1.7 8.00e+08 1.2 2.8e+08 4.6e+02 0.0e+00 0 2 10 1 0 7 2 11 1 0 5974153 MatSolve 220 0.0 6.7366e-03 0.0 7.41e+06 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1100 MatLUFactorSym 1 1.0 5.8691e-0435.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatLUFactorNum 1 1.0 1.5955e-03756.2 1.46e+06 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 913 MatResidual 1320 1.0 6.4920e+00 1.3 8.27e+09 1.2 4.4e+08 5.5e+03 0.0e+00 0 15 15 13 0 14 19 18 20 0 38146350 MatGetRowIJ 1 0.0 2.7820e-05 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetOrdering 1 0.0 9.6940e-05 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecTDot 440 1.0 4.6162e+00 6.9 2.31e+08 1.0 0.0e+00 0.0e+00 4.4e+02 0 0 0 0 24 5 1 0 0 66 1635124 VecNorm 230 1.0 3.9605e-02 1.6 1.21e+08 1.0 0.0e+00 0.0e+00 2.3e+02 0 0 0 0 13 0 0 0 0 34 99622387 VecCopy 3980 1.0 5.4166e-01 4.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 4640 1.0 1.4216e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 440 1.0 4.2829e-02 1.3 2.31e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 1 0 0 0 176236363 VecAYPX 8130 1.0 7.3998e-01 1.2 5.78e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 2 1 0 0 0 25489392 VecAXPBYCZ 2640 1.0 3.9974e-01 1.5 5.85e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 1 1 0 0 0 47716315 VecPointwiseMult 5280 1.0 5.9845e-01 1.5 2.34e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 1 0 0 0 12748927 VecScatterBegin 8140 1.0 4.9231e-01 5.9 0.00e+00 0.0 2.5e+09 4.9e+03 0.0e+00 0 0 87 64 0 1 0100100 0 0 VecScatterEnd 8140 1.0 1.0172e+01 3.6 5.50e+0637.9 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 13 0 0 0 0 1608 KSPSetUp 1 1.0 9.5996e-07 3.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 10 1.0 3.9685e+01 1.0 4.33e+10 1.1 2.5e+09 4.9e+03 6.7e+02 1 83 87 64 37 100100100100100 33637495 PCSetUp 1 1.0 2.4149e-0318.1 1.46e+06 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 603 PCSetUpOnBlocks 220 1.0 2.6945e-03 8.9 1.46e+06 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 540 PCApply 220 1.0 3.2921e+01 1.1 3.57e+10 1.2 2.3e+09 4.3e+03 0.0e+00 1 67 81 53 0 81 80 93 82 0 32595360 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Container 112 112 69888 0. SNES 1 1 1532 0. DMSNES 1 1 720 0. Distributed Mesh 449 449 30060888 0. DM Label 790 790 549840 0. Quadrature 579 579 379824 0. Index Set 100215 100210 361926232 0. IS L to G Mapping 8 13 4356552 0. Section 771 771 598296 0. Star Forest Graph 897 897 1053640 0. Discrete System 521 521 533512 0. GraphPartitioner 118 118 91568 0. Matrix 432 462 2441805304 0. Matrix Coarsen 6 6 4032 0. Vector 354 354 65492968 0. Linear Space 7 7 5208 0. Dual Space 111 111 113664 0. FE Space 7 7 5992 0. Field over DM 6 6 4560 0. Krylov Solver 21 21 37560 0. DMKSP interface 1 1 704 0. Preconditioner 21 21 21632 0. Viewer 2 1 896 0. PetscRandom 12 12 8520 0. --- Event Stage 1: PCSetUp Index Set 10 15 85367336 0. IS L to G Mapping 5 0 0 0. Star Forest Graph 5 5 6600 0. Matrix 50 20 73134024 0. Vector 28 28 6235096 0. --- Event Stage 2: KSP Solve only Index Set 5 5 8296 0. Matrix 1 1 273856 0. ======================================================================================================================== Average time to get PetscTime(): 6.40051e-08 Average time for MPI_Barrier(): 8.506e-06 Average time for zero size MPI_Send(): 6.6027e-06 #PETSc Option Table entries: -benchmark_it 10 -dm_distribute -dm_plex_box_dim 3 -dm_plex_box_faces 32,32,32 -dm_plex_box_lower 0,0,0 -dm_plex_box_simplex 0 -dm_plex_box_upper 1,1,1 -dm_refine 5 -ksp_converged_reason -ksp_max_it 150 -ksp_norm_type unpreconditioned -ksp_rtol 1.e-12 -ksp_type cg -log_view -matptap_via scalable -mg_levels_esteig_ksp_max_it 5 -mg_levels_esteig_ksp_type cg -mg_levels_ksp_max_it 2 -mg_levels_ksp_type chebyshev -mg_levels_pc_type jacobi -pc_gamg_agg_nsmooths 1 -pc_gamg_coarse_eq_limit 2000 -pc_gamg_coarse_grid_layout_type spread -pc_gamg_esteig_ksp_max_it 5 -pc_gamg_esteig_ksp_type cg -pc_gamg_process_eq_limit 500 -pc_gamg_repartition false -pc_gamg_reuse_interpolation true -pc_gamg_square_graph 1 -pc_gamg_threshold 0.01 -pc_gamg_threshold_scale .5 -pc_gamg_type agg -pc_type gamg -petscpartitioner_simple_node_grid 8,8,8 -petscpartitioner_simple_process_grid 4,4,4 -petscpartitioner_type simple -potential_petscspace_degree 2 -snes_converged_reason -snes_max_it 1 -snes_monitor -snes_rtol 1.e-8 -snes_type ksponly #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with 64 bit PetscInt Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 8 Configure options: CC=mpifccpx CXX=mpiFCCpx CFLAGS="-L /opt/FJSVxtclanga/tcsds-1.2.29/lib64 -lfjlapack" CXXFLAGS="-L /opt/FJSVxtclanga/tcsds-1.2.29/lib64 -lfjlapack" COPTFLAGS=-Kfast CXXOPTFLAGS=-Kfast --with-fc=0 --package-prefix-hash=/home/ra010009/a04199/petsc-hash-pkgs --with-batch=1 --with-shared-libraries=yes --with-debugging=no --with-64-bit-indices=1 PETSC_ARCH=arch-fugaku-fujitsu ----------------------------------------- Libraries compiled on 2021-02-12 02:27:41 on fn01sv08 Machine characteristics: Linux-3.10.0-957.27.2.el7.x86_64-x86_64-with-redhat-7.6-Maipo Using PETSc directory: /home/ra010009/a04199/petsc Using PETSc arch: ----------------------------------------- Using C compiler: mpifccpx -L /opt/FJSVxtclanga/tcsds-1.2.29/lib64 -lfjlapack -fPIC -Kfast ----------------------------------------- Using include paths: -I/home/ra010009/a04199/petsc/include -I/home/ra010009/a04199/petsc/arch-fugaku-fujitsu/include ----------------------------------------- Using C linker: mpifccpx Using libraries: -Wl,-rpath,/home/ra010009/a04199/petsc/lib -L/home/ra010009/a04199/petsc/lib -lpetsc -Wl,-rpath,/opt/FJSVxos/devkit/aarch64/lib/gcc/aarch64-linux-gnu/8 -L/opt/FJSVxos/devkit/aarch64/lib/gcc/aarch64-linux-gnu/8 -Wl,-rpath,/opt/FJSVxtclanga/tcsds-1.2.29/lib64 -L/opt/FJSVxtclanga/tcsds-1.2.29/lib64 -Wl,-rpath,/opt/FJSVxtclanga/.common/MELI022/lib64 -L/opt/FJSVxtclanga/.common/MELI022/lib64 -Wl,-rpath,/opt/FJSVxos/devkit/aarch64/aarch64-linux-gnu/lib64 -L/opt/FJSVxos/devkit/aarch64/aarch64-linux-gnu/lib64 -Wl,-rpath,/opt/FJSVxos/devkit/aarch64/rfs/usr/lib64 -L/opt/FJSVxos/devkit/aarch64/rfs/usr/lib64 -Wl,-rpath,/opt/FJSVxos/devkit/aarch64/rfs/opt/FJSVxos/mmm/lib64 -L/opt/FJSVxos/devkit/aarch64/rfs/opt/FJSVxos/mmm/lib64 -Wl,-rpath,/opt/FJSVxtclanga/tcsds-1.2.29/lib64/nofjobj -L/opt/FJSVxtclanga/tcsds-1.2.29/lib64/nofjobj -lX11 -lfjprofmpi -lfjlapack -ldl -lmpi_cxx -lmpi -lfjstring_internal -lfj90i -lfj90fmt_sve -lfj90f -lfjsrcinfo -lfjcrt -lfjprofcore -lfjprofomp -lfjc++ -lfjc++abi -lfjdemgl -lmpg -lm -lrt -lpthread -lelf -lz -lgcc_s -ldl -----------------------------------------