[petsc-users] GAMG Parallel Performance
Karin&NiKo
niko.karin at gmail.com
Thu Nov 15 10:50:46 CST 2018
Dear PETSc team,
I am solving a linear transient dynamic problem, based on a finite element
discretization. To do that, I am using FGMRES with GAMG as a
preconditioner. I consider here 10 time steps.
The problem has roughly 118e6 dof and I am running on 1000, 1500 and 2000
procs, so I have something like 118e3, 79e3 and 59e3 dof/proc.
I notice that the performance deteriorates when I increase the number of
processes.
Attached you will find the log_view output of the execution and the
detailed definition of the KSP.
Is the problem too small to run on that number of processes or is there
something wrong with my use of GAMG?
Thank you in advance for your help,
Nicolas
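For reference, a minimal sketch of the runtime options matching the solver configuration shown in the attached -ksp_view output (the executable name `./my_app` and the process count are placeholders; the -ksp/-pc flags are standard PETSc options consistent with the log, not the exact command line used):

```shell
# Hedged sketch: reproduces the FGMRES + GAMG setup seen in the attached log.
# "./my_app" stands in for the actual application binary.
mpiexec -n 1000 ./my_app \
  -ksp_type fgmres \
  -ksp_rtol 1e-8 \
  -ksp_monitor_true_residual \
  -ksp_converged_reason \
  -pc_type gamg \
  -pc_gamg_agg_nsmooths 1 \
  -ksp_view \
  -log_view
```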
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20181115/9c9caee6/attachment-0001.html>
-------------- next part --------------
---------------------------------------------- PETSc Performance Summary: ----------------------------------------------
Unknown Name on a arch-linux2-c-opt-mpi-ml-hypre named eocn0117 with 1000 processors, by B07947 Thu Nov 15 16:14:46 2018
Using Petsc Release Version 3.8.2, Nov, 09, 2017
Max Max/Min Avg Total
Time (sec): 1.661e+02 1.00034 1.661e+02
Objects: 1.401e+03 1.00143 1.399e+03
Flop: 7.695e+10 1.13672 7.354e+10 7.354e+13
Flop/sec: 4.633e+08 1.13672 4.428e+08 4.428e+11
MPI Messages: 3.697e+05 12.46258 1.179e+05 1.179e+08
MPI Message Lengths: 8.786e+08 3.98485 4.086e+03 4.817e+11
MPI Reductions: 2.635e+03 1.00000
Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
e.g., VecAXPY() for real vectors of length N --> 2N flop
and VecAXPY() for complex vectors of length N --> 8N flop
Summary of Stages: ----- Time ------ ----- Flop ----- --- Messages --- -- Message Lengths -- -- Reductions --
Avg %Total Avg %Total counts %Total Avg %Total counts %Total
0: Main Stage: 1.6608e+02 100.0% 7.3541e+13 100.0% 1.178e+08 99.9% 4.081e+03 99.9% 2.603e+03 98.8%
------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
Count: number of times phase was executed
Time and Flop: Max - maximum over all processors
Ratio - ratio of maximum to minimum over all processors
Mess: number of messages sent
Avg. len: average message length (bytes)
Reduct: number of global reductions
Global: entire computation
Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %F - percent flop in this phase
%M - percent messages in this phase %L - percent message lengths in this phase
%R - percent reductions in this phase
Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event Count Time (sec) Flop --- Global --- --- Stage --- Total
Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------
--- Event Stage 0: Main Stage
MatMult 7342 1.0 4.4956e+01 1.4 4.09e+10 1.2 9.6e+07 4.3e+03 0.0e+00 23 53 81 86 0 23 53 81 86 0 859939
MatMultAdd 1130 1.0 3.4048e+00 2.3 1.55e+09 1.1 8.4e+06 8.2e+02 0.0e+00 2 2 7 1 0 2 2 7 1 0 434274
MatMultTranspose 1130 1.0 4.7555e+00 3.8 1.55e+09 1.1 8.4e+06 8.2e+02 0.0e+00 1 2 7 1 0 1 2 7 1 0 310924
MatSolve 226 0.0 6.8927e-04 0.0 6.24e+04 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 90
MatSOR 6835 1.0 3.6061e+01 1.4 2.85e+10 1.1 0.0e+00 0.0e+00 0.0e+00 20 37 0 0 0 20 37 0 0 0 760198
MatLUFactorSym 1 1.0 1.0800e-0390.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatLUFactorNum 1 1.0 8.0395e-04421.5 1.09e+03 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1
MatScale 15 1.0 1.7925e-02 1.8 9.12e+06 1.1 6.6e+04 1.1e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 485856
MatResidual 1130 1.0 6.3576e+00 1.5 5.31e+09 1.2 1.5e+07 3.7e+03 0.0e+00 3 7 13 11 0 3 7 13 11 0 781728
MatAssemblyBegin 112 1.0 9.9765e-01 3.0 0.00e+00 0.0 2.1e+05 7.8e+04 7.4e+01 0 0 0 3 3 0 0 0 3 3 0
MatAssemblyEnd 112 1.0 6.8845e-01 1.1 0.00e+00 0.0 8.3e+05 3.4e+02 2.6e+02 0 0 1 0 10 0 0 1 0 10 0
MatGetRow 582170 1.0 8.5022e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetRowIJ 1 0.0 2.0885e-04 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatCreateSubMat 6 1.0 3.7804e-02 1.0 0.00e+00 0.0 5.6e+04 2.8e+03 1.0e+02 0 0 0 0 4 0 0 0 0 4 0
MatGetOrdering 1 0.0 4.4608e-04 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatCoarsen 5 1.0 3.2871e-02 1.1 0.00e+00 0.0 1.2e+06 4.9e+02 5.2e+01 0 0 1 0 2 0 0 1 0 2 0
MatZeroEntries 5 1.0 6.6769e-03 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatView 90 1.3 8.9249e-0216.6 0.00e+00 0.0 0.0e+00 0.0e+00 7.0e+01 0 0 0 0 3 0 0 0 0 3 0
MatAXPY 5 1.0 6.4984e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01 0 0 0 0 0 0 0 0 0 0 0
MatMatMult 5 1.0 6.8333e-01 1.0 1.41e+08 1.2 3.7e+05 1.0e+04 8.2e+01 0 0 0 1 3 0 0 0 1 3 193093
MatMatMultSym 5 1.0 4.8541e-01 1.0 0.00e+00 0.0 3.0e+05 7.8e+03 7.0e+01 0 0 0 0 3 0 0 0 0 3 0
MatMatMultNum 5 1.0 1.9432e-01 1.0 1.41e+08 1.2 6.6e+04 2.2e+04 1.0e+01 0 0 0 0 0 0 0 0 0 0 679018
MatPtAP 5 1.0 4.2329e+00 1.0 1.54e+09 1.5 8.3e+05 4.3e+04 8.7e+01 3 2 1 7 3 3 2 1 7 3 292103
MatPtAPSymbolic 5 1.0 2.7832e+00 1.0 0.00e+00 0.0 3.5e+05 5.6e+04 3.7e+01 2 0 0 4 1 2 0 0 4 1 0
MatPtAPNumeric 5 1.0 1.4511e+00 1.0 1.54e+09 1.5 4.8e+05 3.3e+04 5.0e+01 1 2 0 3 2 1 2 0 3 2 852080
MatTrnMatMult 1 1.0 1.5337e+00 1.0 5.87e+07 1.3 6.9e+04 8.1e+04 1.9e+01 1 0 0 1 1 1 0 0 1 1 36505
MatTrnMatMultSym 1 1.0 9.2151e-01 1.0 0.00e+00 0.0 5.7e+04 3.4e+04 1.7e+01 1 0 0 0 1 1 0 0 0 1 0
MatTrnMatMultNum 1 1.0 6.1297e-01 1.0 5.87e+07 1.3 1.1e+04 3.2e+05 2.0e+00 0 0 0 1 0 0 0 0 1 0 91341
MatGetLocalMat 17 1.0 5.4432e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetBrAoCol 15 1.0 7.0758e-02 2.1 0.00e+00 0.0 4.6e+05 4.2e+04 0.0e+00 0 0 0 4 0 0 0 0 4 0 0
VecMDot 329 1.0 6.2030e+0013.7 6.68e+08 1.0 0.0e+00 0.0e+00 3.3e+02 1 1 0 0 12 1 1 0 0 13 106230
VecNorm 595 1.0 1.1655e+00 8.5 1.21e+08 1.0 0.0e+00 0.0e+00 6.0e+02 0 0 0 0 23 0 0 0 0 23 102761
VecScale 349 1.0 6.6033e-02 4.6 3.13e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 467735
VecCopy 1386 1.0 1.0624e-01 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecSet 4392 1.0 8.6035e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAXPY 246 1.0 4.8357e-02 1.4 5.69e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1160750
VecAYPX 9276 1.0 4.4571e-01 1.4 3.11e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 687917
VecAXPBYCZ 4520 1.0 2.8744e-01 1.4 5.66e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 1939847
VecMAXPY 575 1.0 8.4132e-01 1.5 1.36e+09 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 1600021
VecAssemblyBegin 185 1.0 6.6342e-02 1.3 0.00e+00 0.0 2.3e+04 2.2e+04 5.5e+02 0 0 0 0 21 0 0 0 0 21 0
VecAssemblyEnd 185 1.0 4.2391e-04 4.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecPointwiseMult 55 1.0 3.7224e-03 1.5 1.38e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 364534
VecScatterBegin 9786 1.0 8.6765e-01 5.5 0.00e+00 0.0 1.1e+08 3.8e+03 0.0e+00 0 0 97 90 0 0 0 97 90 0 0
VecScatterEnd 9786 1.0 1.9699e+01 9.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 5 0 0 0 0 5 0 0 0 0 0
VecSetRandom 5 1.0 4.3778e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecNormalize 113 1.0 9.7592e-02 3.3 9.34e+06 1.0 0.0e+00 0.0e+00 1.1e+02 0 0 0 0 4 0 0 0 0 4 94297
KSPGMRESOrthog 326 1.0 6.4559e+00 9.1 1.33e+09 1.0 0.0e+00 0.0e+00 3.3e+02 1 2 0 0 12 1 2 0 0 13 203262
KSPSetUp 18 1.0 1.4065e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.4e+01 0 0 0 0 1 0 0 0 0 1 0
KSPSolve 10 1.0 7.9545e+01 1.0 7.50e+10 1.1 1.1e+08 3.8e+03 8.1e+02 48 98 95 89 31 48 98 95 89 31 903224
PCGAMGGraph_AGG 5 1.0 1.2315e+00 1.0 2.25e+06 1.2 3.3e+05 4.2e+02 1.3e+02 1 0 0 0 5 1 0 0 0 5 1759
PCGAMGCoarse_AGG 5 1.0 1.5847e+00 1.0 5.87e+07 1.3 1.3e+06 5.2e+03 8.7e+01 1 0 1 1 3 1 0 1 1 3 35331
PCGAMGProl_AGG 5 1.0 3.5152e-01 1.0 0.00e+00 0.0 2.3e+06 1.5e+03 9.0e+02 0 0 2 1 34 0 0 2 1 35 0
PCGAMGPOpt_AGG 5 1.0 1.0543e+00 1.0 4.17e+08 1.2 1.0e+06 6.1e+03 2.4e+02 1 1 1 1 9 1 1 1 1 9 372220
GAMG: createProl 5 1.0 4.2217e+00 1.0 4.78e+08 1.2 5.0e+06 3.3e+03 1.4e+03 3 1 4 3 52 3 1 4 3 52 106734
Graph 10 1.0 1.2300e+00 1.0 2.25e+06 1.2 3.3e+05 4.2e+02 1.3e+02 1 0 0 0 5 1 0 0 0 5 1761
MIS/Agg 5 1.0 3.2935e-02 1.1 0.00e+00 0.0 1.2e+06 4.9e+02 5.2e+01 0 0 1 0 2 0 0 1 0 2 0
SA: col data 5 1.0 1.3732e-01 1.0 0.00e+00 0.0 2.2e+06 1.2e+03 8.4e+02 0 0 2 1 32 0 0 2 1 32 0
SA: frmProl0 5 1.0 2.0841e-01 1.0 0.00e+00 0.0 9.5e+04 7.1e+03 5.0e+01 0 0 0 0 2 0 0 0 0 2 0
SA: smooth 5 1.0 7.6907e-01 1.0 1.48e+08 1.2 3.7e+05 1.0e+04 1.0e+02 0 0 0 1 4 0 0 0 1 4 180072
GAMG: partLevel 5 1.0 4.2824e+00 1.0 1.54e+09 1.5 8.9e+05 4.0e+04 2.5e+02 3 2 1 7 9 3 2 1 7 9 288729
repartition 3 1.0 2.1951e-03 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 1.8e+01 0 0 0 0 1 0 0 0 0 1 0
Invert-Sort 3 1.0 2.9290e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 0 0
Move A 3 1.0 2.8378e-02 1.1 0.00e+00 0.0 3.0e+04 5.2e+03 5.4e+01 0 0 0 0 2 0 0 0 0 2 0
Move P 3 1.0 1.5999e-02 1.2 0.00e+00 0.0 2.6e+04 4.0e+01 5.4e+01 0 0 0 0 2 0 0 0 0 2 0
PCSetUp 2 1.0 8.5208e+00 1.0 2.01e+09 1.4 5.8e+06 8.9e+03 1.6e+03 5 2 5 11 62 5 2 5 11 63 197991
PCSetUpOnBlocks 226 1.0 1.7779e-0310.4 1.09e+03 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1
PCApply 226 1.0 6.9594e+01 1.1 6.40e+10 1.1 1.1e+08 3.3e+03 1.0e+02 41 83 90 72 4 41 83 90 72 4 878121
SFSetGraph 5 1.0 5.3883e-0556.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
SFBcastBegin 62 1.0 9.9101e-03 1.7 0.00e+00 0.0 1.2e+06 4.9e+02 0.0e+00 0 0 1 0 0 0 0 1 0 0 0
SFBcastEnd 62 1.0 6.8467e-03 6.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
BuildTwoSided 5 1.0 7.4060e-03 2.8 0.00e+00 0.0 3.3e+04 4.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
------------------------------------------------------------------------------------------------------------------------
Memory usage is given in bytes:
Object Type Creations Destructions Memory Descendants' Mem.
Reports information only for process 0.
--- Event Stage 0: Main Stage
Matrix 154 154 347782424 0.
Matrix Coarsen 5 5 3140 0.
Matrix Null Space 1 1 688 0.
Vector 1035 1035 582369520 0.
Vector Scatter 36 36 39744 0.
Index Set 112 112 484084 0.
Krylov Solver 18 18 330336 0.
Preconditioner 13 13 12868 0.
PetscRandom 10 10 6380 0.
Star Forest Graph 5 5 4280 0.
Viewer 12 11 9152 0.
========================================================================================================================
0 KSP unpreconditioned resid norm 3.738834777485e+08 true resid norm 3.738834777485e+08 ||r(i)||/||b|| 1.000000000000e+00
1 KSP unpreconditioned resid norm 1.256707592053e+08 true resid norm 1.256707592053e+08 ||r(i)||/||b|| 3.361227940911e-01
2 KSP unpreconditioned resid norm 1.824938621520e+07 true resid norm 1.824938621520e+07 ||r(i)||/||b|| 4.881035750789e-02
3 KSP unpreconditioned resid norm 6.102002718084e+06 true resid norm 6.102002718084e+06 ||r(i)||/||b|| 1.632060008329e-02
4 KSP unpreconditioned resid norm 2.562432902883e+06 true resid norm 2.562432902883e+06 ||r(i)||/||b|| 6.853560147439e-03
5 KSP unpreconditioned resid norm 1.188336046012e+06 true resid norm 1.188336046012e+06 ||r(i)||/||b|| 3.178359346520e-03
6 KSP unpreconditioned resid norm 5.326022866065e+05 true resid norm 5.326022866065e+05 ||r(i)||/||b|| 1.424514102131e-03
7 KSP unpreconditioned resid norm 2.433972087119e+05 true resid norm 2.433972087119e+05 ||r(i)||/||b|| 6.509974984122e-04
8 KSP unpreconditioned resid norm 1.095996827533e+05 true resid norm 1.095996827533e+05 ||r(i)||/||b|| 2.931386094225e-04
9 KSP unpreconditioned resid norm 4.986951871355e+04 true resid norm 4.986951871355e+04 ||r(i)||/||b|| 1.333825153597e-04
10 KSP unpreconditioned resid norm 2.330078182947e+04 true resid norm 2.330078182946e+04 ||r(i)||/||b|| 6.232097221779e-05
11 KSP unpreconditioned resid norm 1.084965391397e+04 true resid norm 1.084965391396e+04 ||r(i)||/||b|| 2.901881083191e-05
12 KSP unpreconditioned resid norm 5.108480961660e+03 true resid norm 5.108480961647e+03 ||r(i)||/||b|| 1.366329689776e-05
13 KSP unpreconditioned resid norm 2.450752492671e+03 true resid norm 2.450752492670e+03 ||r(i)||/||b|| 6.554856361741e-06
14 KSP unpreconditioned resid norm 1.181086403619e+03 true resid norm 1.181086403614e+03 ||r(i)||/||b|| 3.158969234817e-06
15 KSP unpreconditioned resid norm 5.606721134498e+02 true resid norm 5.606721134433e+02 ||r(i)||/||b|| 1.499590505629e-06
16 KSP unpreconditioned resid norm 2.700319247455e+02 true resid norm 2.700319247344e+02 ||r(i)||/||b|| 7.222355113430e-07
17 KSP unpreconditioned resid norm 1.314293551958e+02 true resid norm 1.314293551859e+02 ||r(i)||/||b|| 3.515249081809e-07
18 KSP unpreconditioned resid norm 6.357572858020e+01 true resid norm 6.357572858253e+01 ||r(i)||/||b|| 1.700415567047e-07
19 KSP unpreconditioned resid norm 3.077536939056e+01 true resid norm 3.077536939188e+01 ||r(i)||/||b|| 8.231272902779e-08
20 KSP unpreconditioned resid norm 1.504910881547e+01 true resid norm 1.504910882709e+01 ||r(i)||/||b|| 4.025079930707e-08
21 KSP unpreconditioned resid norm 7.400345249992e+00 true resid norm 7.400345259132e+00 ||r(i)||/||b|| 1.979318611161e-08
22 KSP unpreconditioned resid norm 3.607811417234e+00 true resid norm 3.607811420482e+00 ||r(i)||/||b|| 9.649560986776e-09
Linear solve converged due to CONVERGED_RTOL iterations 22
KSP Object: 1000 MPI processes
type: fgmres
restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
happy breakdown tolerance 1e-30
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-08, absolute=1e-50, divergence=10000.
right preconditioning
using UNPRECONDITIONED norm type for convergence test
PC Object: 1000 MPI processes
type: gamg
type is MULTIPLICATIVE, levels=6 cycles=v
Cycles per PCApply=1
Using externally compute Galerkin coarse grid matrices
GAMG specific options
Threshold for dropping small values in graph on each level = 0. 0. 0. 0.
Threshold scaling factor for each level not specified = 1.
AGG specific options
Symmetric graph false
Number of levels to square graph 1
Number smoothing steps 1
Coarse grid solver -- level -------------------------------
KSP Object: (mg_coarse_) 1000 MPI processes
type: preonly
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using NONE norm type for convergence test
PC Object: (mg_coarse_) 1000 MPI processes
type: bjacobi
number of blocks = 1000
Local solve is same for all blocks, in the following KSP and PC objects:
KSP Object: (mg_coarse_sub_) 1 MPI processes
type: preonly
maximum iterations=1, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using NONE norm type for convergence test
PC Object: (mg_coarse_sub_) 1 MPI processes
type: lu
out-of-place factorization
tolerance for zero pivot 2.22045e-14
using diagonal shift on blocks to prevent zero pivot [INBLOCKS]
matrix ordering: nd
factor fill ratio given 5., needed 1.
Factored matrix follows:
Mat Object: 1 MPI processes
type: seqaij
rows=12, cols=12, bs=6
package used to perform factorization: petsc
total: nonzeros=144, allocated nonzeros=144
total number of mallocs used during MatSetValues calls =0
using I-node routines: found 3 nodes, limit used is 5
linear system matrix = precond matrix:
Mat Object: 1 MPI processes
type: seqaij
rows=12, cols=12, bs=6
total: nonzeros=144, allocated nonzeros=144
total number of mallocs used during MatSetValues calls =0
using I-node routines: found 3 nodes, limit used is 5
linear system matrix = precond matrix:
Mat Object: 1000 MPI processes
type: mpiaij
rows=12, cols=12, bs=6
total: nonzeros=144, allocated nonzeros=144
total number of mallocs used during MatSetValues calls =0
using I-node (on process 0) routines: found 3 nodes, limit used is 5
Down solver (pre-smoother) on level 1 -------------------------------
KSP Object: (mg_levels_1_) 1000 MPI processes
type: chebyshev
eigenvalue estimates used: min = 0.0999997, max = 1.1
eigenvalues estimate via gmres min 0.0078618, max 0.999997
eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1]
KSP Object: (mg_levels_1_esteig_) 1000 MPI processes
type: gmres
restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
happy breakdown tolerance 1e-30
maximum iterations=10, initial guess is zero
tolerances: relative=1e-12, absolute=1e-50, divergence=10000.
left preconditioning
using PRECONDITIONED norm type for convergence test
estimating eigenvalues using noisy right hand side
maximum iterations=2, nonzero initial guess
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using NONE norm type for convergence test
PC Object: (mg_levels_1_) 1000 MPI processes
type: sor
type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.
linear system matrix = precond matrix:
Mat Object: 1000 MPI processes
type: mpiaij
rows=288, cols=288, bs=6
total: nonzeros=78408, allocated nonzeros=78408
total number of mallocs used during MatSetValues calls =0
using I-node (on process 0) routines: found 86 nodes, limit used is 5
Up solver (post-smoother) same as down solver (pre-smoother)
Down solver (pre-smoother) on level 2 -------------------------------
KSP Object: (mg_levels_2_) 1000 MPI processes
type: chebyshev
eigenvalue estimates used: min = 0.139457, max = 1.53403
eigenvalues estimate via gmres min 0.077969, max 1.39457
eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1]
KSP Object: (mg_levels_2_esteig_) 1000 MPI processes
type: gmres
restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
happy breakdown tolerance 1e-30
maximum iterations=10, initial guess is zero
tolerances: relative=1e-12, absolute=1e-50, divergence=10000.
left preconditioning
using PRECONDITIONED norm type for convergence test
estimating eigenvalues using noisy right hand side
maximum iterations=2, nonzero initial guess
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using NONE norm type for convergence test
PC Object: (mg_levels_2_) 1000 MPI processes
type: sor
type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.
linear system matrix = precond matrix:
Mat Object: 1000 MPI processes
type: mpiaij
rows=10254, cols=10254, bs=6
total: nonzeros=12883716, allocated nonzeros=12883716
total number of mallocs used during MatSetValues calls =0
using I-node (on process 0) routines: found 8 nodes, limit used is 5
Up solver (post-smoother) same as down solver (pre-smoother)
Down solver (pre-smoother) on level 3 -------------------------------
KSP Object: (mg_levels_3_) 1000 MPI processes
type: chebyshev
eigenvalue estimates used: min = 0.14493, max = 1.59423
eigenvalues estimate via gmres min 0.356008, max 1.4493
eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1]
KSP Object: (mg_levels_3_esteig_) 1000 MPI processes
type: gmres
restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
happy breakdown tolerance 1e-30
maximum iterations=10, initial guess is zero
tolerances: relative=1e-12, absolute=1e-50, divergence=10000.
left preconditioning
using PRECONDITIONED norm type for convergence test
estimating eigenvalues using noisy right hand side
maximum iterations=2, nonzero initial guess
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using NONE norm type for convergence test
PC Object: (mg_levels_3_) 1000 MPI processes
type: sor
type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.
linear system matrix = precond matrix:
Mat Object: 1000 MPI processes
type: mpiaij
rows=332466, cols=332466, bs=6
total: nonzeros=286141284, allocated nonzeros=286141284
total number of mallocs used during MatSetValues calls =0
using nonscalable MatPtAP() implementation
using I-node (on process 0) routines: found 88 nodes, limit used is 5
Up solver (post-smoother) same as down solver (pre-smoother)
Down solver (pre-smoother) on level 4 -------------------------------
KSP Object: (mg_levels_4_) 1000 MPI processes
type: chebyshev
eigenvalue estimates used: min = 0.175972, max = 1.93569
eigenvalues estimate via gmres min 0.145536, max 1.75972
eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1]
KSP Object: (mg_levels_4_esteig_) 1000 MPI processes
type: gmres
restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
happy breakdown tolerance 1e-30
maximum iterations=10, initial guess is zero
tolerances: relative=1e-12, absolute=1e-50, divergence=10000.
left preconditioning
using PRECONDITIONED norm type for convergence test
estimating eigenvalues using noisy right hand side
maximum iterations=2, nonzero initial guess
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using NONE norm type for convergence test
PC Object: (mg_levels_4_) 1000 MPI processes
type: sor
type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.
linear system matrix = precond matrix:
Mat Object: 1000 MPI processes
type: mpiaij
rows=5142126, cols=5142126, bs=6
total: nonzeros=1363101804, allocated nonzeros=1363101804
total number of mallocs used during MatSetValues calls =0
using nonscalable MatPtAP() implementation
using I-node (on process 0) routines: found 1522 nodes, limit used is 5
Up solver (post-smoother) same as down solver (pre-smoother)
Down solver (pre-smoother) on level 5 -------------------------------
KSP Object: (mg_levels_5_) 1000 MPI processes
type: chebyshev
eigenvalue estimates used: min = 0.234733, max = 2.58207
eigenvalues estimate via gmres min 0.061528, max 2.34733
eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1]
KSP Object: (mg_levels_5_esteig_) 1000 MPI processes
type: gmres
restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
happy breakdown tolerance 1e-30
maximum iterations=10, initial guess is zero
tolerances: relative=1e-12, absolute=1e-50, divergence=10000.
left preconditioning
using PRECONDITIONED norm type for convergence test
estimating eigenvalues using noisy right hand side
maximum iterations=2, nonzero initial guess
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using NONE norm type for convergence test
PC Object: (mg_levels_5_) 1000 MPI processes
type: sor
type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.
linear system matrix = precond matrix:
Mat Object: 1000 MPI processes
type: mpiaij
rows=117874305, cols=117874305, bs=3
total: nonzeros=9333251991, allocated nonzeros=9333251991
total number of mallocs used during MatSetValues calls =0
has attached near null space
using I-node (on process 0) routines: found 39198 nodes, limit used is 5
Up solver (post-smoother) same as down solver (pre-smoother)
linear system matrix = precond matrix:
Mat Object: 1000 MPI processes
type: mpiaij
rows=117874305, cols=117874305, bs=3
total: nonzeros=9333251991, allocated nonzeros=9333251991
total number of mallocs used during MatSetValues calls =0
has attached near null space
using I-node (on process 0) routines: found 39198 nodes, limit used is 5
-------------- next part --------------
---------------------------------------------- PETSc Performance Summary: ----------------------------------------------
Unknown Name on a arch-linux2-c-opt-mpi-ml-hypre named eobm0011 with 2000 processors, by B07947 Thu Nov 15 15:47:29 2018
Using Petsc Release Version 3.8.2, Nov, 09, 2017
Max Max/Min Avg Total
Time (sec): 2.837e+02 1.00021 2.836e+02
Objects: 1.409e+03 1.00142 1.407e+03
Flop: 3.920e+10 1.16752 3.710e+10 7.420e+13
Flop/sec: 1.382e+08 1.16751 1.308e+08 2.616e+11
MPI Messages: 4.031e+05 10.62284 1.243e+05 2.486e+08
MPI Message Lengths: 6.348e+08 4.13328 2.721e+03 6.762e+11
MPI Reductions: 2.654e+03 1.00000
Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
e.g., VecAXPY() for real vectors of length N --> 2N flop
and VecAXPY() for complex vectors of length N --> 8N flop
Summary of Stages: ----- Time ------ ----- Flop ----- --- Messages --- -- Message Lengths -- -- Reductions --
Avg %Total Avg %Total counts %Total Avg %Total counts %Total
0: Main Stage: 2.8364e+02 100.0% 7.4202e+13 100.0% 2.484e+08 99.9% 2.718e+03 99.9% 2.622e+03 98.8%
------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
Count: number of times phase was executed
Time and Flop: Max - maximum over all processors
Ratio - ratio of maximum to minimum over all processors
Mess: number of messages sent
Avg. len: average message length (bytes)
Reduct: number of global reductions
Global: entire computation
Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %F - percent flop in this phase
%M - percent messages in this phase %L - percent message lengths in this phase
%R - percent reductions in this phase
Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event Count Time (sec) Flop --- Global --- --- Stage --- Total
Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------
--- Event Stage 0: Main Stage
MatMult 7470 1.0 4.7611e+01 1.9 2.11e+10 1.2 2.0e+08 2.9e+03 0.0e+00 11 53 81 86 0 11 53 81 86 0 827107
MatMultAdd 1150 1.0 3.8834e+00 3.5 8.06e+08 1.2 1.7e+07 5.7e+02 0.0e+00 1 2 7 1 0 1 2 7 1 0 388724
MatMultTranspose 1150 1.0 6.7493e+00 7.4 8.06e+08 1.2 1.7e+07 5.7e+02 0.0e+00 1 2 7 1 0 1 2 7 1 0 223663
MatSolve 230 0.0 8.3327e-04 0.0 6.35e+04 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 76
MatSOR 6955 1.0 2.9793e+01 2.8 1.41e+10 1.1 0.0e+00 0.0e+00 0.0e+00 9 37 0 0 0 9 37 0 0 0 912909
MatLUFactorSym 1 1.0 4.5509e-03561.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatLUFactorNum 1 1.0 3.5341e-031852.9 1.09e+03 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatScale 15 1.0 1.7186e-02 3.3 4.62e+06 1.2 1.4e+05 7.0e+02 0.0e+00 0 0 0 0 0 0 0 0 0 0 508009
MatResidual 1150 1.0 7.0952e+00 2.3 2.75e+09 1.2 3.1e+07 2.5e+03 0.0e+00 2 7 13 12 0 2 7 13 12 0 713964
MatAssemblyBegin 112 1.0 1.0418e+00 4.7 0.00e+00 0.0 4.3e+05 5.3e+04 7.4e+01 0 0 0 3 3 0 0 0 3 3 0
MatAssemblyEnd 112 1.0 5.9064e-01 1.1 0.00e+00 0.0 1.6e+06 2.4e+02 2.6e+02 0 0 1 0 10 0 0 1 0 10 0
MatGetRow 291670 1.0 3.9900e-02 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetRowIJ 1 0.0 4.3106e-04 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatCreateSubMat 6 1.0 4.7464e-02 1.0 0.00e+00 0.0 7.5e+04 2.0e+03 1.0e+02 0 0 0 0 4 0 0 0 0 4 0
MatGetOrdering 1 0.0 1.0009e-03 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatCoarsen 5 1.0 3.4372e-02 1.1 0.00e+00 0.0 3.0e+06 3.0e+02 5.9e+01 0 0 1 0 2 0 0 1 0 2 0
MatZeroEntries 5 1.0 5.3163e-03 3.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatView 90 1.3 6.1949e-01 5.2 0.00e+00 0.0 0.0e+00 0.0e+00 7.0e+01 0 0 0 0 3 0 0 0 0 3 0
MatAXPY 5 1.0 4.1116e-02 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01 0 0 0 0 0 0 0 0 0 0 0
MatMatMult 5 1.0 4.7434e-01 1.2 7.18e+07 1.2 7.5e+05 7.0e+03 8.2e+01 0 0 0 1 3 0 0 0 1 3 278596
MatMatMultSym 5 1.0 2.8504e-01 1.0 0.00e+00 0.0 6.2e+05 5.3e+03 7.0e+01 0 0 0 0 3 0 0 0 0 3 0
MatMatMultNum 5 1.0 1.1035e-01 1.0 7.18e+07 1.2 1.4e+05 1.5e+04 1.0e+01 0 0 0 0 0 0 0 0 0 0 1197494
MatPtAP 5 1.0 2.6336e+00 1.0 8.35e+08 1.7 1.7e+06 3.0e+04 8.7e+01 1 2 1 7 3 1 2 1 7 3 472910
MatPtAPSymbolic 5 1.0 1.6345e+00 1.0 0.00e+00 0.0 7.2e+05 3.8e+04 3.7e+01 1 0 0 4 1 1 0 0 4 1 0
MatPtAPNumeric 5 1.0 1.0015e+00 1.0 8.35e+08 1.7 9.3e+05 2.3e+04 5.0e+01 0 2 0 3 2 0 2 0 3 2 1243604
MatTrnMatMult 1 1.0 8.1209e-01 1.0 2.97e+07 1.3 1.5e+05 5.0e+04 1.9e+01 0 0 0 1 1 0 0 0 1 1 69321
MatTrnMatMultSym 1 1.0 4.7897e-01 1.0 0.00e+00 0.0 1.3e+05 2.1e+04 1.7e+01 0 0 0 0 1 0 0 0 0 1 0
MatTrnMatMultNum 1 1.0 3.3517e-01 1.0 2.97e+07 1.3 2.4e+04 2.0e+05 2.0e+00 0 0 0 1 0 0 0 0 1 0 167958
MatGetLocalMat 17 1.0 3.8855e-02 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetBrAoCol 15 1.0 6.2124e-02 2.5 0.00e+00 0.0 9.5e+05 2.9e+04 0.0e+00 0 0 0 4 0 0 0 0 4 0 0
VecMDot 333 1.0 1.2028e+0113.7 3.44e+08 1.0 0.0e+00 0.0e+00 3.3e+02 1 1 0 0 13 1 1 0 0 13 56587
VecNorm 603 1.0 2.7685e+00 4.5 6.16e+07 1.0 0.0e+00 0.0e+00 6.0e+02 0 0 0 0 23 0 0 0 0 23 43942
VecScale 353 1.0 9.8841e-03 1.8 1.59e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 3172544
VecCopy 1410 1.0 7.7031e-02 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecSet 4468 1.0 5.7269e-02 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAXPY 250 1.0 3.3906e-02 2.1 2.89e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1683301
VecAYPX 9440 1.0 3.6537e-01 2.9 1.59e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 854103
VecAXPBYCZ 4600 1.0 2.7121e-01 3.2 2.89e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 2092658
VecMAXPY 583 1.0 7.1103e-01 2.7 7.03e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 1955563
VecAssemblyBegin 185 1.0 1.1164e-01 1.4 0.00e+00 0.0 4.9e+04 1.4e+04 5.5e+02 0 0 0 0 21 0 0 0 0 21 0
VecAssemblyEnd 185 1.0 3.3379e-04 3.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecPointwiseMult 55 1.0 2.9728e-03 2.9 6.90e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 456523
VecScatterBegin 9954 1.0 1.0825e+00 7.3 0.00e+00 0.0 2.4e+08 2.5e+03 0.0e+00 0 0 97 90 0 0 0 97 90 0 0
VecScatterEnd 9954 1.0 3.8453e+0111.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 5 0 0 0 0 5 0 0 0 0 0
VecSetRandom 5 1.0 2.1403e-03 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecNormalize 113 1.0 7.4105e-02 6.0 4.68e+06 1.0 0.0e+00 0.0e+00 1.1e+02 0 0 0 0 4 0 0 0 0 4 124201
KSPGMRESOrthog 330 1.0 1.2168e+0110.7 6.86e+08 1.0 0.0e+00 0.0e+00 3.3e+02 1 2 0 0 12 1 2 0 0 13 111406
KSPSetUp 18 1.0 1.2172e-02 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 1.4e+01 0 0 0 0 1 0 0 0 0 1 0
KSPSolve 10 1.0 6.5991e+01 1.0 3.82e+10 1.2 2.4e+08 2.6e+03 8.2e+02 23 98 95 89 31 23 98 95 89 31 1098603
PCGAMGGraph_AGG 5 1.0 6.7798e-01 1.0 1.13e+06 1.2 6.8e+05 2.8e+02 1.3e+02 0 0 0 0 5 0 0 0 0 5 3197
PCGAMGCoarse_AGG 5 1.0 8.5740e-01 1.0 2.97e+07 1.3 3.3e+06 2.7e+03 9.4e+01 0 0 1 1 4 0 0 1 1 4 65658
PCGAMGProl_AGG 5 1.0 2.4710e-01 1.0 0.00e+00 0.0 4.8e+06 9.8e+02 9.0e+02 0 0 2 1 34 0 0 2 1 35 0
PCGAMGPOpt_AGG 5 1.0 7.5785e-01 1.0 2.12e+08 1.2 2.1e+06 4.1e+03 2.4e+02 0 1 1 1 9 0 1 1 1 9 518589
GAMG: createProl 5 1.0 2.5407e+00 1.0 2.43e+08 1.2 1.1e+07 2.1e+03 1.4e+03 1 1 4 3 51 1 1 4 3 52 177698
Graph 10 1.0 6.7570e-01 1.0 1.13e+06 1.2 6.8e+05 2.8e+02 1.3e+02 0 0 0 0 5 0 0 0 0 5 3208
MIS/Agg 5 1.0 3.4434e-02 1.1 0.00e+00 0.0 3.0e+06 3.0e+02 5.9e+01 0 0 1 0 2 0 0 1 0 2 0
SA: col data 5 1.0 1.3094e-01 1.0 0.00e+00 0.0 4.6e+06 8.2e+02 8.4e+02 0 0 2 1 31 0 0 2 1 32 0
SA: frmProl0 5 1.0 1.1028e-01 1.0 0.00e+00 0.0 1.9e+05 4.7e+03 5.0e+01 0 0 0 0 2 0 0 0 0 2 0
SA: smooth 5 1.0 5.2676e-01 1.2 7.53e+07 1.2 7.5e+05 7.0e+03 1.0e+02 0 0 0 1 4 0 0 0 1 4 263330
GAMG: partLevel 5 1.0 2.7087e+00 1.0 8.35e+08 1.7 1.7e+06 2.9e+04 2.5e+02 1 2 1 7 9 1 2 1 7 9 459805
repartition 3 1.0 5.6183e-03 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 1.8e+01 0 0 0 0 1 0 0 0 0 1 0
Invert-Sort 3 1.0 7.8020e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 0 0
Move A 3 1.0 4.1104e-02 1.1 0.00e+00 0.0 3.2e+04 4.5e+03 5.4e+01 0 0 0 0 2 0 0 0 0 2 0
Move P 3 1.0 1.8200e-02 1.3 0.00e+00 0.0 4.3e+04 3.6e+01 5.4e+01 0 0 0 0 2 0 0 0 0 2 0
PCSetUp 2 1.0 5.2812e+00 1.0 1.08e+09 1.5 1.3e+07 5.7e+03 1.6e+03 2 2 5 11 62 2 2 5 11 63 321316
PCSetUpOnBlocks 230 1.0 6.2256e-03 19.5 1.09e+03 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
PCApply 230 1.0 5.7271e+01 1.3 3.26e+10 1.2 2.2e+08 2.2e+03 1.0e+02 20 83 90 73 4 20 83 90 73 4 1074640
SFSetGraph 5 1.0 5.3167e-05 55.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
SFBcastBegin 69 1.0 1.1146e-02 1.9 0.00e+00 0.0 3.0e+06 3.0e+02 0.0e+00 0 0 1 0 0 0 0 1 0 0 0
SFBcastEnd 69 1.0 7.9596e-03 7.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
BuildTwoSided 5 1.0 7.5631e-03 2.8 0.00e+00 0.0 6.9e+04 4.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
------------------------------------------------------------------------------------------------------------------------
Memory usage is given in bytes:
Object Type Creations Destructions Memory Descendants' Mem.
Reports information only for process 0.
--- Event Stage 0: Main Stage
Matrix 154 154 176666644 0.
Matrix Coarsen 5 5 3140 0.
Matrix Null Space 1 1 688 0.
Vector 1043 1043 297405224 0.
Vector Scatter 36 36 39456 0.
Index Set 112 112 395240 0.
Krylov Solver 18 18 330336 0.
Preconditioner 13 13 12868 0.
PetscRandom 10 10 6380 0.
Star Forest Graph 5 5 4280 0.
Viewer 12 11 9152 0.
========================================================================================================================
0 KSP unpreconditioned resid norm 3.738834778097e+08 true resid norm 3.738834778097e+08 ||r(i)||/||b|| 1.000000000000e+00
1 KSP unpreconditioned resid norm 1.256561415764e+08 true resid norm 1.256561415764e+08 ||r(i)||/||b|| 3.360836972859e-01
2 KSP unpreconditioned resid norm 1.843932942229e+07 true resid norm 1.843932942229e+07 ||r(i)||/||b|| 4.931838531703e-02
3 KSP unpreconditioned resid norm 6.189553415818e+06 true resid norm 6.189553415818e+06 ||r(i)||/||b|| 1.655476581120e-02
4 KSP unpreconditioned resid norm 2.614928212473e+06 true resid norm 2.614928212473e+06 ||r(i)||/||b|| 6.993965680944e-03
5 KSP unpreconditioned resid norm 1.208975553355e+06 true resid norm 1.208975553355e+06 ||r(i)||/||b|| 3.233562393388e-03
6 KSP unpreconditioned resid norm 5.481792905733e+05 true resid norm 5.481792905733e+05 ||r(i)||/||b|| 1.466176825423e-03
7 KSP unpreconditioned resid norm 2.526854282559e+05 true resid norm 2.526854282559e+05 ||r(i)||/||b|| 6.758400497828e-04
8 KSP unpreconditioned resid norm 1.150052500229e+05 true resid norm 1.150052500229e+05 ||r(i)||/||b|| 3.075965022488e-04
9 KSP unpreconditioned resid norm 5.289416146528e+04 true resid norm 5.289416146528e+04 ||r(i)||/||b|| 1.414723158540e-04
10 KSP unpreconditioned resid norm 2.495584369428e+04 true resid norm 2.495584369427e+04 ||r(i)||/||b|| 6.674765047246e-05
11 KSP unpreconditioned resid norm 1.184780633606e+04 true resid norm 1.184780633605e+04 ||r(i)||/||b|| 3.168849932994e-05
12 KSP unpreconditioned resid norm 5.709557885707e+03 true resid norm 5.709557885717e+03 ||r(i)||/||b|| 1.527095532321e-05
13 KSP unpreconditioned resid norm 2.811037623050e+03 true resid norm 2.811037623058e+03 ||r(i)||/||b|| 7.518485811476e-06
14 KSP unpreconditioned resid norm 1.399589249024e+03 true resid norm 1.399589249031e+03 ||r(i)||/||b|| 3.743383519460e-06
15 KSP unpreconditioned resid norm 6.919705622362e+02 true resid norm 6.919705622376e+02 ||r(i)||/||b|| 1.850765287333e-06
16 KSP unpreconditioned resid norm 3.469221128804e+02 true resid norm 3.469221128823e+02 ||r(i)||/||b|| 9.278883220907e-07
17 KSP unpreconditioned resid norm 1.747835577077e+02 true resid norm 1.747835577094e+02 ||r(i)||/||b|| 4.674813627318e-07
18 KSP unpreconditioned resid norm 8.648881836541e+01 true resid norm 8.648881835829e+01 ||r(i)||/||b|| 2.313255960519e-07
19 KSP unpreconditioned resid norm 4.247581916935e+01 true resid norm 4.247581916507e+01 ||r(i)||/||b|| 1.136071040472e-07
20 KSP unpreconditioned resid norm 2.086023330347e+01 true resid norm 2.086023330200e+01 ||r(i)||/||b|| 5.579340767933e-08
21 KSP unpreconditioned resid norm 1.023525173739e+01 true resid norm 1.023525174086e+01 ||r(i)||/||b|| 2.737551228746e-08
22 KSP unpreconditioned resid norm 4.963414450847e+00 true resid norm 4.963414447514e+00 ||r(i)||/||b|| 1.327529763174e-08
23 KSP unpreconditioned resid norm 2.415620601642e+00 true resid norm 2.415620604831e+00 ||r(i)||/||b|| 6.460891556327e-09
Linear solve converged due to CONVERGED_RTOL iterations 23
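As a quick sanity check on the residual history above, the average reduction factor per FGMRES iteration can be estimated from the first and last reported norms (a minimal sketch; the two norms are copied from the log, everything else is arithmetic):

```python
# Estimate the average per-iteration residual reduction of the
# FGMRES/GAMG solve from the logged convergence history.
r0 = 3.738834778097e+08   # residual norm at iteration 0 (from the log)
r_final = 2.415620601642e+00  # residual norm at iteration 23 (from the log)
iterations = 23

# Geometric mean of the per-iteration reduction.
factor = (r_final / r0) ** (1.0 / iterations)
print(f"average residual reduction per iteration: {factor:.3f}")
```

A factor well below one (here roughly 0.44) confirms the preconditioner is behaving reasonably; the question is the parallel cost per iteration, not the iteration count.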
KSP Object: 2000 MPI processes
type: fgmres
restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
happy breakdown tolerance 1e-30
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-08, absolute=1e-50, divergence=10000.
right preconditioning
using UNPRECONDITIONED norm type for convergence test
PC Object: 2000 MPI processes
type: gamg
type is MULTIPLICATIVE, levels=6 cycles=v
Cycles per PCApply=1
Using externally computed Galerkin coarse grid matrices
GAMG specific options
Threshold for dropping small values in graph on each level = 0. 0. 0. 0.
Threshold scaling factor for each level not specified = 1.
AGG specific options
Symmetric graph false
Number of levels to square graph 1
Number smoothing steps 1
Coarse grid solver -- level -------------------------------
KSP Object: (mg_coarse_) 2000 MPI processes
type: preonly
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using NONE norm type for convergence test
PC Object: (mg_coarse_) 2000 MPI processes
type: bjacobi
number of blocks = 2000
Local solve is same for all blocks, in the following KSP and PC objects:
KSP Object: (mg_coarse_sub_) 1 MPI processes
type: preonly
maximum iterations=1, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using NONE norm type for convergence test
PC Object: (mg_coarse_sub_) 1 MPI processes
type: lu
out-of-place factorization
tolerance for zero pivot 2.22045e-14
using diagonal shift on blocks to prevent zero pivot [INBLOCKS]
matrix ordering: nd
factor fill ratio given 5., needed 1.
Factored matrix follows:
Mat Object: 1 MPI processes
type: seqaij
rows=12, cols=12, bs=6
package used to perform factorization: petsc
total: nonzeros=144, allocated nonzeros=144
total number of mallocs used during MatSetValues calls =0
using I-node routines: found 3 nodes, limit used is 5
linear system matrix = precond matrix:
Mat Object: 1 MPI processes
type: seqaij
rows=12, cols=12, bs=6
total: nonzeros=144, allocated nonzeros=144
total number of mallocs used during MatSetValues calls =0
using I-node routines: found 3 nodes, limit used is 5
linear system matrix = precond matrix:
Mat Object: 2000 MPI processes
type: mpiaij
rows=12, cols=12, bs=6
total: nonzeros=144, allocated nonzeros=144
total number of mallocs used during MatSetValues calls =0
using I-node (on process 0) routines: found 3 nodes, limit used is 5
Down solver (pre-smoother) on level 1 -------------------------------
KSP Object: (mg_levels_1_) 2000 MPI processes
type: chebyshev
eigenvalue estimates used: min = 0.0999937, max = 1.09993
eigenvalues estimate via gmres min 0.075342, max 0.999937
eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1]
KSP Object: (mg_levels_1_esteig_) 2000 MPI processes
type: gmres
restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
happy breakdown tolerance 1e-30
maximum iterations=10, initial guess is zero
tolerances: relative=1e-12, absolute=1e-50, divergence=10000.
left preconditioning
using PRECONDITIONED norm type for convergence test
estimating eigenvalues using noisy right hand side
maximum iterations=2, nonzero initial guess
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using NONE norm type for convergence test
PC Object: (mg_levels_1_) 2000 MPI processes
type: sor
type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.
linear system matrix = precond matrix:
Mat Object: 2000 MPI processes
type: mpiaij
rows=318, cols=318, bs=6
total: nonzeros=90828, allocated nonzeros=90828
total number of mallocs used during MatSetValues calls =0
using I-node (on process 0) routines: found 87 nodes, limit used is 5
Up solver (post-smoother) same as down solver (pre-smoother)
Down solver (pre-smoother) on level 2 -------------------------------
KSP Object: (mg_levels_2_) 2000 MPI processes
type: chebyshev
eigenvalue estimates used: min = 0.130639, max = 1.43703
eigenvalues estimate via gmres min 0.077106, max 1.30639
eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1]
KSP Object: (mg_levels_2_esteig_) 2000 MPI processes
type: gmres
restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
happy breakdown tolerance 1e-30
maximum iterations=10, initial guess is zero
tolerances: relative=1e-12, absolute=1e-50, divergence=10000.
left preconditioning
using PRECONDITIONED norm type for convergence test
estimating eigenvalues using noisy right hand side
maximum iterations=2, nonzero initial guess
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using NONE norm type for convergence test
PC Object: (mg_levels_2_) 2000 MPI processes
type: sor
type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.
linear system matrix = precond matrix:
Mat Object: 2000 MPI processes
type: mpiaij
rows=9870, cols=9870, bs=6
total: nonzeros=11941884, allocated nonzeros=11941884
total number of mallocs used during MatSetValues calls =0
using I-node (on process 0) routines: found 9 nodes, limit used is 5
Up solver (post-smoother) same as down solver (pre-smoother)
Down solver (pre-smoother) on level 3 -------------------------------
KSP Object: (mg_levels_3_) 2000 MPI processes
type: chebyshev
eigenvalue estimates used: min = 0.151779, max = 1.66957
eigenvalues estimate via gmres min 0.352485, max 1.51779
eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1]
KSP Object: (mg_levels_3_esteig_) 2000 MPI processes
type: gmres
restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
happy breakdown tolerance 1e-30
maximum iterations=10, initial guess is zero
tolerances: relative=1e-12, absolute=1e-50, divergence=10000.
left preconditioning
using PRECONDITIONED norm type for convergence test
estimating eigenvalues using noisy right hand side
maximum iterations=2, nonzero initial guess
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using NONE norm type for convergence test
PC Object: (mg_levels_3_) 2000 MPI processes
type: sor
type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.
linear system matrix = precond matrix:
Mat Object: 2000 MPI processes
type: mpiaij
rows=334476, cols=334476, bs=6
total: nonzeros=292009536, allocated nonzeros=292009536
total number of mallocs used during MatSetValues calls =0
using nonscalable MatPtAP() implementation
using I-node (on process 0) routines: found 50 nodes, limit used is 5
Up solver (post-smoother) same as down solver (pre-smoother)
Down solver (pre-smoother) on level 4 -------------------------------
KSP Object: (mg_levels_4_) 2000 MPI processes
type: chebyshev
eigenvalue estimates used: min = 0.181248, max = 1.99372
eigenvalues estimate via gmres min 0.141976, max 1.81248
eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1]
KSP Object: (mg_levels_4_esteig_) 2000 MPI processes
type: gmres
restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
happy breakdown tolerance 1e-30
maximum iterations=10, initial guess is zero
tolerances: relative=1e-12, absolute=1e-50, divergence=10000.
left preconditioning
using PRECONDITIONED norm type for convergence test
estimating eigenvalues using noisy right hand side
maximum iterations=2, nonzero initial guess
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using NONE norm type for convergence test
PC Object: (mg_levels_4_) 2000 MPI processes
type: sor
type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.
linear system matrix = precond matrix:
Mat Object: 2000 MPI processes
type: mpiaij
rows=5160228, cols=5160228, bs=6
total: nonzeros=1375082208, allocated nonzeros=1375082208
total number of mallocs used during MatSetValues calls =0
using nonscalable MatPtAP() implementation
using I-node (on process 0) routines: found 792 nodes, limit used is 5
Up solver (post-smoother) same as down solver (pre-smoother)
Down solver (pre-smoother) on level 5 -------------------------------
KSP Object: (mg_levels_5_) 2000 MPI processes
type: chebyshev
eigenvalue estimates used: min = 0.23761, max = 2.61371
eigenvalues estimate via gmres min 0.0632228, max 2.3761
eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1]
KSP Object: (mg_levels_5_esteig_) 2000 MPI processes
type: gmres
restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
happy breakdown tolerance 1e-30
maximum iterations=10, initial guess is zero
tolerances: relative=1e-12, absolute=1e-50, divergence=10000.
left preconditioning
using PRECONDITIONED norm type for convergence test
estimating eigenvalues using noisy right hand side
maximum iterations=2, nonzero initial guess
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using NONE norm type for convergence test
PC Object: (mg_levels_5_) 2000 MPI processes
type: sor
type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.
linear system matrix = precond matrix:
Mat Object: 2000 MPI processes
type: mpiaij
rows=117874305, cols=117874305, bs=3
total: nonzeros=9333251991, allocated nonzeros=9333251991
total number of mallocs used during MatSetValues calls =0
has attached near null space
using I-node (on process 0) routines: found 19690 nodes, limit used is 5
Up solver (post-smoother) same as down solver (pre-smoother)
linear system matrix = precond matrix:
Mat Object: 2000 MPI processes
type: mpiaij
rows=117874305, cols=117874305, bs=3
total: nonzeros=9333251991, allocated nonzeros=9333251991
total number of mallocs used during MatSetValues calls =0
has attached near null space
using I-node (on process 0) routines: found 19690 nodes, limit used is 5
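For reference, the solver configuration reported in the view above corresponds to an options set along these lines (a hedged reconstruction from the KSP/PC view, not the poster's actual options file; defaults shown in the view are included for completeness):

```
# Sketch of PETSc options implied by the KSPView output above
-ksp_type fgmres
-ksp_rtol 1e-8
-ksp_pc_side right
-pc_type gamg
-pc_gamg_agg_nsmooths 1
-mg_levels_ksp_type chebyshev
-mg_levels_pc_type sor
-mg_coarse_ksp_type preonly
-mg_coarse_pc_type bjacobi
```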
-------------- next part --------------
---------------------------------------------- PETSc Performance Summary: ----------------------------------------------
Unknown Name on a arch-linux2-c-opt-mpi-ml-hypre named eocn0055 with 1500 processors, by B07947 Thu Nov 15 15:55:02 2018
Using Petsc Release Version 3.8.2, Nov, 09, 2017
Max Max/Min Avg Total
Time (sec): 2.296e+02 1.00007 2.296e+02
Objects: 1.409e+03 1.00142 1.407e+03
Flop: 5.219e+10 1.14806 4.965e+10 7.447e+13
Flop/sec: 2.273e+08 1.14806 2.162e+08 3.243e+11
MPI Messages: 4.774e+05 14.16274 1.262e+05 1.893e+08
MPI Message Lengths: 7.718e+08 4.12637 3.102e+03 5.872e+11
MPI Reductions: 2.667e+03 1.00000
Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
e.g., VecAXPY() for real vectors of length N --> 2N flop
and VecAXPY() for complex vectors of length N --> 8N flop
Summary of Stages: ----- Time ------ ----- Flop ----- --- Messages --- -- Message Lengths -- -- Reductions --
Avg %Total Avg %Total counts %Total Avg %Total counts %Total
0: Main Stage: 2.2961e+02 100.0% 7.4472e+13 100.0% 1.892e+08 99.9% 3.099e+03 99.9% 2.635e+03 98.8%
------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
Count: number of times phase was executed
Time and Flop: Max - maximum over all processors
Ratio - ratio of maximum to minimum over all processors
Mess: number of messages sent
Avg. len: average message length (bytes)
Reduct: number of global reductions
Global: entire computation
Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %F - percent flop in this phase
%M - percent messages in this phase %L - percent message lengths in this phase
%R - percent reductions in this phase
Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event Count Time (sec) Flop --- Global --- --- Stage --- Total
Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------
--- Event Stage 0: Main Stage
MatMult 7470 1.0 6.1871e+01 1.8 2.79e+10 1.2 1.5e+08 3.3e+03 0.0e+00 18 53 81 86 0 18 53 81 86 0 636062
MatMultAdd 1150 1.0 4.4228e+00 3.1 1.07e+09 1.2 1.3e+07 6.4e+02 0.0e+00 1 2 7 1 0 1 2 7 1 0 340660
MatMultTranspose 1150 1.0 5.8074e+00 4.5 1.07e+09 1.2 1.3e+07 6.4e+02 0.0e+00 1 2 7 1 0 1 2 7 1 0 259436
MatSolve 230 0.0 7.8106e-04 0.0 6.35e+04 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 81
MatSOR 6955 1.0 4.0051e+01 2.6 1.90e+10 1.1 0.0e+00 0.0e+00 0.0e+00 15 37 0 0 0 15 37 0 0 0 686820
MatLUFactorSym 1 1.0 1.9209e-03 175.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatLUFactorNum 1 1.0 1.7691e-03 927.5 1.09e+03 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1
MatScale 15 1.0 2.3391e-02 4.6 6.13e+06 1.1 1.0e+05 8.0e+02 0.0e+00 0 0 0 0 0 0 0 0 0 0 372687
MatResidual 1150 1.0 9.1807e+00 2.2 3.63e+09 1.2 2.4e+07 2.8e+03 0.0e+00 2 7 13 12 0 2 7 13 12 0 551315
MatAssemblyBegin 112 1.0 9.0080e-01 2.5 0.00e+00 0.0 3.2e+05 6.2e+04 7.4e+01 0 0 0 3 3 0 0 0 3 3 0
MatAssemblyEnd 112 1.0 6.8422e-01 1.1 0.00e+00 0.0 1.2e+06 2.7e+02 2.6e+02 0 0 1 0 10 0 0 1 0 10 0
MatGetRow 388852 1.0 5.5644e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetRowIJ 1 0.0 1.6968e-03 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatCreateSubMat 6 1.0 6.8178e-02 1.0 0.00e+00 0.0 8.2e+04 1.8e+03 1.0e+02 0 0 0 0 4 0 0 0 0 4 0
MatGetOrdering 1 0.0 1.8709e-03 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatCoarsen 5 1.0 3.8623e-02 1.1 0.00e+00 0.0 2.7e+06 2.9e+02 7.2e+01 0 0 1 0 3 0 0 1 0 3 0
MatZeroEntries 5 1.0 6.3353e-03 2.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatView 90 1.3 6.1051e-01 5.3 0.00e+00 0.0 0.0e+00 0.0e+00 7.0e+01 0 0 0 0 3 0 0 0 0 3 0
MatAXPY 5 1.0 6.4298e-02 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01 0 0 0 0 0 0 0 0 0 0 0
MatMatMult 5 1.0 5.4698e-01 1.0 9.46e+07 1.2 5.7e+05 8.1e+03 8.2e+01 0 0 0 1 3 0 0 0 1 3 241395
MatMatMultSym 5 1.0 3.6737e-01 1.0 0.00e+00 0.0 4.6e+05 6.1e+03 7.0e+01 0 0 0 0 3 0 0 0 0 3 0
MatMatMultNum 5 1.0 1.7525e-01 1.0 9.46e+07 1.2 1.0e+05 1.7e+04 1.0e+01 0 0 0 0 0 0 0 0 0 0 753412
MatPtAP 5 1.0 3.4278e+00 1.0 1.10e+09 1.6 1.2e+06 3.4e+04 8.7e+01 1 2 1 7 3 1 2 1 7 3 361157
MatPtAPSymbolic 5 1.0 2.2084e+00 1.0 0.00e+00 0.0 5.4e+05 4.4e+04 3.7e+01 1 0 0 4 1 1 0 0 4 1 0
MatPtAPNumeric 5 1.0 1.2233e+00 1.0 1.10e+09 1.6 7.0e+05 2.7e+04 5.0e+01 1 2 0 3 2 1 2 0 3 2 1011960
MatTrnMatMult 1 1.0 1.0668e+00 1.0 3.95e+07 1.3 1.1e+05 6.0e+04 1.9e+01 0 0 0 1 1 0 0 0 1 1 52637
MatTrnMatMultSym 1 1.0 6.2306e-01 1.0 0.00e+00 0.0 9.0e+04 2.5e+04 1.7e+01 0 0 0 0 1 0 0 0 0 1 0
MatTrnMatMultNum 1 1.0 4.4524e-01 1.0 3.95e+07 1.3 1.8e+04 2.4e+05 2.0e+00 0 0 0 1 0 0 0 0 1 0 126116
MatGetLocalMat 17 1.0 5.2980e-02 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetBrAoCol 15 1.0 7.0928e-02 2.7 0.00e+00 0.0 7.2e+05 3.2e+04 0.0e+00 0 0 0 4 0 0 0 0 4 0 0
VecMDot 333 1.0 6.9139e+00 5.8 4.59e+08 1.0 0.0e+00 0.0e+00 3.3e+02 1 1 0 0 12 1 1 0 0 13 98444
VecNorm 603 1.0 3.4630e+00 7.2 8.20e+07 1.0 0.0e+00 0.0e+00 6.0e+02 0 0 0 0 23 0 0 0 0 23 35129
VecScale 353 1.0 5.6857e-02 5.8 2.11e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 551520
VecCopy 1410 1.0 1.0825e-01 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecSet 4468 1.0 8.2095e-02 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAXPY 250 1.0 5.0242e-02 2.3 3.85e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1135966
VecAYPX 9440 1.0 5.1576e-01 2.6 2.11e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 605004
VecAXPBYCZ 4600 1.0 3.6595e-01 2.7 3.85e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 1550752
VecMAXPY 583 1.0 1.0786e+00 3.4 9.38e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 1289187
VecAssemblyBegin 185 1.0 7.6495e-02 1.2 0.00e+00 0.0 3.5e+04 1.6e+04 5.5e+02 0 0 0 0 20 0 0 0 0 21 0
VecAssemblyEnd 185 1.0 3.8767e-04 3.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecPointwiseMult 55 1.0 4.6344e-03 2.8 9.20e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 292821
VecScatterBegin 9954 1.0 1.1589e+00 7.6 0.00e+00 0.0 1.8e+08 2.9e+03 0.0e+00 0 0 97 90 0 0 0 97 90 0 0
VecScatterEnd 9954 1.0 4.8668e+01 11.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 6 0 0 0 0 6 0 0 0 0 0
VecSetRandom 5 1.0 2.8229e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecNormalize 113 1.0 1.3515e-01 3.0 6.23e+06 1.0 0.0e+00 0.0e+00 1.1e+02 0 0 0 0 4 0 0 0 0 4 68095
KSPGMRESOrthog 330 1.0 7.1848e+00 4.6 9.14e+08 1.0 0.0e+00 0.0e+00 3.3e+02 1 2 0 0 12 1 2 0 0 13 188679
KSPSetUp 18 1.0 1.2331e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 1.4e+01 0 0 0 0 1 0 0 0 0 1 0
KSPSolve 10 1.0 8.7217e+01 1.0 5.09e+10 1.1 1.8e+08 2.9e+03 8.2e+02 38 98 94 89 31 38 98 94 89 31 834427
PCGAMGGraph_AGG 5 1.0 8.6509e-01 1.0 1.50e+06 1.2 5.2e+05 3.2e+02 1.3e+02 0 0 0 0 5 0 0 0 0 5 2505
PCGAMGCoarse_AGG 5 1.0 1.1177e+00 1.0 3.95e+07 1.3 2.9e+06 2.7e+03 1.1e+02 0 0 2 1 4 0 0 2 1 4 50240
PCGAMGProl_AGG 5 1.0 3.2632e-01 1.0 0.00e+00 0.0 3.7e+06 1.1e+03 9.0e+02 0 0 2 1 34 0 0 2 1 34 0
PCGAMGPOpt_AGG 5 1.0 9.1948e-01 1.0 2.80e+08 1.2 1.6e+06 4.7e+03 2.4e+02 0 1 1 1 9 0 1 1 1 9 427090
GAMG: createProl 5 1.0 3.2296e+00 1.0 3.20e+08 1.2 8.7e+06 2.3e+03 1.4e+03 1 1 5 3 52 1 1 5 3 52 139652
Graph 10 1.0 8.6263e-01 1.0 1.50e+06 1.2 5.2e+05 3.2e+02 1.3e+02 0 0 0 0 5 0 0 0 0 5 2512
MIS/Agg 5 1.0 3.8683e-02 1.1 0.00e+00 0.0 2.7e+06 2.9e+02 7.2e+01 0 0 1 0 3 0 0 1 0 3 0
SA: col data 5 1.0 1.6884e-01 1.0 0.00e+00 0.0 3.5e+06 9.4e+02 8.4e+02 0 0 2 1 31 0 0 2 1 32 0
SA: frmProl0 5 1.0 1.5309e-01 1.0 0.00e+00 0.0 1.4e+05 5.5e+03 5.0e+01 0 0 0 0 2 0 0 0 0 2 0
SA: smooth 5 1.0 6.2292e-01 1.0 9.92e+07 1.2 5.7e+05 8.1e+03 1.0e+02 0 0 0 1 4 0 0 0 1 4 222484
GAMG: partLevel 5 1.0 3.5234e+00 1.0 1.10e+09 1.6 1.3e+06 3.2e+04 2.5e+02 2 2 1 7 9 2 2 1 7 9 351357
repartition 3 1.0 4.2057e-03 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 1.8e+01 0 0 0 0 1 0 0 0 0 1 0
Invert-Sort 3 1.0 3.6008e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 0 0
Move A 3 1.0 5.2288e-02 1.1 0.00e+00 0.0 4.3e+04 3.4e+03 5.4e+01 0 0 0 0 2 0 0 0 0 2 0
Move P 3 1.0 2.8588e-02 1.2 0.00e+00 0.0 3.8e+04 3.3e+01 5.4e+01 0 0 0 0 2 0 0 0 0 2 0
PCSetUp 2 1.0 6.7734e+00 1.0 1.42e+09 1.5 1.0e+07 6.2e+03 1.7e+03 3 2 5 11 62 3 2 5 11 63 249358
PCSetUpOnBlocks 230 1.0 3.0496e-03 6.5 1.09e+03 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
PCApply 230 1.0 7.5882e+01 1.1 4.35e+10 1.2 1.7e+08 2.5e+03 1.0e+02 32 83 90 73 4 32 83 90 73 4 814742
SFSetGraph 5 1.0 2.3127e-05 24.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
SFBcastBegin 82 1.0 1.2156e-02 1.9 0.00e+00 0.0 2.7e+06 2.9e+02 0.0e+00 0 0 1 0 0 0 0 1 0 0 0
SFBcastEnd 82 1.0 9.9132e-03 9.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
BuildTwoSided 5 1.0 7.6580e-03 2.6 0.00e+00 0.0 5.2e+04 4.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
------------------------------------------------------------------------------------------------------------------------
Memory usage is given in bytes:
Object Type Creations Destructions Memory Descendants' Mem.
Reports information only for process 0.
--- Event Stage 0: Main Stage
Matrix 154 154 245066064 0.
Matrix Coarsen 5 5 3140 0.
Matrix Null Space 1 1 688 0.
Vector 1043 1043 395730096 0.
Vector Scatter 36 36 39456 0.
Index Set 112 112 566408 0.
Krylov Solver 18 18 330336 0.
Preconditioner 13 13 12868 0.
PetscRandom 10 10 6380 0.
Star Forest Graph 5 5 4280 0.
Viewer 12 11 9152 0.
========================================================================================================================
0 KSP unpreconditioned resid norm 3.738834778994e+08 true resid norm 3.738834778994e+08 ||r(i)||/||b|| 1.000000000000e+00
1 KSP unpreconditioned resid norm 1.279113569974e+08 true resid norm 1.279113569974e+08 ||r(i)||/||b|| 3.421155642289e-01
2 KSP unpreconditioned resid norm 1.874944207644e+07 true resid norm 1.874944207644e+07 ||r(i)||/||b|| 5.014782194118e-02
3 KSP unpreconditioned resid norm 6.305464086727e+06 true resid norm 6.305464086727e+06 ||r(i)||/||b|| 1.686478397536e-02
4 KSP unpreconditioned resid norm 2.648974672476e+06 true resid norm 2.648974672476e+06 ||r(i)||/||b|| 7.085027365634e-03
5 KSP unpreconditioned resid norm 1.239886218685e+06 true resid norm 1.239886218685e+06 ||r(i)||/||b|| 3.316236988195e-03
6 KSP unpreconditioned resid norm 5.641563718944e+05 true resid norm 5.641563718944e+05 ||r(i)||/||b|| 1.508909607517e-03
7 KSP unpreconditioned resid norm 2.606746938444e+05 true resid norm 2.606746938444e+05 ||r(i)||/||b|| 6.972083797577e-04
8 KSP unpreconditioned resid norm 1.184535518381e+05 true resid norm 1.184535518381e+05 ||r(i)||/||b|| 3.168194339682e-04
9 KSP unpreconditioned resid norm 5.392667623794e+04 true resid norm 5.392667623794e+04 ||r(i)||/||b|| 1.442339108990e-04
10 KSP unpreconditioned resid norm 2.520203694105e+04 true resid norm 2.520203694106e+04 ||r(i)||/||b|| 6.740612632217e-05
11 KSP unpreconditioned resid norm 1.185967319435e+04 true resid norm 1.185967319434e+04 ||r(i)||/||b|| 3.172023877859e-05
12 KSP unpreconditioned resid norm 5.627359926956e+03 true resid norm 5.627359926969e+03 ||r(i)||/||b|| 1.505110618577e-05
13 KSP unpreconditioned resid norm 2.702021069922e+03 true resid norm 2.702021069923e+03 ||r(i)||/||b|| 7.226906856392e-06
14 KSP unpreconditioned resid norm 1.307500233445e+03 true resid norm 1.307500233448e+03 ||r(i)||/||b|| 3.497079466561e-06
15 KSP unpreconditioned resid norm 6.250158790292e+02 true resid norm 6.250158790312e+02 ||r(i)||/||b|| 1.671686276545e-06
16 KSP unpreconditioned resid norm 3.038680168367e+02 true resid norm 3.038680168345e+02 ||r(i)||/||b|| 8.127345410977e-07
17 KSP unpreconditioned resid norm 1.504350436399e+02 true resid norm 1.504350436436e+02 ||r(i)||/||b|| 4.023580942618e-07
18 KSP unpreconditioned resid norm 7.388944694136e+01 true resid norm 7.388944694645e+01 ||r(i)||/||b|| 1.976269381081e-07
19 KSP unpreconditioned resid norm 3.596911660459e+01 true resid norm 3.596911660288e+01 ||r(i)||/||b|| 9.620408156298e-08
20 KSP unpreconditioned resid norm 1.769248937152e+01 true resid norm 1.769248936529e+01 ||r(i)||/||b|| 4.732086441662e-08
21 KSP unpreconditioned resid norm 8.746482066795e+00 true resid norm 8.746482056876e+00 ||r(i)||/||b|| 2.339360408761e-08
22 KSP unpreconditioned resid norm 4.283455600167e+00 true resid norm 4.283455596172e+00 ||r(i)||/||b|| 1.145665922506e-08
23 KSP unpreconditioned resid norm 2.096047551274e+00 true resid norm 2.096047547699e+00 ||r(i)||/||b|| 5.606151840341e-09
Linear solve converged due to CONVERGED_RTOL iterations 23
KSP Object: 1500 MPI processes
type: fgmres
restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
happy breakdown tolerance 1e-30
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-08, absolute=1e-50, divergence=10000.
right preconditioning
using UNPRECONDITIONED norm type for convergence test
PC Object: 1500 MPI processes
type: gamg
type is MULTIPLICATIVE, levels=6 cycles=v
Cycles per PCApply=1
Using externally computed Galerkin coarse grid matrices
GAMG specific options
Threshold for dropping small values in graph on each level = 0. 0. 0. 0.
Threshold scaling factor for each level not specified = 1.
AGG specific options
Symmetric graph false
Number of levels to square graph 1
Number smoothing steps 1
Coarse grid solver -- level -------------------------------
KSP Object: (mg_coarse_) 1500 MPI processes
type: preonly
maximum iterations=10000, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using NONE norm type for convergence test
PC Object: (mg_coarse_) 1500 MPI processes
type: bjacobi
number of blocks = 1500
Local solve is same for all blocks, in the following KSP and PC objects:
KSP Object: (mg_coarse_sub_) 1 MPI processes
type: preonly
maximum iterations=1, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using NONE norm type for convergence test
PC Object: (mg_coarse_sub_) 1 MPI processes
type: lu
out-of-place factorization
tolerance for zero pivot 2.22045e-14
using diagonal shift on blocks to prevent zero pivot [INBLOCKS]
matrix ordering: nd
factor fill ratio given 5., needed 1.
Factored matrix follows:
Mat Object: 1 MPI processes
type: seqaij
rows=12, cols=12, bs=6
package used to perform factorization: petsc
total: nonzeros=144, allocated nonzeros=144
total number of mallocs used during MatSetValues calls =0
using I-node routines: found 3 nodes, limit used is 5
linear system matrix = precond matrix:
Mat Object: 1 MPI processes
type: seqaij
rows=12, cols=12, bs=6
total: nonzeros=144, allocated nonzeros=144
total number of mallocs used during MatSetValues calls =0
using I-node routines: found 3 nodes, limit used is 5
linear system matrix = precond matrix:
Mat Object: 1500 MPI processes
type: mpiaij
rows=12, cols=12, bs=6
total: nonzeros=144, allocated nonzeros=144
total number of mallocs used during MatSetValues calls =0
using I-node (on process 0) routines: found 3 nodes, limit used is 5
Down solver (pre-smoother) on level 1 -------------------------------
KSP Object: (mg_levels_1_) 1500 MPI processes
type: chebyshev
eigenvalue estimates used: min = 0.0999807, max = 1.09979
eigenvalues estimate via gmres min 0.310311, max 0.999807
eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1]
KSP Object: (mg_levels_1_esteig_) 1500 MPI processes
type: gmres
restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
happy breakdown tolerance 1e-30
maximum iterations=10, initial guess is zero
tolerances: relative=1e-12, absolute=1e-50, divergence=10000.
left preconditioning
using PRECONDITIONED norm type for convergence test
estimating eigenvalues using noisy right hand side
maximum iterations=2, nonzero initial guess
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using NONE norm type for convergence test
PC Object: (mg_levels_1_) 1500 MPI processes
type: sor
type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.
linear system matrix = precond matrix:
Mat Object: 1500 MPI processes
type: mpiaij
rows=312, cols=312, bs=6
total: nonzeros=90792, allocated nonzeros=90792
total number of mallocs used during MatSetValues calls =0
using I-node (on process 0) routines: found 87 nodes, limit used is 5
Up solver (post-smoother) same as down solver (pre-smoother)
Down solver (pre-smoother) on level 2 -------------------------------
KSP Object: (mg_levels_2_) 1500 MPI processes
type: chebyshev
eigenvalue estimates used: min = 0.128747, max = 1.41622
eigenvalues estimate via gmres min 0.191833, max 1.28747
eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1]
KSP Object: (mg_levels_2_esteig_) 1500 MPI processes
type: gmres
restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
happy breakdown tolerance 1e-30
maximum iterations=10, initial guess is zero
tolerances: relative=1e-12, absolute=1e-50, divergence=10000.
left preconditioning
using PRECONDITIONED norm type for convergence test
estimating eigenvalues using noisy right hand side
maximum iterations=2, nonzero initial guess
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using NONE norm type for convergence test
PC Object: (mg_levels_2_) 1500 MPI processes
type: sor
type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.
linear system matrix = precond matrix:
Mat Object: 1500 MPI processes
type: mpiaij
rows=9990, cols=9990, bs=6
total: nonzeros=11862180, allocated nonzeros=11862180
total number of mallocs used during MatSetValues calls =0
using I-node (on process 0) routines: found 6 nodes, limit used is 5
Up solver (post-smoother) same as down solver (pre-smoother)
Down solver (pre-smoother) on level 3 -------------------------------
KSP Object: (mg_levels_3_) 1500 MPI processes
type: chebyshev
eigenvalue estimates used: min = 0.149515, max = 1.64466
eigenvalues estimate via gmres min 0.342896, max 1.49515
eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1]
KSP Object: (mg_levels_3_esteig_) 1500 MPI processes
type: gmres
restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
happy breakdown tolerance 1e-30
maximum iterations=10, initial guess is zero
tolerances: relative=1e-12, absolute=1e-50, divergence=10000.
left preconditioning
using PRECONDITIONED norm type for convergence test
estimating eigenvalues using noisy right hand side
maximum iterations=2, nonzero initial guess
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using NONE norm type for convergence test
PC Object: (mg_levels_3_) 1500 MPI processes
type: sor
type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.
linear system matrix = precond matrix:
Mat Object: 1500 MPI processes
type: mpiaij
rows=333960, cols=333960, bs=6
total: nonzeros=289654416, allocated nonzeros=289654416
total number of mallocs used during MatSetValues calls =0
using nonscalable MatPtAP() implementation
using I-node (on process 0) routines: found 39 nodes, limit used is 5
Up solver (post-smoother) same as down solver (pre-smoother)
Down solver (pre-smoother) on level 4 -------------------------------
KSP Object: (mg_levels_4_) 1500 MPI processes
type: chebyshev
eigenvalue estimates used: min = 0.173537, max = 1.90891
eigenvalues estimate via gmres min 0.143849, max 1.73537
eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1]
KSP Object: (mg_levels_4_esteig_) 1500 MPI processes
type: gmres
restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
happy breakdown tolerance 1e-30
maximum iterations=10, initial guess is zero
tolerances: relative=1e-12, absolute=1e-50, divergence=10000.
left preconditioning
using PRECONDITIONED norm type for convergence test
estimating eigenvalues using noisy right hand side
maximum iterations=2, nonzero initial guess
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using NONE norm type for convergence test
PC Object: (mg_levels_4_) 1500 MPI processes
type: sor
type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.
linear system matrix = precond matrix:
Mat Object: 1500 MPI processes
type: mpiaij
rows=5149116, cols=5149116, bs=6
total: nonzeros=1368332496, allocated nonzeros=1368332496
total number of mallocs used during MatSetValues calls =0
using nonscalable MatPtAP() implementation
using I-node (on process 0) routines: found 976 nodes, limit used is 5
Up solver (post-smoother) same as down solver (pre-smoother)
Down solver (pre-smoother) on level 5 -------------------------------
KSP Object: (mg_levels_5_) 1500 MPI processes
type: chebyshev
eigenvalue estimates used: min = 0.241719, max = 2.65891
eigenvalues estimate via gmres min 0.0638427, max 2.41719
eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1]
KSP Object: (mg_levels_5_esteig_) 1500 MPI processes
type: gmres
restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
happy breakdown tolerance 1e-30
maximum iterations=10, initial guess is zero
tolerances: relative=1e-12, absolute=1e-50, divergence=10000.
left preconditioning
using PRECONDITIONED norm type for convergence test
estimating eigenvalues using noisy right hand side
maximum iterations=2, nonzero initial guess
tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
left preconditioning
using NONE norm type for convergence test
PC Object: (mg_levels_5_) 1500 MPI processes
type: sor
type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.
linear system matrix = precond matrix:
Mat Object: 1500 MPI processes
type: mpiaij
rows=117874305, cols=117874305, bs=3
total: nonzeros=9333251991, allocated nonzeros=9333251991
total number of mallocs used during MatSetValues calls =0
has attached near null space
using I-node (on process 0) routines: found 26223 nodes, limit used is 5
Up solver (post-smoother) same as down solver (pre-smoother)
linear system matrix = precond matrix:
Mat Object: 1500 MPI processes
type: mpiaij
rows=117874305, cols=117874305, bs=3
total: nonzeros=9333251991, allocated nonzeros=9333251991
total number of mallocs used during MatSetValues calls =0
has attached near null space
using I-node (on process 0) routines: found 26223 nodes, limit used is 5
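For readers wanting to reproduce a similar setup, the hierarchy shown in the `-ksp_view` output above corresponds roughly to the runtime options below. This is a sketch read off this particular log, not a tuned recommendation: the executable name `./my_app` is hypothetical, and the smoother/eigenvalue-estimator settings shown (Chebyshev with 2 iterations, SOR, GMRES-based estimation with 10 iterations) are PETSc defaults for GAMG that the view output happens to confirm.

```shell
# Hedged sketch of options matching the solver view above
# (FGMRES outer solve, GAMG preconditioner, Chebyshev/SOR smoothers).
# "./my_app" is a placeholder for the actual application binary.
mpiexec -n 1500 ./my_app \
  -ksp_type fgmres \
  -pc_type gamg \
  -mg_levels_ksp_type chebyshev \
  -mg_levels_ksp_max_it 2 \
  -mg_levels_pc_type sor \
  -mg_levels_esteig_ksp_type gmres \
  -mg_levels_esteig_ksp_max_it 10 \
  -ksp_view
```

Note that the Chebyshev eigenvalue bounds reported per level (e.g. `min = 0.0999807, max = 1.09979` on level 1) are derived from the GMRES estimates via the translation interval `[0. 0.1; 0. 1.1]`, i.e. the smoother targets the upper part of the estimated spectrum.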