Hi All,<br><br>In my log_summary output, I found that nearly 80% of the total time is spent in KSPGMRESOrthog. This does not make sense to me (the log_summary output follows below). Does anyone have an idea what is going on?<br><br>Another question: I am using a two-level ASM preconditioner. On the coarse level I use GMRES preconditioned with one-level ASM to solve the coarse problem, so both the fine-level solver and the coarse-level solver call KSPGMRESOrthog. The log_summary output only reports the total time spent in KSPGMRESOrthog. How can I find out how much of that time is spent in the coarse-level KSPGMRESOrthog and how much in the fine-level KSPGMRESOrthog? Thanks.<br>
<br>Best,<br>Rongliang<br><br><br>------------------------------------------------------------------------------------------------------------------------<br>************************************************************************************************************************<br>
*** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document ***<br>************************************************************************************************************************<br>
<br>---------------------------------------------- PETSc Performance Summary: ----------------------------------------------<br><br>./joab on a Janus-nod named node1777 with 1024 processors, by ronglian Thu Nov 17 00:32:04 2011<br>
Using Petsc Release Version 3.2.0, Patch 4, Sun Oct 23 12:23:18 CDT 2011 <br><br> Max Max/Min Avg Total <br>Time (sec): 1.162e+03 1.00001 1.162e+03<br>Objects: 6.094e+03 1.00099 6.090e+03<br>
Flops: 6.284e+11 81.61246 4.097e+10 4.195e+13<br>Flops/sec: 5.410e+08 81.61201 3.527e+07 3.612e+10<br>MPI Messages: 4.782e+06 305.55857 3.053e+05 3.126e+08<br>MPI Message Lengths: 1.018e+10 254.67349 2.106e+03 6.583e+11<br>
MPI Reductions: 2.079e+05 1.00003<br><br>Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)<br> e.g., VecAXPY() for real vectors of length N --> 2N flops<br>
and VecAXPY() for complex vectors of length N --> 8N flops<br><br>Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions --<br> Avg %Total Avg %Total counts %Total Avg %Total counts %Total <br>
0: Main Stage: 1.1615e+03 100.0% 4.1953e+13 100.0% 3.126e+08 100.0% 2.106e+03 100.0% 2.079e+05 100.0% <br><br>------------------------------------------------------------------------------------------------------------------------<br>
See the 'Profiling' chapter of the users' manual for details on interpreting output.<br>Phase summary info:<br> Count: number of times phase was executed<br> Time and Flops: Max - maximum over all processors<br>
Ratio - ratio of maximum to minimum over all processors<br> Mess: number of messages sent<br> Avg. len: average message length<br> Reduct: number of global reductions<br> Global: entire computation<br>
Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().<br> %T - percent time in this phase %f - percent flops in this phase<br> %M - percent messages in this phase %L - percent message lengths in this phase<br>
%R - percent reductions in this phase<br> Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)<br>------------------------------------------------------------------------------------------------------------------------<br>
Event Count Time (sec) Flops --- Global --- --- Stage --- Total<br> Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %f %M %L %R %T %f %M %L %R Mflop/s<br>
------------------------------------------------------------------------------------------------------------------------<br><br>--- Event Stage 0: Main Stage<br><br>MatMult 102148 1.0 6.4223e+02 77.1 3.35e+10 14.0 1.3e+08 9.8e+02 0.0e+00 4 12 43 20 0 4 12 43 20 0 7698<br>
MatMultTranspose 2286 1.0 1.7585e+00 4.8 4.07e+08 1.5 7.4e+06 1.1e+03 0.0e+00 0 1 2 1 0 0 1 2 1 0 197783<br>MatSolve 97541 41.2 8.8720e+02 283.1 5.69e+11 199.6 0.0e+00 0.0e+00 0.0e+00 5 76 0 0 0 5 76 0 0 0 35949<br>
MatLUFactorSym 7 1.0 8.4092e-01 19.6 0.00e+00 0.0 0.0e+00 0.0e+00 2.1e+01 0 0 0 0 0 0 0 0 0 0 0<br>MatLUFactorNum 28 1.0 1.1228e+01 31.9 7.81e+09 19.7 0.0e+00 0.0e+00 0.0e+00 0 3 0 0 0 0 3 0 0 0 95551<br>
MatAssemblyBegin 168 1.0 2.3209e+01 30.3 0.00e+00 0.0 4.0e+05 3.3e+04 2.8e+02 2 0 0 2 0 2 0 0 2 0 0<br>MatAssemblyEnd 168 1.0 3.5127e+01 1.0 0.00e+00 0.0 7.0e+04 2.7e+02 2.2e+02 3 0 0 0 0 3 0 0 0 0 0<br>
MatGetRowIJ 7 2.3 2.0276e-02 15.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0<br>MatGetSubMatrice 28 1.0 1.8989e+00 4.4 0.00e+00 0.0 3.4e+05 3.5e+04 1.1e+02 0 0 0 2 0 0 0 0 2 0 0<br>
MatGetOrdering 7 2.3 4.9773e-01 19.7 0.00e+00 0.0 0.0e+00 0.0e+00 1.1e+01 0 0 0 0 0 0 0 0 0 0 0<br>MatIncreaseOvrlp 1 1.0 1.4734e-01 1.4 0.00e+00 0.0 6.9e+04 4.9e+02 8.0e+00 0 0 0 0 0 0 0 0 0 0 0<br>
MatPartitioning 1 1.0 9.0198e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 0 0 0 0 0 0 0 0 0 0 0<br>MatZeroEntries 70 1.0 1.2433e-01 3.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0<br>
VecDot 25 1.0 1.7762e-02 7.7 1.67e+05 2.7 0.0e+00 0.0e+00 2.5e+01 0 0 0 0 0 0 0 0 0 0 4775<br>VecMDot 95252 1.0 1.0035e+03 43.9 1.35e+10 17.9 0.0e+00 0.0e+00 9.5e+04 78 4 0 0 46 78 4 0 0 46 1751<br>
VecNorm 97622 1.0 3.8131e+01 2.4 5.13e+08 31.3 0.0e+00 0.0e+00 9.8e+04 3 0 0 0 47 3 0 0 0 47 1353<br>VecScale 97567 1.0 3.4131e-01 13.9 2.56e+08 31.5 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 75343<br>
VecCopy 9260 1.0 4.8895e-02 21.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0<br>VecSet 211118 1.0 3.0709e+00 72.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0<br>
VecAXPY 6963 1.0 9.5037e-02 3.5 5.61e+07 1.8 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 437314<br>VecWAXPY 2319 1.0 6.7898e-02 2.9 1.20e+07 1.5 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 150007<br>
VecMAXPY 97567 1.0 1.1990e+01 28.2 1.40e+10 18.1 0.0e+00 0.0e+00 0.0e+00 0 4 0 0 0 0 4 0 0 0 150771<br>VecAssemblyBegin 4648 1.0 2.5907e+01 5.0 0.00e+00 0.0 2.8e+06 8.9e+02 1.4e+04 1 0 1 0 7 1 0 1 0 7 0<br>
VecAssemblyEnd 4648 1.0 9.1485e-02 28.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0<br>VecScatterBegin 299601 1.0 1.6790e+01 98.2 0.00e+00 0.0 3.1e+08 2.0e+03 0.0e+00 0 0 99 96 0 0 0 99 96 0 0<br>
VecScatterEnd 299601 1.0 6.6906e+02 175.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 4 0 0 0 0 4 0 0 0 0 0<br>VecReduceArith 8 1.0 2.5487e-04 3.3 5.83e+04 2.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 144027<br>
VecReduceComm 4 1.0 9.3198e-04 3.1 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 0 0 0 0 0 0 0<br>VecNormalize 95269 1.0 3.7148e+01 2.4 7.37e+08 745.1 0.0e+00 0.0e+00 9.5e+04 3 0 0 0 46 3 0 0 0 46 1254<br>
SNESSolve 4 1.0 1.1395e+03 1.0 6.28e+11 83.1 3.1e+08 2.1e+03 2.1e+05 98 100 100 99 100 98 100 100 99 100 36673<br>SNESLineSearch 25 1.0 3.6445e+00 1.0 1.48e+07 2.7 3.9e+05 9.8e+03 5.4e+02 0 0 0 1 0 0 0 0 1 0 2125<br>
SNESFunctionEval 34 1.0 2.0350e+01 1.0 7.41e+06 2.8 4.1e+05 1.1e+04 5.2e+02 2 0 0 1 0 2 0 0 1 0 182<br>SNESJacobianEval 25 1.0 4.3534e+01 1.0 1.61e+06 1.5 2.7e+05 3.3e+04 2.4e+02 4 0 0 1 0 4 0 0 1 0 31<br>
KSPGMRESOrthog 95252 1.0 1.0043e+03 30.8 2.71e+10 17.9 0.0e+00 0.0e+00 9.5e+04 78 8 0 0 46 78 8 0 0 46 3499<br>KSPSetup 56 1.0 3.2959e-02 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0<br>
KSPSolve 26 1.0 1.0764e+03 1.0 6.28e+11 81.7 3.1e+08 2.1e+03 2.1e+05 93 100 100 98 99 93 100 100 98 99 38965<br>PCSetUp 74 1.0 1.4828e+01 13.4 7.81e+09 19.7 4.5e+05 2.6e+04 2.3e+02 0 3 0 2 0 0 3 0 2 0 72355<br>
PCSetUpOnBlocks 2289 1.0 1.1525e+01 720.0 7.19e+09 781.3 0.0e+00 0.0e+00 2.7e+01 0 1 0 0 0 0 1 0 0 0 29053<br>PCApply 3956 1.0 1.0618e+03 1.0 6.18e+11 119.7 3.0e+08 2.1e+03 2.0e+05 89 91 95 93 96 89 91 95 93 96 36057<br>
------------------------------------------------------------------------------------------------------------------------<br><br>Memory usage is given in bytes:<br><br>Object Type Creations Destructions Memory Descendants' Mem.<br>
Reports information only for process 0.<br><br>--- Event Stage 0: Main Stage<br><br> Matrix 48 48 326989216 0<br> Matrix Partitioning 1 1 640 0<br> Index Set 192 192 867864 0<br>
IS L to G Mapping 2 2 59480 0<br> Vector 5781 5781 176334584 0<br> Vector Scatter 31 31 32612 0<br> Application Order 2 2 36714016 0<br>
SNES 4 4 5088 0<br> Krylov Solver 14 14 39362888 0<br> Preconditioner 18 18 16120 0<br> Viewer 1 0 0 0<br>
========================================================================================================================<br>
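<br>A note on reading the Event table: the legend defines Ratio as the maximum over all processors divided by the minimum, so the time on the fastest rank can be recovered as Max / Ratio. The sketch below does that arithmetic for a few events, with the numbers copied from the table above. The helper name `min_time` and the choice of events are only illustrative. A large gap between the two values may mean that some ranks spend most of the reported time waiting in the global reduction rather than computing, but this table alone cannot confirm that.<br>

```python
# Max time (sec) and Max/Min ratio, copied from the Event table above.
events = {
    "VecMDot": (1.0035e+03, 43.9),
    "KSPGMRESOrthog": (1.0043e+03, 30.8),
    "VecMAXPY": (1.1990e+01, 28.2),
    "MatSolve": (8.8720e+02, 283.1),
}


def min_time(max_time, ratio):
    """Time on the fastest rank, using the legend's definition Ratio = max / min."""
    return max_time / ratio


for name, (t_max, ratio) in events.items():
    t_min = min_time(t_max, ratio)
    print(f"{name:16s} max {t_max:9.2f} s   implied min {t_min:8.2f} s")
```

For VecMDot this gives an implied minimum of roughly 23 seconds against a maximum of about 1000 seconds, so the 78% in the %T column reflects the slowest rank rather than every rank.<br>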