[petsc-users] KSPGMRESOrthog costs too much time
Rongliang Chen
rongliang.chan at gmail.com
Thu Nov 17 11:17:11 CST 2011
Hi All,

In my log_summary output, I found that nearly 80% of the total time is spent in KSPGMRESOrthog. I do not think this makes sense (the log_summary output is included below). Does anyone have an idea what is going on?
Another question: I am using a two-level ASM preconditioner. On the coarse level I use a one-level-ASM-preconditioned GMRES to solve the coarse problem, so both the fine-level solver and the coarse-level solver call KSPGMRESOrthog. The log_summary output only reports the total time spent in KSPGMRESOrthog; how can I find out how much of that time belongs to the coarse-level KSPGMRESOrthog and how much to the fine-level one? Thanks.
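(For reference, my current understanding is that the two solves could be separated in the profile with PETSc logging stages, roughly as in the sketch below. The names coarse_ksp, b_c and x_c are placeholders, not the actual objects in my code.)

/* Rough sketch (placeholder names): wrap the coarse-level solve in its own
   PETSc logging stage so that -log_summary reports its events, including
   KSPGMRESOrthog, separately from the fine-level (main) stage. */
#include <petscksp.h>

PetscErrorCode SolveCoarseWithStage(KSP coarse_ksp, Vec b_c, Vec x_c)
{
  static PetscLogStage coarse_stage = -1;
  PetscErrorCode       ierr;

  PetscFunctionBegin;
  if (coarse_stage < 0) {
    /* Register the stage once; it appears as a separate section in -log_summary */
    ierr = PetscLogStageRegister("CoarseSolve", &coarse_stage);CHKERRQ(ierr);
  }
  ierr = PetscLogStagePush(coarse_stage);CHKERRQ(ierr);  /* events below are logged under "CoarseSolve" */
  ierr = KSPSolve(coarse_ksp, b_c, x_c);CHKERRQ(ierr);   /* coarse-level GMRES, incl. its KSPGMRESOrthog */
  ierr = PetscLogStagePop();CHKERRQ(ierr);               /* return to the enclosing (fine-level) stage */
  PetscFunctionReturn(0);
}

With something like this, the Main Stage numbers should then correspond to the fine-level solver only.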
Best,
Rongliang
------------------------------------------------------------------------------------------------------------------------
************************************************************************************************************************
***            WIDEN YOUR WINDOW TO 120 CHARACTERS.  Use 'enscript -r -fCourier9' to print this document            ***
************************************************************************************************************************

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

./joab on a Janus-nod named node1777 with 1024 processors, by ronglian Thu Nov 17 00:32:04 2011
Using Petsc Release Version 3.2.0, Patch 4, Sun Oct 23 12:23:18 CDT 2011
                         Max       Max/Min        Avg      Total
Time (sec):           1.162e+03      1.00001   1.162e+03
Objects:              6.094e+03      1.00099   6.090e+03
Flops:                6.284e+11     81.61246   4.097e+10  4.195e+13
Flops/sec:            5.410e+08     81.61201   3.527e+07  3.612e+10
MPI Messages:         4.782e+06    305.55857   3.053e+05  3.126e+08
MPI Message Lengths:  1.018e+10    254.67349   2.106e+03  6.583e+11
MPI Reductions:       2.079e+05      1.00003
Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                          e.g., VecAXPY() for real vectors of length N --> 2N flops
                          and VecAXPY() for complex vectors of length N --> 8N flops
Summary of Stages:   ----- Time ------   ----- Flops -----   --- Messages ---   -- Message Lengths --   -- Reductions --
                        Avg     %Total      Avg      %Total   counts   %Total      Avg        %Total    counts   %Total
 0:      Main Stage: 1.1615e+03 100.0%  4.1953e+13  100.0%  3.126e+08  100.0%  2.106e+03      100.0%  2.079e+05  100.0%
------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flops: Max - maximum over all processors
                   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase          %f - percent flops in this phase
      %M - percent messages in this phase      %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event                Count      Time (sec)        Flops                               --- Global ---   --- Stage ---   Total
                   Max Ratio  Max      Ratio    Max      Ratio   Mess   Avg len Reduct  %T %f %M %L %R  %T %f %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------
--- Event Stage 0: Main Stage
MatMult            102148 1.0 6.4223e+02  77.1 3.35e+10  14.0 1.3e+08 9.8e+02 0.0e+00  4 12 43 20  0   4 12 43 20  0   7698
MatMultTranspose     2286 1.0 1.7585e+00   4.8 4.07e+08   1.5 7.4e+06 1.1e+03 0.0e+00  0  1  2  1  0   0  1  2  1  0 197783
MatSolve           975414 1.2 8.8720e+02 283.1 5.69e+11 199.6 0.0e+00 0.0e+00 0.0e+00  5 76  0  0  0   5 76  0  0  0  35949
MatLUFactorSym          7 1.0 8.4092e-01  19.6 0.00e+00   0.0 0.0e+00 0.0e+00 2.1e+01  0  0  0  0  0   0  0  0  0  0      0
MatLUFactorNum         28 1.0 1.1228e+01  31.9 7.81e+09  19.7 0.0e+00 0.0e+00 0.0e+00  0  3  0  0  0   0  3  0  0  0  95551
MatAssemblyBegin      168 1.0 2.3209e+01  30.3 0.00e+00   0.0 4.0e+05 3.3e+04 2.8e+02  2  0  0  2  0   2  0  0  2  0      0
MatAssemblyEnd        168 1.0 3.5127e+01   1.0 0.00e+00   0.0 7.0e+04 2.7e+02 2.2e+02  3  0  0  0  0   3  0  0  0  0      0
MatGetRowIJ             7 2.3 2.0276e-02  15.7 0.00e+00   0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0      0
MatGetSubMatrice       28 1.0 1.8989e+00   4.4 0.00e+00   0.0 3.4e+05 3.5e+04 1.1e+02  0  0  0  2  0   0  0  0  2  0      0
MatGetOrdering          7 2.3 4.9773e-01  19.7 0.00e+00   0.0 0.0e+00 0.0e+00 1.1e+01  0  0  0  0  0   0  0  0  0  0      0
MatIncreaseOvrlp        1 1.0 1.4734e-01   1.4 0.00e+00   0.0 6.9e+04 4.9e+02 8.0e+00  0  0  0  0  0   0  0  0  0  0      0
MatPartitioning         1 1.0 9.0198e-01   1.0 0.00e+00   0.0 0.0e+00 0.0e+00 1.0e+00  0  0  0  0  0   0  0  0  0  0      0
MatZeroEntries         70 1.0 1.2433e-01   3.5 0.00e+00   0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0      0
VecDot                 25 1.0 1.7762e-02   7.7 1.67e+05   2.7 0.0e+00 0.0e+00 2.5e+01  0  0  0  0  0   0  0  0  0  0   4775
VecMDot             95252 1.0 1.0035e+03  43.9 1.35e+10  17.9 0.0e+00 0.0e+00 9.5e+04 78  4  0  0 46  78  4  0  0 46   1751
VecNorm             97622 1.0 3.8131e+01   2.4 5.13e+08  31.3 0.0e+00 0.0e+00 9.8e+04  3  0  0  0 47   3  0  0  0 47   1353
VecScale            97567 1.0 3.4131e-01  13.9 2.56e+08  31.5 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  75343
VecCopy              9260 1.0 4.8895e-02  21.3 0.00e+00   0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0      0
VecSet             211118 1.0 3.0709e+00  72.3 0.00e+00   0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0      0
VecAXPY              6963 1.0 9.5037e-02   3.5 5.61e+07   1.8 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0 437314
VecWAXPY             2319 1.0 6.7898e-02   2.9 1.20e+07   1.5 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0 150007
VecMAXPY            97567 1.0 1.1990e+01  28.2 1.40e+10  18.1 0.0e+00 0.0e+00 0.0e+00  0  4  0  0  0   0  4  0  0  0 150771
VecAssemblyBegin     4648 1.0 2.5907e+01   5.0 0.00e+00   0.0 2.8e+06 8.9e+02 1.4e+04  1  0  1  0  7   1  0  1  0  7      0
VecAssemblyEnd       4648 1.0 9.1485e-02  28.2 0.00e+00   0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0      0
VecScatterBegin    299601 1.0 1.6790e+01  98.2 0.00e+00   0.0 3.1e+08 2.0e+03 0.0e+00  0  0 99 96  0   0  0 99 96  0      0
VecScatterEnd      299601 1.0 6.6906e+02 175.9 0.00e+00   0.0 0.0e+00 0.0e+00 0.0e+00  4  0  0  0  0   4  0  0  0  0      0
VecReduceArith          8 1.0 2.5487e-04   3.3 5.83e+04   2.1 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0 144027
VecReduceComm           4 1.0 9.3198e-04   3.1 0.00e+00   0.0 0.0e+00 0.0e+00 4.0e+00  0  0  0  0  0   0  0  0  0  0      0
VecNormalize        95269 1.0 3.7148e+01   2.4 7.37e+08 745.1 0.0e+00 0.0e+00 9.5e+04  3  0  0  0 46   3  0  0  0 46   1254
SNESSolve               4 1.0 1.1395e+03   1.0 6.28e+11  83.1 3.1e+08 2.1e+03 2.1e+05 98 100 100 99 100  98 100 100 99 100  36673
SNESLineSearch         25 1.0 3.6445e+00   1.0 1.48e+07   2.7 3.9e+05 9.8e+03 5.4e+02  0  0  0  1  0   0  0  0  1  0   2125
SNESFunctionEval       34 1.0 2.0350e+01   1.0 7.41e+06   2.8 4.1e+05 1.1e+04 5.2e+02  2  0  0  1  0   2  0  0  1  0    182
SNESJacobianEval       25 1.0 4.3534e+01   1.0 1.61e+06   1.5 2.7e+05 3.3e+04 2.4e+02  4  0  0  1  0   4  0  0  1  0     31
KSPGMRESOrthog      95252 1.0 1.0043e+03  30.8 2.71e+10  17.9 0.0e+00 0.0e+00 9.5e+04 78  8  0  0 46  78  8  0  0 46   3499
KSPSetup               56 1.0 3.2959e-02   1.9 0.00e+00   0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0      0
KSPSolve               26 1.0 1.0764e+03   1.0 6.28e+11  81.7 3.1e+08 2.1e+03 2.1e+05 93 100 100 98 99  93 100 100 98 99  38965
PCSetUp                74 1.0 1.4828e+01  13.4 7.81e+09  19.7 4.5e+05 2.6e+04 2.3e+02  0  3  0  2  0   0  3  0  2  0  72355
PCSetUpOnBlocks      2289 1.0 1.1525e+01 720.0 7.19e+09 781.3 0.0e+00 0.0e+00 2.7e+01  0  1  0  0  0   0  1  0  0  0  29053
PCApply              3956 1.0 1.0618e+03   1.0 6.18e+11 119.7 3.0e+08 2.1e+03 2.0e+05 89 91 95 93 96  89 91 95 93 96  36057
------------------------------------------------------------------------------------------------------------------------
Memory usage is given in bytes:

Object Type          Creations   Destructions      Memory   Descendants' Mem.
Reports information only for process 0.

--- Event Stage 0: Main Stage

              Matrix        48             48   326989216        0
 Matrix Partitioning         1              1         640        0
           Index Set       192            192      867864        0
   IS L to G Mapping         2              2       59480        0
              Vector      5781           5781   176334584        0
      Vector Scatter        31             31       32612        0
   Application Order         2              2    36714016        0
                SNES         4              4        5088        0
       Krylov Solver        14             14    39362888        0
      Preconditioner        18             18       16120        0
              Viewer         1              0           0        0
========================================================================================================================