[petsc-users] KSPGMRESOrthog costs too much time

Rongliang Chen rongliang.chan at gmail.com
Thu Nov 17 11:17:11 CST 2011


Hi All,

In my log_summary output, I found that nearly 80% of the total time is
spent on KSPGMRESOrthog. I think this does not make sense ( the log_summary
output followed). Who has any idea about this?

Another question, I am using the two-level asm precondtioner.  On the
coarse level I use one-level asm preconditioned GMRES to solve a coarse
problem. So both the fine level solver and coarse level solver call the
function KSPGMRESOrthog. In the log_summary output, I just know the total
time spent on KSPGMRESOrthog and how can I know how much time is spent on
the coarse level KSPGMRESOrthog and how much is spent on fine level
KSPGMRESOrthog? Thanks.

Best,
Rongliang


------------------------------------------------------------------------------------------------------------------------
************************************************************************************************************************
***             WIDEN YOUR WINDOW TO 120 CHARACTERS.  Use 'enscript -r
-fCourier9' to print this document            ***
************************************************************************************************************************

---------------------------------------------- PETSc Performance Summary:
----------------------------------------------

./joab on a Janus-nod named node1777 with 1024 processors, by ronglian Thu
Nov 17 00:32:04 2011
Using Petsc Release Version 3.2.0, Patch 4, Sun Oct 23 12:23:18 CDT 2011

                         Max       Max/Min        Avg      Total
Time (sec):           1.162e+03      1.00001   1.162e+03
Objects:              6.094e+03      1.00099   6.090e+03
Flops:                6.284e+11     81.61246   4.097e+10  4.195e+13
Flops/sec:            5.410e+08     81.61201   3.527e+07  3.612e+10
MPI Messages:         4.782e+06    305.55857   3.053e+05  3.126e+08
MPI Message Lengths:  1.018e+10    254.67349   2.106e+03  6.583e+11
MPI Reductions:       2.079e+05      1.00003

Flop counting convention: 1 flop = 1 real number operation of type
(multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N
--> 2N flops
                            and VecAXPY() for complex vectors of length N
--> 8N flops

Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages
---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts
%Total     Avg         %Total   counts   %Total
 0:      Main Stage: 1.1615e+03 100.0%  4.1953e+13 100.0%  3.126e+08
100.0%  2.106e+03      100.0%  2.079e+05 100.0%

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on
interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flops: Max - maximum over all processors
                   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and
PetscLogStagePop().
      %T - percent time in this phase         %f - percent flops in this
phase
      %M - percent messages in this phase     %L - percent message lengths
in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over
all processors)
------------------------------------------------------------------------------------------------------------------------
Event                Count      Time (sec)
Flops                             --- Global ---  --- Stage ---   Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len
Reduct  %T %f %M %L %R  %T %f %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

MatMult           102148 1.0 6.4223e+0277.1 3.35e+1014.0 1.3e+08 9.8e+02
0.0e+00  4 12 43 20  0   4 12 43 20  0  7698
MatMultTranspose    2286 1.0 1.7585e+00 4.8 4.07e+08 1.5 7.4e+06 1.1e+03
0.0e+00  0  1  2  1  0   0  1  2  1  0 197783
MatSolve           9754141.2 8.8720e+02283.1 5.69e+11199.6 0.0e+00 0.0e+00
0.0e+00  5 76  0  0  0   5 76  0  0  0 35949
MatLUFactorSym         7 1.0 8.4092e-0119.6 0.00e+00 0.0 0.0e+00 0.0e+00
2.1e+01  0  0  0  0  0   0  0  0  0  0     0
MatLUFactorNum        28 1.0 1.1228e+0131.9 7.81e+0919.7 0.0e+00 0.0e+00
0.0e+00  0  3  0  0  0   0  3  0  0  0 95551
MatAssemblyBegin     168 1.0 2.3209e+0130.3 0.00e+00 0.0 4.0e+05 3.3e+04
2.8e+02  2  0  0  2  0   2  0  0  2  0     0
MatAssemblyEnd       168 1.0 3.5127e+01 1.0 0.00e+00 0.0 7.0e+04 2.7e+02
2.2e+02  3  0  0  0  0   3  0  0  0  0     0
MatGetRowIJ            7 2.3 2.0276e-0215.7 0.00e+00 0.0 0.0e+00 0.0e+00
0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetSubMatrice      28 1.0 1.8989e+00 4.4 0.00e+00 0.0 3.4e+05 3.5e+04
1.1e+02  0  0  0  2  0   0  0  0  2  0     0
MatGetOrdering         7 2.3 4.9773e-0119.7 0.00e+00 0.0 0.0e+00 0.0e+00
1.1e+01  0  0  0  0  0   0  0  0  0  0     0
MatIncreaseOvrlp       1 1.0 1.4734e-01 1.4 0.00e+00 0.0 6.9e+04 4.9e+02
8.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatPartitioning        1 1.0 9.0198e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
1.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatZeroEntries        70 1.0 1.2433e-01 3.5 0.00e+00 0.0 0.0e+00 0.0e+00
0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecDot                25 1.0 1.7762e-02 7.7 1.67e+05 2.7 0.0e+00 0.0e+00
2.5e+01  0  0  0  0  0   0  0  0  0  0  4775
VecMDot            95252 1.0 1.0035e+0343.9 1.35e+1017.9 0.0e+00 0.0e+00
9.5e+04 78  4  0  0 46  78  4  0  0 46  1751
VecNorm            97622 1.0 3.8131e+01 2.4 5.13e+0831.3 0.0e+00 0.0e+00
9.8e+04  3  0  0  0 47   3  0  0  0 47  1353
VecScale           97567 1.0 3.4131e-0113.9 2.56e+0831.5 0.0e+00 0.0e+00
0.0e+00  0  0  0  0  0   0  0  0  0  0 75343
VecCopy             9260 1.0 4.8895e-0221.3 0.00e+00 0.0 0.0e+00 0.0e+00
0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet            211118 1.0 3.0709e+0072.3 0.00e+00 0.0 0.0e+00 0.0e+00
0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAXPY             6963 1.0 9.5037e-02 3.5 5.61e+07 1.8 0.0e+00 0.0e+00
0.0e+00  0  0  0  0  0   0  0  0  0  0 437314
VecWAXPY            2319 1.0 6.7898e-02 2.9 1.20e+07 1.5 0.0e+00 0.0e+00
0.0e+00  0  0  0  0  0   0  0  0  0  0 150007
VecMAXPY           97567 1.0 1.1990e+0128.2 1.40e+1018.1 0.0e+00 0.0e+00
0.0e+00  0  4  0  0  0   0  4  0  0  0 150771
VecAssemblyBegin    4648 1.0 2.5907e+01 5.0 0.00e+00 0.0 2.8e+06 8.9e+02
1.4e+04  1  0  1  0  7   1  0  1  0  7     0
VecAssemblyEnd      4648 1.0 9.1485e-0228.2 0.00e+00 0.0 0.0e+00 0.0e+00
0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecScatterBegin   299601 1.0 1.6790e+0198.2 0.00e+00 0.0 3.1e+08 2.0e+03
0.0e+00  0  0 99 96  0   0  0 99 96  0     0
VecScatterEnd     299601 1.0 6.6906e+02175.9 0.00e+00 0.0 0.0e+00 0.0e+00
0.0e+00  4  0  0  0  0   4  0  0  0  0     0
VecReduceArith         8 1.0 2.5487e-04 3.3 5.83e+04 2.1 0.0e+00 0.0e+00
0.0e+00  0  0  0  0  0   0  0  0  0  0 144027
VecReduceComm          4 1.0 9.3198e-04 3.1 0.00e+00 0.0 0.0e+00 0.0e+00
4.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecNormalize       95269 1.0 3.7148e+01 2.4 7.37e+08745.1 0.0e+00 0.0e+00
9.5e+04  3  0  0  0 46   3  0  0  0 46  1254
SNESSolve              4 1.0 1.1395e+03 1.0 6.28e+1183.1 3.1e+08 2.1e+03
2.1e+05 98100100 99100  98100100 99100 36673
SNESLineSearch        25 1.0 3.6445e+00 1.0 1.48e+07 2.7 3.9e+05 9.8e+03
5.4e+02  0  0  0  1  0   0  0  0  1  0  2125
SNESFunctionEval      34 1.0 2.0350e+01 1.0 7.41e+06 2.8 4.1e+05 1.1e+04
5.2e+02  2  0  0  1  0   2  0  0  1  0   182
SNESJacobianEval      25 1.0 4.3534e+01 1.0 1.61e+06 1.5 2.7e+05 3.3e+04
2.4e+02  4  0  0  1  0   4  0  0  1  0    31
KSPGMRESOrthog     95252 1.0 1.0043e+0330.8 2.71e+1017.9 0.0e+00 0.0e+00
9.5e+04 78  8  0  0 46  78  8  0  0 46  3499
KSPSetup              56 1.0 3.2959e-02 1.9 0.00e+00 0.0 0.0e+00 0.0e+00
0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPSolve              26 1.0 1.0764e+03 1.0 6.28e+1181.7 3.1e+08 2.1e+03
2.1e+05 93100100 98 99  93100100 98 99 38965
PCSetUp               74 1.0 1.4828e+0113.4 7.81e+0919.7 4.5e+05 2.6e+04
2.3e+02  0  3  0  2  0   0  3  0  2  0 72355
PCSetUpOnBlocks     2289 1.0 1.1525e+01720.0 7.19e+09781.3 0.0e+00 0.0e+00
2.7e+01  0  1  0  0  0   0  1  0  0  0 29053
PCApply             3956 1.0 1.0618e+03 1.0 6.18e+11119.7 3.0e+08 2.1e+03
2.0e+05 89 91 95 93 96  89 91 95 93 96 36057
------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type          Creations   Destructions     Memory  Descendants' Mem.
Reports information only for process 0.

--- Event Stage 0: Main Stage

              Matrix    48             48    326989216     0
 Matrix Partitioning     1              1          640     0
           Index Set   192            192       867864     0
   IS L to G Mapping     2              2        59480     0
              Vector  5781           5781    176334584     0
      Vector Scatter    31             31        32612     0
   Application Order     2              2     36714016     0
                SNES     4              4         5088     0
       Krylov Solver    14             14     39362888     0
      Preconditioner    18             18        16120     0
              Viewer     1              0            0     0
========================================================================================================================
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20111117/c69db50c/attachment.htm>


More information about the petsc-users mailing list