profiling PETSc code

Matt Funk mafunk at nmsu.edu
Tue Aug 1 17:30:04 CDT 2006


Hi,

Well, now I do get a summary:
...
      ##########################################################
      #                                                        #
      #                          WARNING!!!                    #
      #                                                        #
      #   This code was run without the PreLoadBegin()         #
      #   macros. To get timing results we always recommend    #
      #   preloading. otherwise timing numbers may be          #
      #   meaningless.                                         #
      ##########################################################


Event                Count      Time (sec)     Flops/sec                         --- Global ---  --- Stage ---   Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

VecNorm              200 1.0 5.6217e-03 1.0 2.07e+08 1.0 0.0e+00 0.0e+00 1.0e+02  0 36  0  0  7   0 36  0  0 31   207
VecCopy              200 1.0 4.2303e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet                 1 1.0 8.1062e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAYPX              100 1.0 3.2036e-03 1.0 1.82e+08 1.0 0.0e+00 0.0e+00 0.0e+00  0 18  0  0  0   0 18  0  0  0   182
MatMult              100 1.0 8.3530e-03 1.0 3.49e+07 1.0 0.0e+00 0.0e+00 0.0e+00  1  9  0  0  0   1  9  0  0  0    35
MatSolve             200 1.0 2.5591e-02 1.0 2.28e+07 1.0 0.0e+00 0.0e+00 0.0e+00  2 18  0  0  0   2 18  0  0  0    23
MatSolveTranspos     100 1.0 2.1357e-02 1.0 1.36e+07 1.0 0.0e+00 0.0e+00 0.0e+00  2  9  0  0  0   2  9  0  0  0    14
MatLUFactorNum       100 1.0 4.6215e-02 1.0 6.30e+06 1.0 0.0e+00 0.0e+00 0.0e+00  3  9  0  0  0   3  9  0  0  0     6
MatILUFactorSym        1 1.0 4.4894e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00  0  0  0  0  0   0  0  0  0  1     0
MatAssemblyBegin       1 1.0 2.1458e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatAssemblyEnd       100 1.0 1.1220e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0   1  0  0  0  0     0
MatGetOrdering         1 1.0 2.5296e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00  0  0  0  0  0   0  0  0  0  1     0
KSPSetup             100 1.0 5.3692e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.4e+01  0  0  0  0  1   0  0  0  0  4     0
KSPSolve             100 1.0 9.0056e-02 1.0 3.23e+07 1.0 0.0e+00 0.0e+00 3.0e+02  7 91  0  0 21   7 91  0  0 93    32
PCSetUp              100 1.0 4.9087e-02 1.0 5.93e+06 1.0 0.0e+00 0.0e+00 4.0e+00  4  9  0  0  0   4  9  0  0  1     6
PCApply              300 1.0 4.9106e-02 1.0 1.78e+07 1.0 0.0e+00 0.0e+00 0.0e+00  4 27  0  0  0   4 27  0  0  0    18
------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type          Creations   Destructions   Memory  Descendants' Mem.

--- Event Stage 0: Main Stage

           Index Set     3              3      35976     0
                 Vec   109            109    2458360     0
              Matrix     2              2      23304     0
       Krylov Solver     1              1          0     0
      Preconditioner     1              1        168     0
========================================================================================================================
Average time to get PetscTime(): 9.53674e-08
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)

...
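
If I read the warning right, the preload macros run the wrapped region twice, so that the first pass absorbs one-time costs (paging, cache warm-up) and only the later pass produces meaningful timings. A rough sketch of what I think the intended usage looks like, going by the PreLoadBegin()/PreLoadEnd() names in the manual (so take the details with a grain of salt):

    /* Sketch only: wrap the timed region in the preload macros.
       With PETSC_TRUE the region is executed twice; the first
       (preload) pass is meant to warm everything up. */
    PreLoadBegin(PETSC_TRUE,"Solve");
      ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);
    PreLoadEnd();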

Am I using the push and pop calls in a manner in which they are not intended to be used?
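
For reference, what I have is essentially the following (a sketch from memory; I believe PetscLogStageRegister() takes the stage handle first in this release, though the argument order may differ between versions):

    PetscLogStage stage;
    ierr = PetscLogStageRegister(&stage,"MySolve");CHKERRQ(ierr); /* register a named stage */
    ierr = PetscLogStagePush(stage);CHKERRQ(ierr);                /* events below log to it  */
    ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);
    ierr = PetscLogStagePop();CHKERRQ(ierr);                      /* back to the main stage  */

With that in place I would expect -log_summary to report a separate "MySolve" stage in addition to the main stage.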

Also, how can I see what's going on with respect to why it takes so much longer to solve the system in parallel than in serial, without being able to specify the stages (i.e. single out the KSPSolve call)?


mat





On Tuesday 01 August 2006 15:57, Matthew Knepley wrote:
> Take out your stage push/pop for the moment, and the log_summary
> call. Just run with -log_summary and send the output as a test.
>
>   Thanks,
>
>      Matt



