profiling PETSc code
Matt Funk
mafunk at nmsu.edu
Tue Aug 1 17:30:04 CDT 2006
Hi,
well, now i do get summary:
...
# WARNING!!! #
# #
# This code was run without the PreLoadBegin() #
# macros. To get timing results we always recommend #
# preloading. otherwise timing numbers may be #
# meaningless. #
##########################################################
Event Count Time (sec) Flops/sec
--- Global --- --- Stage --- Total
Max Ratio Max Ratio Max Ratio Mess Avg len
Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------
--- Event Stage 0: Main Stage
VecNorm 200 1.0 5.6217e-03 1.0 2.07e+08 1.0 0.0e+00 0.0e+00
1.0e+02 0 36 0 0 7 0 36 0 0 31 207
VecCopy 200 1.0 4.2303e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecSet 1 1.0 8.1062e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAYPX 100 1.0 3.2036e-03 1.0 1.82e+08 1.0 0.0e+00 0.0e+00
0.0e+00 0 18 0 0 0 0 18 0 0 0 182
MatMult 100 1.0 8.3530e-03 1.0 3.49e+07 1.0 0.0e+00 0.0e+00
0.0e+00 1 9 0 0 0 1 9 0 0 0 35
MatSolve 200 1.0 2.5591e-02 1.0 2.28e+07 1.0 0.0e+00 0.0e+00
0.0e+00 2 18 0 0 0 2 18 0 0 0 23
MatSolveTranspos 100 1.0 2.1357e-02 1.0 1.36e+07 1.0 0.0e+00 0.0e+00
0.0e+00 2 9 0 0 0 2 9 0 0 0 14
MatLUFactorNum 100 1.0 4.6215e-02 1.0 6.30e+06 1.0 0.0e+00 0.0e+00
0.0e+00 3 9 0 0 0 3 9 0 0 0 6
MatILUFactorSym 1 1.0 4.4894e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
2.0e+00 0 0 0 0 0 0 0 0 0 1 0
MatAssemblyBegin 1 1.0 2.1458e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatAssemblyEnd 100 1.0 1.1220e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
0.0e+00 1 0 0 0 0 1 0 0 0 0 0
MatGetOrdering 1 1.0 2.5296e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
2.0e+00 0 0 0 0 0 0 0 0 0 1 0
KSPSetup 100 1.0 5.3692e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
1.4e+01 0 0 0 0 1 0 0 0 0 4 0
KSPSolve 100 1.0 9.0056e-02 1.0 3.23e+07 1.0 0.0e+00 0.0e+00
3.0e+02 7 91 0 0 21 7 91 0 0 93 32
PCSetUp 100 1.0 4.9087e-02 1.0 5.93e+06 1.0 0.0e+00 0.0e+00
4.0e+00 4 9 0 0 0 4 9 0 0 1 6
PCApply 300 1.0 4.9106e-02 1.0 1.78e+07 1.0 0.0e+00 0.0e+00
0.0e+00 4 27 0 0 0 4 27 0 0 0 18
------------------------------------------------------------------------------------------------------------------------
Memory usage is given in bytes:
Object Type Creations Destructions Memory Descendants' Mem.
--- Event Stage 0: Main Stage
Index Set 3 3 35976 0
Vec 109 109 2458360 0
Matrix 2 2 23304 0
Krylov Solver 1 1 0 0
Preconditioner 1 1 168 0
========================================================================================================================
Average time to get PetscTime(): 9.53674e-08
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
...
am i using the push and pop calls in an manner they are not intended to be
used?
plus, how can i see what's going on with respect to why it takes so much
longer to solve the system in parallel than in serial without being able to
specify the stages (i.e single out the KSPSolve call)?
mat
On Tuesday 01 August 2006 15:57, Matthew Knepley wrote:
> ke out your stage push/pop for the moment, and the log_summary
> call. Just run with -log_summary and send the output as a test.
>
> Thanks,
>
> Matt
More information about the petsc-users
mailing list