profiling PETSc code
Matt Funk
mafunk at nmsu.edu
Wed Aug 2 16:32:20 CDT 2006
Hi Matt,
thanks for all the help so far. The -info option is really very helpful, and I
think I have straightened out the actual errors. However, I am now back to my
original question: why does it take so much longer on 4 procs than on 1 proc?
I profiled the KSPSolve(...) as stage 2.
For 1 proc I have:
--- Event Stage 2: Stage 2 of ChomboPetscInterface
VecDot 4000 1.0 4.9158e-02 1.0 4.74e+08 1.0 0.0e+00 0.0e+00
0.0e+00 0 18 0 0 0 2 18 0 0 0 474
VecNorm 8000 1.0 2.1798e-01 1.0 2.14e+08 1.0 0.0e+00 0.0e+00
4.0e+03 1 36 0 0 28 7 36 0 0 33 214
VecAYPX 4000 1.0 1.3449e-01 1.0 1.73e+08 1.0 0.0e+00 0.0e+00
0.0e+00 0 18 0 0 0 5 18 0 0 0 173
MatMult 4000 1.0 3.6004e-01 1.0 3.24e+07 1.0 0.0e+00 0.0e+00
0.0e+00 1 9 0 0 0 12 9 0 0 0 32
MatSolve 8000 1.0 1.0620e+00 1.0 2.19e+07 1.0 0.0e+00 0.0e+00
0.0e+00 3 18 0 0 0 36 18 0 0 0 22
KSPSolve 4000 1.0 2.8338e+00 1.0 4.52e+07 1.0 0.0e+00 0.0e+00
1.2e+04 7100 0 0 84 97100 0 0100 45
PCApply 8000 1.0 1.1133e+00 1.0 2.09e+07 1.0 0.0e+00 0.0e+00
0.0e+00 3 18 0 0 0 38 18 0 0 0 21
For 4 procs I have:
--- Event Stage 2: Stage 2 of ChomboPetscInterface
VecDot 4000 1.0 3.5884e+01133.7 2.17e+07133.7 0.0e+00 0.0e+00
4.0e+03 8 18 0 0 5 9 18 0 0 14 1
VecNorm 8000 1.0 3.4986e-01 1.3 4.43e+07 1.3 0.0e+00 0.0e+00
8.0e+03 0 36 0 0 10 0 36 0 0 29 133
VecSet 8000 1.0 3.5024e-02 1.4 0.00e+00 0.0 0.0e+00 0.0e+00
0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAYPX 4000 1.0 5.6790e-02 1.3 1.28e+08 1.3 0.0e+00 0.0e+00
0.0e+00 0 18 0 0 0 0 18 0 0 0 410
VecScatterBegin 4000 1.0 6.0042e+01 1.4 0.00e+00 0.0 0.0e+00 0.0e+00
0.0e+00 38 0 0 0 0 45 0 0 0 0 0
VecScatterEnd 4000 1.0 5.9364e+01 1.4 0.00e+00 0.0 0.0e+00 0.0e+00
0.0e+00 37 0 0 0 0 44 0 0 0 0 0
MatMult 4000 1.0 1.1959e+02 1.4 3.46e+04 1.4 0.0e+00 0.0e+00
0.0e+00 75 9 0 0 0 89 9 0 0 0 0
MatSolve 8000 1.0 2.8150e-01 1.0 2.16e+07 1.0 0.0e+00 0.0e+00
0.0e+00 0 18 0 0 0 0 18 0 0 0 83
MatLUFactorNum 1 1.0 1.3685e-04 1.1 5.64e+06 1.1 0.0e+00 0.0e+00
0.0e+00 0 0 0 0 0 0 0 0 0 0 21
MatILUFactorSym 1 1.0 2.3389e-04 1.2 0.00e+00 0.0 0.0e+00 0.0e+00
2.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetOrdering 1 1.0 9.6083e-05 1.2 0.00e+00 0.0 0.0e+00 0.0e+00
2.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPSetup 1 1.0 2.1458e-06 2.2 0.00e+00 0.0 0.0e+00 0.0e+00
0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPSolve 4000 1.0 1.2200e+02 1.0 2.63e+05 1.0 0.0e+00 0.0e+00
2.8e+04 84100 0 0 34 100100 0 0100 1
PCSetUp 1 1.0 5.0187e-04 1.2 1.68e+06 1.2 0.0e+00 0.0e+00
4.0e+00 0 0 0 0 0 0 0 0 0 0 6
PCSetUpOnBlocks 4000 1.0 1.2104e-02 2.2 1.34e+05 2.2 0.0e+00 0.0e+00
4.0e+00 0 0 0 0 0 0 0 0 0 0 0
PCApply 8000 1.0 8.4254e-01 1.2 8.27e+06 1.2 0.0e+00 0.0e+00
8.0e+03 1 18 0 0 10 1 18 0 0 29 28
------------------------------------------------------------------------------------------------------------------------
Now, if I understand it right, these numbers summarize all calls made between
the stage push and pop commands. That would mean that the majority of the time
is spent in MatMult, and within that in the VecScatterBegin and VecScatterEnd
calls.
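As a rough check of that reading, the shares can be worked out from the max
times in the 4-proc table above. This is only a sketch in Python; the variable
names are made up for illustration, and because these are per-event maximum
times across processes, the ratios will not match the %T columns exactly.

```python
# Max times in seconds, copied from the 4-proc "Event Stage 2" table above.
# The dictionary and variable names are illustrative, not part of PETSc.
times_4proc = {
    "KSPSolve": 1.2200e+02,
    "MatMult": 1.1959e+02,
    "VecScatterBegin": 6.0042e+01,
    "VecScatterEnd": 5.9364e+01,
}

# KSPSolve max time from the 1-proc table, for comparison.
ksp_1proc = 2.8338e+00


def share(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Share of the solve spent in MatMult.
matmult_share = share(times_4proc["MatMult"], times_4proc["KSPSolve"])

# Share of MatMult spent in the two scatter events combined.
scatter_total = times_4proc["VecScatterBegin"] + times_4proc["VecScatterEnd"]
scatter_share = share(scatter_total, times_4proc["MatMult"])

# How much slower the whole solve is on 4 procs than on 1 proc.
slowdown = times_4proc["KSPSolve"] / ksp_1proc

print(f"MatMult share of KSPSolve:       {matmult_share:.1f}%")
print(f"Scatter share of MatMult:        {scatter_share:.1f}%")
print(f"KSPSolve slowdown, 4 vs 1 procs: {slowdown:.1f}x")
```

On these numbers nearly all of the solve time is in MatMult, and nearly all of
the MatMult time is in the two scatter events.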
My problem size is really small. So I was wondering whether the problem lies in
that (namely, that most of the time is simply spent communicating between
processors), or whether there is still something wrong with how I wrote the
code.
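To put the question in numbers, here is a rough per-call average for MatMult
taken from the two tables above. Again this is just a Python sketch with
made-up variable names, and it assumes the max time is spread evenly over the
4000 calls, which may not be true.

```python
# Call count and max times (seconds) for MatMult, copied from the tables above.
# Variable names are illustrative only.
matmult_calls = 4000
matmult_time_1proc = 3.6004e-01
matmult_time_4proc = 1.1959e+02

# Average time per MatMult call, assuming the cost is spread evenly.
per_call_1proc = matmult_time_1proc / matmult_calls
per_call_4proc = matmult_time_4proc / matmult_calls

# Ratio of the two averages.
per_call_ratio = per_call_4proc / per_call_1proc

print(f"1 proc:  {per_call_1proc * 1e3:.3f} ms per MatMult")
print(f"4 procs: {per_call_4proc * 1e3:.3f} ms per MatMult")
print(f"ratio:   {per_call_ratio:.0f}x")
```

That comes to roughly 30 ms per MatMult on 4 procs against roughly 0.09 ms on
1 proc.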
thanks
mat
On Tuesday 01 August 2006 18:28, Matthew Knepley wrote:
> On 8/1/06, Matt Funk <mafunk at nmsu.edu> wrote:
> > Actually the errors occur on my calls to PETSc functions after calling
> > PetscInitialize.
>
> Yes, it is the error I pointed out in the last message.
>
> Matt
>
> > mat