profiling PETSc code

Matt Funk mafunk at nmsu.edu
Wed Aug 2 17:21:43 CDT 2006


Hi Matt,

It could be a bad load imbalance, because I don't let PETSc decide the
partitioning. I need to fix that anyway, so I think I'll try that first and
then let you know. Thanks, though, for the quick response and for helping me
interpret those numbers ...
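
(For reference, letting PETSc decide the layout is mostly a matter of passing
PETSC_DECIDE for the local sizes. A minimal sketch, with an illustrative
global size N and error checking omitted:

    Mat      A;
    Vec      x;
    PetscInt N = 10000;   /* illustrative global problem size */

    /* let PETSc pick the local row distribution of the matrix */
    MatCreate(PETSC_COMM_WORLD, &A);
    MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, N, N);
    MatSetFromOptions(A);

    /* let PETSc pick the matching local length of the vector */
    VecCreate(PETSC_COMM_WORLD, &x);
    VecSetSizes(x, PETSC_DECIDE, N);
    VecSetFromOptions(x);

With PETSC_DECIDE, PETSc splits the N rows as evenly as it can across the
processes in the communicator.)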


mat

On Wednesday 02 August 2006 15:50, Matthew Knepley wrote:
> On 8/2/06, Matt Funk <mafunk at nmsu.edu> wrote:
> > Hi Matt,
> >
> > thanks for all the help so far. The -info option is really very helpful.
> > So I think I have straightened out the actual errors. However, now I am
> > back to my original question: why does it take so much longer on 4
> > procs than on 1 proc?
>
> So you have a 1.5 load imbalance for MatMult(), which probably cascades to
> give the 133! load imbalance for VecDot(). You probably have either:
>
>   1) VERY bad load imbalance
>
>   2) a screwed up network
>
>   3) bad contention on the network (loaded cluster)
>
> Can you help us narrow this down?
>
>
>    Matt
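
(One quick way to check possibility 1 is to look at how many matrix rows each
process actually owns. A minimal sketch, assuming the parallel matrix is
called A and the PETSc headers are included; error checking omitted:

    PetscInt    rstart, rend;
    PetscMPIInt rank;

    MPI_Comm_rank(PETSC_COMM_WORLD, &rank);
    /* [rstart, rend) is the range of rows owned by this process */
    MatGetOwnershipRange(A, &rstart, &rend);
    printf("[%d] owns rows %d..%d (%d rows)\n",
           rank, (int)rstart, (int)rend - 1, (int)(rend - rstart));

If the row counts differ a lot between ranks, the 1.5 MatMult ratio points to
load imbalance; if they are even, the network is the more likely suspect.)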
>
> > I profiled the KSPSolve(...) as stage 2:
> >
> > For 1 proc I have:
> > --- Event Stage 2: Stage 2 of ChomboPetscInterface
> >
> > VecDot              4000 1.0 4.9158e-02 1.0 4.74e+08 1.0 0.0e+00 0.0e+00 0.0e+00  0 18  0  0  0   2 18  0  0  0   474
> > VecNorm             8000 1.0 2.1798e-01 1.0 2.14e+08 1.0 0.0e+00 0.0e+00 4.0e+03  1 36  0  0 28   7 36  0  0 33   214
> > VecAYPX             4000 1.0 1.3449e-01 1.0 1.73e+08 1.0 0.0e+00 0.0e+00 0.0e+00  0 18  0  0  0   5 18  0  0  0   173
> > MatMult             4000 1.0 3.6004e-01 1.0 3.24e+07 1.0 0.0e+00 0.0e+00 0.0e+00  1  9  0  0  0  12  9  0  0  0    32
> > MatSolve            8000 1.0 1.0620e+00 1.0 2.19e+07 1.0 0.0e+00 0.0e+00 0.0e+00  3 18  0  0  0  36 18  0  0  0    22
> > KSPSolve            4000 1.0 2.8338e+00 1.0 4.52e+07 1.0 0.0e+00 0.0e+00 1.2e+04  7 100  0  0 84  97 100  0  0 100    45
> > PCApply             8000 1.0 1.1133e+00 1.0 2.09e+07 1.0 0.0e+00 0.0e+00 0.0e+00  3 18  0  0  0  38 18  0  0  0    21
> >
> >
> > for 4 procs I have:
> > --- Event Stage 2: Stage 2 of ChomboPetscInterface
> >
> > VecDot              4000 1.0 3.5884e+01 133.7 2.17e+07 133.7 0.0e+00 0.0e+00 4.0e+03  8 18  0  0  5   9 18  0  0 14     1
> > VecNorm             8000 1.0 3.4986e-01 1.3 4.43e+07 1.3 0.0e+00 0.0e+00 8.0e+03  0 36  0  0 10   0 36  0  0 29   133
> > VecSet              8000 1.0 3.5024e-02 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > VecAYPX             4000 1.0 5.6790e-02 1.3 1.28e+08 1.3 0.0e+00 0.0e+00 0.0e+00  0 18  0  0  0   0 18  0  0  0   410
> > VecScatterBegin     4000 1.0 6.0042e+01 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 38  0  0  0  0  45  0  0  0  0     0
> > VecScatterEnd       4000 1.0 5.9364e+01 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 37  0  0  0  0  44  0  0  0  0     0
> > MatMult             4000 1.0 1.1959e+02 1.4 3.46e+04 1.4 0.0e+00 0.0e+00 0.0e+00 75  9  0  0  0  89  9  0  0  0     0
> > MatSolve            8000 1.0 2.8150e-01 1.0 2.16e+07 1.0 0.0e+00 0.0e+00 0.0e+00  0 18  0  0  0   0 18  0  0  0    83
> > MatLUFactorNum         1 1.0 1.3685e-04 1.1 5.64e+06 1.1 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0    21
> > MatILUFactorSym        1 1.0 2.3389e-04 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > MatGetOrdering         1 1.0 9.6083e-05 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > KSPSetup               1 1.0 2.1458e-06 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > KSPSolve            4000 1.0 1.2200e+02 1.0 2.63e+05 1.0 0.0e+00 0.0e+00 2.8e+04 84 100  0  0 34 100 100  0  0 100     1
> > PCSetUp                1 1.0 5.0187e-04 1.2 1.68e+06 1.2 0.0e+00 0.0e+00 4.0e+00  0  0  0  0  0   0  0  0  0  0     6
> > PCSetUpOnBlocks     4000 1.0 1.2104e-02 2.2 1.34e+05 2.2 0.0e+00 0.0e+00 4.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > PCApply             8000 1.0 8.4254e-01 1.2 8.27e+06 1.2 0.0e+00 0.0e+00 8.0e+03  1 18  0  0 10   1 18  0  0 29    28
> > ------------------------------------------------------------------------------------------------------------------------
> >
> > Now, if I understand it right, these entries summarize all the calls made
> > between the push and pop commands. That would mean the majority of the
> > time is spent in MatMult, and within that in the VecScatterBegin and
> > VecScatterEnd calls (if I understand it right).
> >
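
(That is how the stage summary works: everything executed between
PetscLogStagePush() and the matching PetscLogStagePop() is attributed to that
stage. The usual pattern looks roughly like the sketch below; the stage name
is taken from the log above, the KSP objects are illustrative, and
PetscLogStageRegister's argument order differs between PETSc versions:

    PetscLogStage stage;

    PetscLogStageRegister("Stage 2 of ChomboPetscInterface", &stage);
    PetscLogStagePush(stage);
    /* every PETSc call in here, e.g. the repeated KSPSolve(),
       is logged under this stage in -log_summary */
    KSPSolve(ksp, b, x);
    PetscLogStagePop();
)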
> > My problem size is really small, so I was wondering whether the problem
> > lies in that (namely, that most of the time is simply spent communicating
> > between processors), or whether there is still something wrong with how I
> > wrote the code?
> >
> >
> > thanks
> > mat
> >
> > On Tuesday 01 August 2006 18:28, Matthew Knepley wrote:
> > > On 8/1/06, Matt Funk <mafunk at nmsu.edu> wrote:
> > > > Actually, the errors occur in my calls to PETSc functions after
> > > > calling PetscInitialize().
> > >
> > > Yes, it is the error I pointed out in the last message.
> > >
> > >    Matt
> > >
> > > > mat



