[petsc-users] Suspect Poor Performance of Petsc
Ivano Barletta
ibarletta at inogs.it
Thu Nov 17 12:25:46 CST 2016
Thank you for your replies.
I've carried out 3 new tests with 4, 8, and 16 cores, adding the
code lines suggested by Barry (logs in attachment).
The lack of scaling still persists, but maybe it is just related to the
size of the problem.
As far as load balancing is concerned, there is not much I can do; I
believe it depends on the land/sea points (on land the matrix
coefficients are zero), which is something I cannot control.
In any case, I intend to carry out some tests on higher-resolution
configurations, with a bigger problem size.
Kind Regards
Ivano
2016-11-17 17:24 GMT+01:00 Barry Smith <bsmith at mcs.anl.gov>:
>
> Ivano,
>
> I have cut and pasted the relevant parts of the logs below and removed a
> few irrelevant lines to make the analysis simpler.
>
> There is a lot of bad stuff going on that is hurting performance.
>
> 1) The percentage of the time in the linear solve is getting lower, going
> from 79% with 4 processes to 54% with 16 processes. This means the rest of
> the code is not scaling well; most likely that is the part generating the
> matrix. How are you getting the matrix into the program? If you are reading
> it as ASCII (somehow in parallel?) you should not do that. You should use
> MatLoad() to get the matrix in efficiently (see for example
> src/ksp/ksp/examples/tutorials/ex10.c).
>
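> A minimal sketch of the MatLoad() route (the file name "matrix.dat" is only a
> placeholder; the matrix would first have to be written in PETSc binary format,
> e.g. with MatView() on a binary viewer):
>
> PetscViewer viewer;
> Mat         A;
>
> /* open the binary file collectively and load the matrix in parallel */
> ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD,"matrix.dat",FILE_MODE_READ,&viewer);CHKERRQ(ierr);
> ierr = MatCreate(PETSC_COMM_WORLD,&A);CHKERRQ(ierr);
> ierr = MatSetFromOptions(A);CHKERRQ(ierr);
> ierr = MatLoad(A,viewer);CHKERRQ(ierr);
> ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);
>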
> To make the analysis better you need to add the following around the KSP solve:
>
> PetscLogStage stage;
> ierr = PetscLogStageRegister("Solve", &stage);CHKERRQ(ierr);
> ierr = PetscLogStagePush(stage);CHKERRQ(ierr);
> ierr = KSPSolve(ksp,bb,xx);CHKERRQ(ierr);
> ierr = PetscLogStagePop();CHKERRQ(ierr);
>
> and rerun the three cases.
>
> 2) The load balance is bad even for four processes. For example, it is 1.3
> in MatSolve; it should be really close to 1.0. How are you dividing the
> matrix up between processes?
>
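> One quick way to check the current distribution (a sketch, assuming the
> assembled matrix is called A) is to print the number of locally owned rows
> on each rank:
>
> PetscInt    rstart, rend;
> PetscMPIInt rank;
>
> ierr = MPI_Comm_rank(PETSC_COMM_WORLD,&rank);CHKERRQ(ierr);
> ierr = MatGetOwnershipRange(A,&rstart,&rend);CHKERRQ(ierr);
> ierr = PetscSynchronizedPrintf(PETSC_COMM_WORLD,"[%d] local rows: %D\n",rank,rend-rstart);CHKERRQ(ierr);
> ierr = PetscSynchronizedFlush(PETSC_COMM_WORLD,PETSC_STDOUT);CHKERRQ(ierr);
>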
> 3) It is spending a HUGE amount of time in VecNorm(): 26% on 4 processes
> and 42% on 16 processes. This could be partially or completely due to the
> load imbalance, but there might be other issues.
>
> Run with -ksp_norm_type natural in your new set of runs.
>
> Also always run with -ksp_type cg; it makes no sense to use gmres or the
> other KSP methods here.
>
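> For example (the executable name here is just a placeholder):
>
> mpiexec -n 16 ./your_solver -ksp_type cg -ksp_norm_type natural -log_view
>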
> Eagerly awaiting your response.
>
> Barry
>
>
>
> ------------------------------------------------------------------------------------------------------------------------
> Event                Count      Time (sec)     Flops                              --- Global ---  --- Stage ---   Total
>                    Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct   %T %F %M %L %R  %T %F %M %L %R Mflop/s
> ------------------------------------------------------------------------------------------------------------------------
>
> MatMult 75 1.0 4.1466e-03 1.2 3.70e+06 1.6 6.0e+02 4.4e+02 0.0e+00 13 25 97 99 0 13 25 97 99 0 2868
> MatSolve 75 1.0 6.1995e-03 1.3 3.68e+06 1.6 0.0e+00 0.0e+00 0.0e+00 19 24 0 0 0 19 24 0 0 0 1908
> MatLUFactorNum 1 1.0 3.6880e-04 1.4 5.81e+04 1.7 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 499
> MatILUFactorSym 1 1.0 1.7040e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0
> MatAssemblyBegin 1 1.0 2.5113e-04 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 1 0 0 0 1 1 0 0 0 1 0
> MatAssemblyEnd 1 1.0 1.8365e-03 1.0 0.00e+00 0.0 1.6e+01 1.1e+02 8.0e+00 6 0 3 1 3 6 0 3 1 3 0
> MatGetRowIJ 1 1.0 2.2865e-0517.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> MatGetOrdering 1 1.0 6.1687e-05 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> VecTDot 150 1.0 3.2991e-03 2.7 2.05e+06 1.0 0.0e+00 0.0e+00 1.5e+02 8 17 0 0 62 8 17 0 0 62 2466
> VecNorm 76 1.0 7.5034e-03 1.0 1.04e+06 1.0 0.0e+00 0.0e+00 7.6e+01 26 9 0 0 31 26 9 0 0 31 549
> VecSet 77 1.0 2.4495e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0
> VecAXPY 150 1.0 7.8158e-04 1.1 2.05e+06 1.0 0.0e+00 0.0e+00 0.0e+00 3 17 0 0 0 3 17 0 0 0 10409
> VecAYPX 74 1.0 6.8849e-04 1.0 1.01e+06 1.0 0.0e+00 0.0e+00 0.0e+00 2 8 0 0 0 2 8 0 0 0 5829
> VecScatterBegin 75 1.0 1.7794e-04 1.2 0.00e+00 0.0 6.0e+02 4.4e+02 0.0e+00 1 0 97 99 0 1 0 97 99 0 0
> VecScatterEnd 75 1.0 2.1674e-04 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0
> KSPSetUp 2 1.0 1.4922e-04 4.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> KSPSolve 1 1.0 2.2833e-02 1.0 1.36e+07 1.3 6.0e+02 4.4e+02 2.3e+02 79100 97 99 93 79100 97 99 93 2116
> PCSetUp 2 1.0 1.0116e-03 1.2 5.81e+04 1.7 0.0e+00 0.0e+00 0.0e+00 3 0 0 0 0 3 0 0 0 0 182
> PCSetUpOnBlocks 1 1.0 6.2872e-04 1.2 5.81e+04 1.7 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 293
> PCApply 75 1.0 7.2835e-03 1.3 3.68e+06 1.6 0.0e+00 0.0e+00 0.0e+00 22 24 0 0 0 22 24 0 0 0 1624
>
> MatMult 77 1.0 3.5985e-03 1.2 2.18e+06 2.4 1.5e+03 3.8e+02 0.0e+00 1 25 97 99 0 1 25 97 99 0 3393
> MatSolve 77 1.0 3.8145e-03 1.4 2.16e+06 2.4 0.0e+00 0.0e+00 0.0e+00 1 24 0 0 0 1 24 0 0 0 3163
> MatLUFactorNum 1 1.0 9.3037e-04 1.9 3.37e+04 2.6 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 196
> MatILUFactorSym 1 1.0 2.1638e-03 3.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> MatAssemblyBegin 1 1.0 1.9466e-03 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 1 0 0 0 1 1 0 0 0 1 0
> MatAssemblyEnd 1 1.0 2.1234e-02 1.0 0.00e+00 0.0 4.0e+01 9.6e+01 8.0e+00 8 0 3 1 3 8 0 3 1 3 0
> MatGetRowIJ 1 1.0 1.0025e-0312.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> MatGetOrdering 1 1.0 1.4848e-03 2.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> VecTDot 154 1.0 4.1220e-03 1.8 1.06e+06 1.0 0.0e+00 0.0e+00 1.5e+02 1 17 0 0 62 1 17 0 0 62 2026
> VecNorm 78 1.0 1.5534e-01 1.0 5.38e+05 1.0 0.0e+00 0.0e+00 7.8e+01 60 9 0 0 31 60 9 0 0 31 27
> VecSet 79 1.0 1.5549e-03 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> VecAXPY 154 1.0 8.0559e-04 1.2 1.06e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 17 0 0 0 0 17 0 0 0 10368
> VecAYPX 76 1.0 5.8600e-04 1.4 5.24e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 8 0 0 0 0 8 0 0 0 7034
> VecScatterBegin 77 1.0 8.4793e-04 3.7 0.00e+00 0.0 1.5e+03 3.8e+02 0.0e+00 0 0 97 99 0 0 0 97 99 0 0
> VecScatterEnd 77 1.0 7.7019e-04 2.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> KSPSetUp 2 1.0 1.1451e-03 3.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> KSPSolve 1 1.0 1.8231e-01 1.0 7.49e+06 1.5 1.5e+03 3.8e+02 2.3e+02 71100 97 99 93 71100 97 99 94 272
> PCSetUp 2 1.0 1.0994e-02 1.1 3.37e+04 2.6 0.0e+00 0.0e+00 0.0e+00 4 0 0 0 0 4 0 0 0 0 17
> PCSetUpOnBlocks 1 1.0 4.9001e-03 1.2 3.37e+04 2.6 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 37
> PCApply 77 1.0 5.2556e-03 1.3 2.16e+06 2.4 0.0e+00 0.0e+00 0.0e+00 2 24 0 0 0 2 24 0 0 0 2296
>
> MatMult 78 1.0 1.2783e-02 4.8 1.16e+06 3.9 3.5e+03 2.5e+02 0.0e+00 1 25 98 99 0 1 25 98 99 0 968
> MatSolve 78 1.0 1.4015e-0214.0 1.14e+06 3.9 0.0e+00 0.0e+00 0.0e+00 0 24 0 0 0 0 24 0 0 0 867
> MatLUFactorNum 1 1.0 1.0275e-0240.1 1.76e+04 4.5 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 18
> MatILUFactorSym 1 1.0 2.0541e-0213.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0
> MatAssemblyBegin 1 1.0 2.1347e-02 3.4 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 1 0 0 0 1 1 0 0 0 1 0
> MatAssemblyEnd 1 1.0 1.5367e-01 1.1 0.00e+00 0.0 9.0e+01 6.5e+01 8.0e+00 12 0 2 1 3 12 0 2 1 3 0
> MatGetRowIJ 1 1.0 1.2759e-02159.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> MatGetOrdering 1 1.0 1.8199e-0221.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0
> VecTDot 156 1.0 1.3093e-02 6.1 5.45e+05 1.0 0.0e+00 0.0e+00 1.6e+02 1 17 0 0 62 1 17 0 0 62 646
> VecNorm 79 1.0 5.2373e-01 1.0 2.76e+05 1.0 0.0e+00 0.0e+00 7.9e+01 42 9 0 0 31 42 9 0 0 31 8
> VecSet 80 1.0 2.1215e-0229.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0
> VecAXPY 156 1.0 2.5283e-03 1.7 5.45e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 17 0 0 0 0 17 0 0 0 3346
> VecAYPX 77 1.0 1.5826e-03 2.6 2.69e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 8 0 0 0 0 8 0 0 0 2639
> VecScatterBegin 78 1.0 7.8273e-0326.8 0.00e+00 0.0 3.5e+03 2.5e+02 0.0e+00 0 0 98 99 0 0 0 98 99 0 0
> VecScatterEnd 78 1.0 4.8130e-0344.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> KSPSetUp 2 1.0 1.9786e-0232.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0
> KSPSolve 1 1.0 6.7540e-01 1.0 3.87e+06 1.8 3.5e+03 2.5e+02 2.4e+02 54100 98 99 93 54100 98 99 94 74
> PCSetUp 2 1.0 9.6539e-02 1.2 1.76e+04 4.5 0.0e+00 0.0e+00 0.0e+00 7 0 0 0 0 7 0 0 0 0 2
> PCSetUpOnBlocks 1 1.0 5.1548e-02 1.8 1.76e+04 4.5 0.0e+00 0.0e+00 0.0e+00 3 0 0 0 0 3 0 0 0 0 4
> PCApply 78 1.0 1.7296e-02 5.3 1.14e+06 3.9 0.0e+00 0.0e+00 0.0e+00 1 24 0 0 0 1 24 0 0 0 702
> ------------------------------------------------------------------------------------------------------------------------
>
>
>
>
> > On Nov 17, 2016, at 6:28 AM, Ivano Barletta <ibarletta at inogs.it> wrote:
> >
> > Dear Petsc users
> >
> > My aim is to replace the linear solver of an ocean model with PETSc, to
> > see whether there is room for performance improvement.
> >
> > The linear system arises from an elliptic equation, and the former solver
> > is a preconditioned conjugate gradient with simple diagonal
> > preconditioning.
> > The size of the matrix is roughly 27000.
> >
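> > (For reference, the PETSc equivalent of that solver is set up roughly as in
> > the following sketch, assuming the matrix A and the vectors xx, bb are
> > already assembled:)
> >
> > KSP ksp;
> > PC  pc;
> >
> > ierr = KSPCreate(PETSC_COMM_WORLD,&ksp);CHKERRQ(ierr);
> > ierr = KSPSetOperators(ksp,A,A);CHKERRQ(ierr);
> > ierr = KSPSetType(ksp,KSPCG);CHKERRQ(ierr);
> > ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
> > ierr = PCSetType(pc,PCJACOBI);CHKERRQ(ierr);  /* simple diagonal preconditioning */
> > ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);
> > ierr = KSPSolve(ksp,bb,xx);CHKERRQ(ierr);
> >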
> > Prior to nesting PETSc into the model, I've built a simple test case
> > where the same system is solved by both methods.
> >
> > I've noticed that, compared to the former solver (PCG), the PETSc
> > performance results are quite disappointing.
> >
> > PCG does not scale that much, but its solution time remains below
> > 4-5e-2 seconds.
> > The PETSc solution time, instead, increases the more CPUs I use
> > (see the output of -log_view in the attachments).
> >
> > I've only tried to change the KSP solver (gmres, cg, and bcgs, with no
> > improvement), and the preconditioning is the PETSc default. Maybe these
> > options don't suit my problem very well, but I don't think this alone
> > justifies this strange behavior.
> >
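> > (As an illustration, such a change of solver and preconditioner can also be
> > made entirely from the command line, e.g. -ksp_type cg -pc_type jacobi
> > -ksp_monitor to mimic the original diagonally preconditioned CG and watch
> > the iteration counts.)
> >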
> > I've tried to provide d_nnz and o_nnz with the exact number of nonzeros
> > in the preallocation phase, but there was no gain in this case either.
> >
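> > (A minimal sketch of that preallocation, with assumed names: nloc is the
> > number of local rows and d_nnz/o_nnz hold the exact per-row counts of
> > diagonal-block and off-diagonal-block nonzeros:)
> >
> > ierr = MatCreate(PETSC_COMM_WORLD,&A);CHKERRQ(ierr);
> > ierr = MatSetSizes(A,nloc,nloc,PETSC_DETERMINE,PETSC_DETERMINE);CHKERRQ(ierr);
> > ierr = MatSetType(A,MATMPIAIJ);CHKERRQ(ierr);
> > ierr = MatMPIAIJSetPreallocation(A,0,d_nnz,0,o_nnz);CHKERRQ(ierr);
> >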
> > At this point, my question is, what am I doing wrong?
> >
> > Do you think the problem is too small for PETSc to
> > have any effect?
> >
> > Thanks in advance
> > Ivano
> >
> > <petsc_time_8><petsc_time_4><petsc_time_16>
>
>