[petsc-users] Suspect Poor Performance of Petsc

Ivano Barletta ibarletta at inogs.it
Thu Nov 17 12:25:46 CST 2016


Thank you for your replies

I've carried out 3 new tests with 4, 8, and 16 cores, adding the
code lines suggested by Barry (logs in attachment).

The lack of scaling still persists, but maybe it is just related to the size
of the problem.

As far as load balancing is concerned, there is not much I can do. I
believe it depends on the land/sea points (on land the matrix coefficients
are zero), which is something I cannot control.

By the way, I intend to carry out some tests on higher-resolution
configurations, with a bigger problem size.

Kind Regards
Ivano


2016-11-17 17:24 GMT+01:00 Barry Smith <bsmith at mcs.anl.gov>:

>
>   Ivano,
>
> I have cut and pasted the relevant parts of the logs below and removed a
> few irrelevant lines to make the analysis simpler.
>
> There is a lot of bad stuff going on that is hurting performance.
>
> 1) The percentage of the time spent in the linear solve is getting lower, going
> from 79% with 4 processes to 54% with 16 processes. This means the rest of
> the code is not scaling well; most likely that is the matrix generation. How are
> you getting the matrix into the program? If you are reading it as ASCII
> (somehow in parallel?) you should not do that. You should use MatLoad() to
> read the matrix in efficiently (see for example src/ksp/ksp/examples/tutorials/ex10.c).
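>
> For illustration, a minimal sketch of what reading the matrix with MatLoad()
> looks like when the matrix has been written in PETSc binary format; the file
> name "Amat.bin" here is just a placeholder:
>
>   Mat         A;
>   PetscViewer viewer;
>
>   ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD,"Amat.bin",FILE_MODE_READ,&viewer);CHKERRQ(ierr);
>   ierr = MatCreate(PETSC_COMM_WORLD,&A);CHKERRQ(ierr);
>   ierr = MatSetFromOptions(A);CHKERRQ(ierr);
>   ierr = MatLoad(A,viewer);CHKERRQ(ierr);            /* each process receives its block of rows */
>   ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);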
>
> To make the analysis better you need to add the following around the KSPSolve():
>
>   PetscLogStage stage;
>
>   ierr = PetscLogStageRegister("Solve", &stage);CHKERRQ(ierr);
>   ierr = PetscLogStagePush(stage);CHKERRQ(ierr);
>   ierr = KSPSolve(ksp,bb,xx);CHKERRQ(ierr);
>   ierr = PetscLogStagePop();CHKERRQ(ierr);
>
> and rerun the three cases.
>
> 2) The load balance is bad even for four processes. For example, it is 1.3
> in the MatSolve, where it should be really close to 1.0. How are you dividing the
> matrix up between processes?
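>
> (For reference, if the matrix is created with PETSC_DECIDE for the local sizes,
> PETSc splits the rows roughly evenly, and MatGetOwnershipRange() reports what each
> process actually owns; note that an even split of rows does not guarantee an even
> split of nonzeros. A generic sketch, not your code, with N the global problem size:)
>
>   PetscInt rstart, rend;
>
>   ierr = MatCreate(PETSC_COMM_WORLD,&A);CHKERRQ(ierr);
>   ierr = MatSetSizes(A,PETSC_DECIDE,PETSC_DECIDE,N,N);CHKERRQ(ierr);  /* even row split */
>   ierr = MatSetFromOptions(A);CHKERRQ(ierr);
>   ierr = MatGetOwnershipRange(A,&rstart,&rend);CHKERRQ(ierr);         /* local rows [rstart, rend) */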
>
> 3) It is spending a HUGE amount of time in VecNorm(): 26% on 4 processes
> and 42% on 16 processes. This could be partially or completely due to the load
> imbalance, but there might be other issues.
>
> Run with -ksp_norm_type natural in your new set of runs.
>
> Also, always run with -ksp_type cg; since your matrix is symmetric (you were
> using PCG before), it makes no sense to use GMRES or the other KSP methods.
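>
> (Programmatically, those two options correspond roughly to:)
>
>   ierr = KSPSetType(ksp,KSPCG);CHKERRQ(ierr);                 /* -ksp_type cg */
>   ierr = KSPSetNormType(ksp,KSP_NORM_NATURAL);CHKERRQ(ierr);  /* -ksp_norm_type natural */
>   ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);                /* keep command-line overrides working */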
>
> Eagerly awaiting your response.
>
> Barry
>
>
>
> ------------------------------------------------------------------------------------------------------------------------
> Event                Count      Time (sec)     Flops        --- Global ---  --- Stage ---   Total
>                    Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
> ------------------------------------------------------------------------------------------------------------------------
>
> MatMult               75 1.0 4.1466e-03 1.2 3.70e+06 1.6 6.0e+02 4.4e+02 0.0e+00 13 25 97 99  0  13 25 97 99  0  2868
> MatSolve              75 1.0 6.1995e-03 1.3 3.68e+06 1.6 0.0e+00 0.0e+00 0.0e+00 19 24  0  0  0  19 24  0  0  0  1908
> MatLUFactorNum         1 1.0 3.6880e-04 1.4 5.81e+04 1.7 0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0   1  0  0  0  0   499
> MatILUFactorSym        1 1.0 1.7040e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0   1  0  0  0  0     0
> MatAssemblyBegin       1 1.0 2.5113e-04 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00  1  0  0  0  1   1  0  0  0  1     0
> MatAssemblyEnd         1 1.0 1.8365e-03 1.0 0.00e+00 0.0 1.6e+01 1.1e+02 8.0e+00  6  0  3  1  3   6  0  3  1  3     0
> MatGetRowIJ            1 1.0 2.2865e-0517.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MatGetOrdering         1 1.0 6.1687e-05 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecTDot              150 1.0 3.2991e-03 2.7 2.05e+06 1.0 0.0e+00 0.0e+00 1.5e+02  8 17  0  0 62   8 17  0  0 62  2466
> VecNorm               76 1.0 7.5034e-03 1.0 1.04e+06 1.0 0.0e+00 0.0e+00 7.6e+01 26  9  0  0 31  26  9  0  0 31   549
> VecSet                77 1.0 2.4495e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0   1  0  0  0  0     0
> VecAXPY              150 1.0 7.8158e-04 1.1 2.05e+06 1.0 0.0e+00 0.0e+00 0.0e+00  3 17  0  0  0   3 17  0  0  0 10409
> VecAYPX               74 1.0 6.8849e-04 1.0 1.01e+06 1.0 0.0e+00 0.0e+00 0.0e+00  2  8  0  0  0   2  8  0  0  0  5829
> VecScatterBegin       75 1.0 1.7794e-04 1.2 0.00e+00 0.0 6.0e+02 4.4e+02 0.0e+00  1  0 97 99  0   1  0 97 99  0     0
> VecScatterEnd         75 1.0 2.1674e-04 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0   1  0  0  0  0     0
> KSPSetUp               2 1.0 1.4922e-04 4.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> KSPSolve               1 1.0 2.2833e-02 1.0 1.36e+07 1.3 6.0e+02 4.4e+02 2.3e+02 79100 97 99 93  79100 97 99 93  2116
> PCSetUp                2 1.0 1.0116e-03 1.2 5.81e+04 1.7 0.0e+00 0.0e+00 0.0e+00  3  0  0  0  0   3  0  0  0  0   182
> PCSetUpOnBlocks        1 1.0 6.2872e-04 1.2 5.81e+04 1.7 0.0e+00 0.0e+00 0.0e+00  2  0  0  0  0   2  0  0  0  0   293
> PCApply               75 1.0 7.2835e-03 1.3 3.68e+06 1.6 0.0e+00 0.0e+00 0.0e+00 22 24  0  0  0  22 24  0  0  0  1624
>
> MatMult               77 1.0 3.5985e-03 1.2 2.18e+06 2.4 1.5e+03 3.8e+02 0.0e+00  1 25 97 99  0   1 25 97 99  0  3393
> MatSolve              77 1.0 3.8145e-03 1.4 2.16e+06 2.4 0.0e+00 0.0e+00 0.0e+00  1 24  0  0  0   1 24  0  0  0  3163
> MatLUFactorNum         1 1.0 9.3037e-04 1.9 3.37e+04 2.6 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0   196
> MatILUFactorSym        1 1.0 2.1638e-03 3.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MatAssemblyBegin       1 1.0 1.9466e-03 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00  1  0  0  0  1   1  0  0  0  1     0
> MatAssemblyEnd         1 1.0 2.1234e-02 1.0 0.00e+00 0.0 4.0e+01 9.6e+01 8.0e+00  8  0  3  1  3   8  0  3  1  3     0
> MatGetRowIJ            1 1.0 1.0025e-0312.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MatGetOrdering         1 1.0 1.4848e-03 2.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecTDot              154 1.0 4.1220e-03 1.8 1.06e+06 1.0 0.0e+00 0.0e+00 1.5e+02  1 17  0  0 62   1 17  0  0 62  2026
> VecNorm               78 1.0 1.5534e-01 1.0 5.38e+05 1.0 0.0e+00 0.0e+00 7.8e+01 60  9  0  0 31  60  9  0  0 31    27
> VecSet                79 1.0 1.5549e-03 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecAXPY              154 1.0 8.0559e-04 1.2 1.06e+06 1.0 0.0e+00 0.0e+00 0.0e+00  0 17  0  0  0   0 17  0  0  0 10368
> VecAYPX               76 1.0 5.8600e-04 1.4 5.24e+05 1.0 0.0e+00 0.0e+00 0.0e+00  0  8  0  0  0   0  8  0  0  0  7034
> VecScatterBegin       77 1.0 8.4793e-04 3.7 0.00e+00 0.0 1.5e+03 3.8e+02 0.0e+00  0  0 97 99  0   0  0 97 99  0     0
> VecScatterEnd         77 1.0 7.7019e-04 2.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> KSPSetUp               2 1.0 1.1451e-03 3.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> KSPSolve               1 1.0 1.8231e-01 1.0 7.49e+06 1.5 1.5e+03 3.8e+02 2.3e+02 71100 97 99 93  71100 97 99 94   272
> PCSetUp                2 1.0 1.0994e-02 1.1 3.37e+04 2.6 0.0e+00 0.0e+00 0.0e+00  4  0  0  0  0   4  0  0  0  0    17
> PCSetUpOnBlocks        1 1.0 4.9001e-03 1.2 3.37e+04 2.6 0.0e+00 0.0e+00 0.0e+00  2  0  0  0  0   2  0  0  0  0    37
> PCApply               77 1.0 5.2556e-03 1.3 2.16e+06 2.4 0.0e+00 0.0e+00 0.0e+00  2 24  0  0  0   2 24  0  0  0  2296
>
> MatMult               78 1.0 1.2783e-02 4.8 1.16e+06 3.9 3.5e+03 2.5e+02 0.0e+00  1 25 98 99  0   1 25 98 99  0   968
> MatSolve              78 1.0 1.4015e-0214.0 1.14e+06 3.9 0.0e+00 0.0e+00 0.0e+00  0 24  0  0  0   0 24  0  0  0   867
> MatLUFactorNum         1 1.0 1.0275e-0240.1 1.76e+04 4.5 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0    18
> MatILUFactorSym        1 1.0 2.0541e-0213.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0   1  0  0  0  0     0
> MatAssemblyBegin       1 1.0 2.1347e-02 3.4 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00  1  0  0  0  1   1  0  0  0  1     0
> MatAssemblyEnd         1 1.0 1.5367e-01 1.1 0.00e+00 0.0 9.0e+01 6.5e+01 8.0e+00 12  0  2  1  3  12  0  2  1  3     0
> MatGetRowIJ            1 1.0 1.2759e-02159.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MatGetOrdering         1 1.0 1.8199e-0221.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0   1  0  0  0  0     0
> VecTDot              156 1.0 1.3093e-02 6.1 5.45e+05 1.0 0.0e+00 0.0e+00 1.6e+02  1 17  0  0 62   1 17  0  0 62   646
> VecNorm               79 1.0 5.2373e-01 1.0 2.76e+05 1.0 0.0e+00 0.0e+00 7.9e+01 42  9  0  0 31  42  9  0  0 31     8
> VecSet                80 1.0 2.1215e-0229.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0   1  0  0  0  0     0
> VecAXPY              156 1.0 2.5283e-03 1.7 5.45e+05 1.0 0.0e+00 0.0e+00 0.0e+00  0 17  0  0  0   0 17  0  0  0  3346
> VecAYPX               77 1.0 1.5826e-03 2.6 2.69e+05 1.0 0.0e+00 0.0e+00 0.0e+00  0  8  0  0  0   0  8  0  0  0  2639
> VecScatterBegin       78 1.0 7.8273e-0326.8 0.00e+00 0.0 3.5e+03 2.5e+02 0.0e+00  0  0 98 99  0   0  0 98 99  0     0
> VecScatterEnd         78 1.0 4.8130e-0344.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> KSPSetUp               2 1.0 1.9786e-0232.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0   1  0  0  0  0     0
> KSPSolve               1 1.0 6.7540e-01 1.0 3.87e+06 1.8 3.5e+03 2.5e+02 2.4e+02 54100 98 99 93  54100 98 99 94    74
> PCSetUp                2 1.0 9.6539e-02 1.2 1.76e+04 4.5 0.0e+00 0.0e+00 0.0e+00  7  0  0  0  0   7  0  0  0  0     2
> PCSetUpOnBlocks        1 1.0 5.1548e-02 1.8 1.76e+04 4.5 0.0e+00 0.0e+00 0.0e+00  3  0  0  0  0   3  0  0  0  0     4
> PCApply               78 1.0 1.7296e-02 5.3 1.14e+06 3.9 0.0e+00 0.0e+00 0.0e+00  1 24  0  0  0   1 24  0  0  0   702
> ------------------------------------------------------------------------------------------------------------------------
>
>
>
>
> > On Nov 17, 2016, at 6:28 AM, Ivano Barletta <ibarletta at inogs.it> wrote:
> >
> > Dear Petsc users
> >
> > My aim is to replace the linear solver of an ocean model with Petsc, to see
> > whether there is room for improving performance.
> >
> > The linear system solves an elliptic equation, and the former solver is a
> > preconditioned conjugate gradient with a simple diagonal preconditioning.
> > The size of the matrix is roughly 27000.
> >
> > Before nesting Petsc into the model, I've built a simple test case where
> > the same system is solved by both of the methods.
> >
> > I've noticed that, compared to the former solver (PCG), the Petsc performance
> > results are quite disappointing.
> >
> > PCG does not scale that much, but its solution time remains below
> > 4-5e-2 seconds. The Petsc solution time, instead, increases the more CPUs
> > I use (see the output of -log_view in the attachments).
> >
> > I've only tried changing the KSP solver (gmres, cg, and bcgs, with no
> > improvement), and the preconditioning is the Petsc default. Maybe these
> > options don't suit my problem very well, but I don't think this alone
> > justifies this strange behavior.
> >
> > I've tried to provide d_nnz and o_nnz with the exact numbers of nonzeros in
> > the preallocation phase, but there was no gain in this case either.
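> >
> > (A generic sketch of the kind of preallocation I mean, with a hypothetical
> > local size nloc and the per-row counts in the arrays d_nnz/o_nnz:)
> >
> >   ierr = MatCreate(PETSC_COMM_WORLD,&A);CHKERRQ(ierr);
> >   ierr = MatSetSizes(A,nloc,nloc,PETSC_DETERMINE,PETSC_DETERMINE);CHKERRQ(ierr);
> >   ierr = MatSetType(A,MATMPIAIJ);CHKERRQ(ierr);
> >   ierr = MatMPIAIJSetPreallocation(A,0,d_nnz,0,o_nnz);CHKERRQ(ierr);  /* exact nonzero counts */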
> >
> > At this point, my question is, what am I doing wrong?
> >
> > Do you think that the problem is too small for Petsc to
> > have any effect?
> >
> > Thanks in advance
> > Ivano
> >
> > <petsc_time_8><petsc_time_4><petsc_time_16>
>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: petsc_log_8
Type: application/octet-stream
Size: 21848 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20161117/fbf8bb18/attachment-0003.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: petsc_log_4
Type: application/octet-stream
Size: 21633 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20161117/fbf8bb18/attachment-0004.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: petsc_log_16
Type: application/octet-stream
Size: 22089 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20161117/fbf8bb18/attachment-0005.obj>

