[petsc-users] Bad memory scaling with PETSc 3.10

Myriam Peyrounette myriam.peyrounette at idris.fr
Tue Mar 26 08:26:54 CDT 2019


I checked with -ksp_view (attached), but no prefix is associated with the
matrix. Some are associated with the KSP and PC, but none with the Mat.


On 03/26/19 at 11:55, Dave May wrote:
>
>
> On Tue, 26 Mar 2019 at 10:36, Myriam Peyrounette
> <myriam.peyrounette at idris.fr> wrote:
>
>     Oh you were right, the three options are unused (-matptap_via
>     scalable, -inner_offdiag_matmatmult_via scalable and
>     -inner_diag_matmatmult_via scalable). Does this mean I am not
>     using the associated PtAP functions?
>
>
> No - not necessarily. All it means is the options were not parsed. 
>
> If your matrices have an option prefix associated with them (e.g. abc),
> then you need to provide the option as
>   -abc_matptap_via scalable
>
> If you are not sure whether your matrices have a prefix, look at the
> result of -ksp_view (see below for an example)
>
>   Mat Object: 2 MPI processes
>     type: mpiaij
>     rows=363, cols=363, bs=3
>     total: nonzeros=8649, allocated nonzeros=8649
>     total number of mallocs used during MatSetValues calls =0
>   Mat Object: (B_) 2 MPI processes
>     type: mpiaij
>     rows=363, cols=363, bs=3
>     total: nonzeros=8649, allocated nonzeros=8649
>     total number of mallocs used during MatSetValues calls =0
>
>
> The first matrix has no options prefix, but the second does and it's
> called "B_".
>
>
>
>  
>
>     Myriam
>
>
>     On 03/26/19 at 11:10, Dave May wrote:
>>
>>     On Tue, 26 Mar 2019 at 09:52, Myriam Peyrounette via petsc-users
>>     <petsc-users at mcs.anl.gov> wrote:
>>
>>         How can I be sure they are indeed used? Can I print this
>>         information in some log file?
>>
>>     Yes. Re-run the job with the command line option
>>
>>     -options_left true
>>
>>     This will report all options parsed, and importantly, will also
>>     indicate if any options were unused.
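>>
>>     As a sketch of what to look for at the end of the run (the exact
>>     wording may vary between PETSc versions), an unused option shows
>>     up like:
>>
>>       WARNING! There are options you set that were not used!
>>       Option left: name:-matptap_via value: scalable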
>>      
>>
>>     Thanks
>>     Dave
>>
>>         Thanks in advance
>>
>>         Myriam
>>
>>
>>         On 03/25/19 at 18:24, Matthew Knepley wrote:
>>>         On Mon, Mar 25, 2019 at 10:54 AM Myriam Peyrounette via
>>>         petsc-users <petsc-users at mcs.anl.gov> wrote:
>>>
>>>             Hi,
>>>
>>>             Thanks for the explanations. I tried the latest PETSc
>>>             version (commit
>>>             fbc5705bc518d02a4999f188aad4ccff5f754cbf), which
>>>             includes the patch you talked about. But the memory
>>>             scaling shows no improvement (see scaling attached),
>>>             even when using the "scalable" options :(
>>>
>>>             I had a look at the PETSc functions
>>>             MatPtAPNumeric_MPIAIJ_MPIAIJ and
>>>             MatPtAPSymbolic_MPIAIJ_MPIAIJ (especially at the
>>>             differences before and after the first "bad" commit),
>>>             but I can't find what induced this memory issue.
>>>
>>>         Are you sure that the option was used? It just looks
>>>         suspicious to me that they use exactly the same amount of
>>>         memory. It should be different, even if it does not solve
>>>         the problem.
>>>
>>>            Thanks,
>>>
>>>              Matt 
>>>
>>>             Myriam
>>>
>>>
>>>
>>>
>>>             On 03/20/19 at 17:38, Fande Kong wrote:
>>>>             Hi Myriam,
>>>>
>>>>             There are three algorithms in PETSc to do PtAP (const
>>>>             char *algTypes[3] =
>>>>             {"scalable","nonscalable","hypre"};), which can be
>>>>             selected using the PETSc option -matptap_via xxxx.
>>>>
>>>>             (1) -matptap_via hypre: This calls the hypre package to
>>>>             do the PtAP through an all-at-once triple product. In
>>>>             our experience, it is the most memory-efficient, but
>>>>             can be slow.
>>>>
>>>>             (2) -matptap_via scalable: This involves a row-wise
>>>>             algorithm plus an outer product. This will use more
>>>>             memory than hypre, but is much faster. It used to have a
>>>>             bug that could take all your memory, and I have a fix
>>>>             at https://bitbucket.org/petsc/petsc/pull-requests/1452/mpiptap-enable-large-scale-simulations/diff.
>>>>             When using this option, you may want to add the extra
>>>>             options -inner_offdiag_matmatmult_via scalable and
>>>>             -inner_diag_matmatmult_via scalable to select
>>>>             scalable inner algorithms.
>>>>
>>>>             (3) -matptap_via nonscalable: Supposed to be even
>>>>             faster, but uses more memory. It performs dense matrix
>>>>             operations. A combined command line is sketched below.
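>>>>
>>>>             For instance (the executable name and process count are
>>>>             placeholders), a run selecting the scalable algorithms
>>>>             throughout could look like:
>>>>
>>>>               mpiexec -n 400 ./ex42 \
>>>>                   -matptap_via scalable \
>>>>                   -inner_diag_matmatmult_via scalable \
>>>>                   -inner_offdiag_matmatmult_via scalable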
>>>>
>>>>
>>>>             Thanks,
>>>>
>>>>             Fande Kong
>>>>
>>>>
>>>>
>>>>
>>>>             On Wed, Mar 20, 2019 at 10:06 AM Myriam Peyrounette via
>>>>             petsc-users <petsc-users at mcs.anl.gov> wrote:
>>>>
>>>>                 More precisely: something happens when upgrading
>>>>                 the functions MatPtAPNumeric_MPIAIJ_MPIAIJ and/or
>>>>                 MatPtAPSymbolic_MPIAIJ_MPIAIJ.
>>>>
>>>>                 Unfortunately, there are a lot of differences
>>>>                 between the old and new versions of these
>>>>                 functions. I keep investigating but if you have any
>>>>                 idea, please let me know.
>>>>
>>>>                 Best,
>>>>
>>>>                 Myriam
>>>>
>>>>
>>>>                 On 03/20/19 at 13:48, Myriam Peyrounette wrote:
>>>>>
>>>>>                 Hi all,
>>>>>
>>>>>                 I used git bisect to determine when the memory
>>>>>                 usage increased. I found that the first "bad"
>>>>>                 commit is aa690a28a7284adb519c28cb44eae20a2c131c85.
>>>>>
>>>>>                 Barry was right, this commit seems to involve
>>>>>                 changes to MatPtAPSymbolic_MPIAIJ_MPIAIJ. You
>>>>>                 mentioned the option "-matptap_via scalable" but I
>>>>>                 can't find any information about it. Can you tell
>>>>>                 me more?
>>>>>
>>>>>                 Thanks
>>>>>
>>>>>                 Myriam
>>>>>
>>>>>
>>>>>                 On 03/11/19 at 14:40, Mark Adams wrote:
>>>>>>                 Is there a difference in memory usage on your
>>>>>>                 tiny problem? I assume no.
>>>>>>
>>>>>>                 I don't see anything that could come from GAMG
>>>>>>                 other than the RAP stuff that you have discussed
>>>>>>                 already.
>>>>>>
>>>>>>                 On Mon, Mar 11, 2019 at 9:32 AM Myriam
>>>>>>                 Peyrounette <myriam.peyrounette at idris.fr> wrote:
>>>>>>
>>>>>>                     The code I am using here is example 42 of
>>>>>>                     PETSc
>>>>>>                     (https://www.mcs.anl.gov/petsc/petsc-3.9/src/ksp/ksp/examples/tutorials/ex42.c.html).
>>>>>>                     Indeed, it solves the Stokes equation. I
>>>>>>                     thought it was a good idea to use an example
>>>>>>                     you might know (and I didn't find any that use
>>>>>>                     GAMG functions). I just changed the PCMG
>>>>>>                     setup so that the memory problem appears, and
>>>>>>                     it appears when adding PCGAMG.
>>>>>>
>>>>>>                     I don't care about the performance or even
>>>>>>                     the correctness of the result here, but only
>>>>>>                     about the difference in memory use between
>>>>>>                     3.6 and 3.10. Do you think finding a more
>>>>>>                     suitable script would help?
>>>>>>
>>>>>>                     I used a threshold of 0.1 only once, at the
>>>>>>                     beginning, to test its influence. I used the
>>>>>>                     default threshold (0, I guess) for all the
>>>>>>                     other runs.
>>>>>>
>>>>>>                     Myriam
>>>>>>
>>>>>>
>>>>>>                     On 03/11/19 at 13:52, Mark Adams wrote:
>>>>>>>                     In looking at this larger scale run ...
>>>>>>>
>>>>>>>                     * Your eigen estimates are much lower than
>>>>>>>                     in your tiny test problem. But this is Stokes
>>>>>>>                     apparently, and it should not work anyway.
>>>>>>>                     Maybe you have a small time step that adds a
>>>>>>>                     lot of mass and brings the eigen estimates
>>>>>>>                     down. And your min eigenvalue (not used) is
>>>>>>>                     positive. I would expect negative for Stokes ...
>>>>>>>
>>>>>>>                     * You seem to be setting a threshold value
>>>>>>>                     of 0.1 -- that is very high
>>>>>>>
>>>>>>>                     * v3.6 says "using nonzero initial guess"
>>>>>>>                     but this is not in v3.10. Maybe we just
>>>>>>>                     stopped printing that.
>>>>>>>
>>>>>>>                     * There were some changes to coarsening
>>>>>>>                     parameters in going from v3.6, but it does
>>>>>>>                     not look like your problem was affected.
>>>>>>>                     (The coarsening algorithm is non-deterministic
>>>>>>>                     by default, and you can see small differences
>>>>>>>                     between runs.)
>>>>>>>
>>>>>>>                     * We may have also added a "noisy" RHS for
>>>>>>>                     eigen estimates by default since v3.6.
>>>>>>>
>>>>>>>                     * And for non-symmetric problems you can try
>>>>>>>                     -pc_gamg_agg_nsmooths 0 (see the sketch
>>>>>>>                     below), but again GAMG is not built for
>>>>>>>                     Stokes anyway.
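>>>>>>>
>>>>>>>                     As a minimal command-line sketch of these
>>>>>>>                     suggestions (the executable name and the
>>>>>>>                     threshold value are placeholders):
>>>>>>>
>>>>>>>                       ./ex42 -pc_type gamg \
>>>>>>>                           -pc_gamg_agg_nsmooths 0 \
>>>>>>>                           -pc_gamg_threshold 0.0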
>>>>>>>
>>>>>>>
>>>>>>>                     On Tue, Mar 5, 2019 at 11:53 AM Myriam
>>>>>>>                     Peyrounette <myriam.peyrounette at idris.fr> wrote:
>>>>>>>
>>>>>>>                         I used PCView to display the size of the
>>>>>>>                         linear system at each level of the MG.
>>>>>>>                         You'll find the outputs attached to this
>>>>>>>                         mail (zip file) for both the default
>>>>>>>                         threshold value and a value of 0.1, and
>>>>>>>                         for both PETSc versions 3.6 and 3.10.
>>>>>>>
>>>>>>>                         For convenience, I summarized the
>>>>>>>                         information in a graph, also attached
>>>>>>>                         (png file).
>>>>>>>
>>>>>>>                         As you can see, there are slight
>>>>>>>                         differences between the two versions but
>>>>>>>                         none is critical, in my opinion. Do you
>>>>>>>                         see anything suspicious in the outputs?
>>>>>>>
>>>>>>>                         Also, I can't find the default threshold
>>>>>>>                         value. Do you know where I can find it?
>>>>>>>
>>>>>>>                         Thanks for the follow-up
>>>>>>>
>>>>>>>                         Myriam
>>>>>>>
>>>>>>>
>>>>>>>                         On 03/05/19 at 14:06, Matthew Knepley wrote:
>>>>>>>>                         On Tue, Mar 5, 2019 at 7:14 AM Myriam
>>>>>>>>                         Peyrounette
>>>>>>>>                         <myriam.peyrounette at idris.fr>
>>>>>>>>                         wrote:
>>>>>>>>
>>>>>>>>                             Hi Matt,
>>>>>>>>
>>>>>>>>                             I plotted the memory scalings using
>>>>>>>>                             different threshold values. The two
>>>>>>>>                             scalings are slightly shifted (by
>>>>>>>>                             -22 to -88 MB), but this gain is
>>>>>>>>                             negligible. The 3.6 scaling remains
>>>>>>>>                             robust while the 3.10 scaling
>>>>>>>>                             deteriorates.
>>>>>>>>
>>>>>>>>                             Do you have any other suggestion?
>>>>>>>>
>>>>>>>>                         Mark, what is the option she can give
>>>>>>>>                         to output all the GAMG data?
>>>>>>>>
>>>>>>>>                         Also, run using -ksp_view. GAMG will
>>>>>>>>                         report all the sizes of its grids, so
>>>>>>>>                         it should be easy to see
>>>>>>>>                         if the coarse grid sizes are
>>>>>>>>                         increasing, and also what the effect of
>>>>>>>>                         the threshold value is.
>>>>>>>>
>>>>>>>>                           Thanks,
>>>>>>>>
>>>>>>>>                              Matt 
>>>>>>>>
>>>>>>>>                             Thanks
>>>>>>>>
>>>>>>>>                             Myriam
>>>>>>>>
>>>>>>>>                             On 03/02/19 at 02:27, Matthew Knepley wrote:
>>>>>>>>>                             On Fri, Mar 1, 2019 at 10:53 AM
>>>>>>>>>                             Myriam Peyrounette via petsc-users
>>>>>>>>>                             <petsc-users at mcs.anl.gov>
>>>>>>>>>                             wrote:
>>>>>>>>>
>>>>>>>>>                                 Hi,
>>>>>>>>>
>>>>>>>>>                                 I used to run my code with
>>>>>>>>>                                 PETSc 3.6. Since I upgraded
>>>>>>>>>                                 PETSc to 3.10, this code has
>>>>>>>>>                                 shown bad memory scaling.
>>>>>>>>>
>>>>>>>>>                                 To report this issue, I took
>>>>>>>>>                                 the PETSc script ex42.c and
>>>>>>>>>                                 slightly modified it so that
>>>>>>>>>                                 the KSP and PC configurations
>>>>>>>>>                                 are the same as in my code. In
>>>>>>>>>                                 particular, I use a "customized"
>>>>>>>>>                                 multigrid method. The
>>>>>>>>>                                 modifications are indicated by
>>>>>>>>>                                 the keyword "TopBridge" in the
>>>>>>>>>                                 attached scripts.
>>>>>>>>>
>>>>>>>>>                                 To plot the memory (weak)
>>>>>>>>>                                 scaling, I ran four
>>>>>>>>>                                 calculations for each script,
>>>>>>>>>                                 with increasing problem sizes
>>>>>>>>>                                 and numbers of cores:
>>>>>>>>>
>>>>>>>>>                                 1. 100,000 elts on 4 cores
>>>>>>>>>                                 2. 1 million elts on 40 cores
>>>>>>>>>                                 3. 10 million elts on 400 cores
>>>>>>>>>                                 4. 100 million elts on 4,000
>>>>>>>>>                                 cores
>>>>>>>>>
>>>>>>>>>                                 The resulting graph is also
>>>>>>>>>                                 attached. The scaling using
>>>>>>>>>                                 PETSc 3.10
>>>>>>>>>                                 clearly deteriorates for large
>>>>>>>>>                                 cases, while the one using
>>>>>>>>>                                 PETSc 3.6 is
>>>>>>>>>                                 robust.
>>>>>>>>>
>>>>>>>>>                                 After a few tests, I found
>>>>>>>>>                                 that the scaling is mostly
>>>>>>>>>                                 sensitive to the use of the
>>>>>>>>>                                 AMG method for the coarse grid
>>>>>>>>>                                 (line 1780 in
>>>>>>>>>                                 main_ex42_petsc36.cc). In
>>>>>>>>>                                 particular, the performance
>>>>>>>>>                                 deteriorates strongly when
>>>>>>>>>                                 commenting out lines 1777 to
>>>>>>>>>                                 1790 (in main_ex42_petsc36.cc).
>>>>>>>>>
>>>>>>>>>                                 Do you have any idea of what
>>>>>>>>>                                 changed between versions 3.6
>>>>>>>>>                                 and 3.10 that may cause such a
>>>>>>>>>                                 degradation?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>                             I believe the default values for
>>>>>>>>>                             PCGAMG changed between versions.
>>>>>>>>>                             It sounds like the coarsening rate
>>>>>>>>>                             is not high enough, so these
>>>>>>>>>                             grids are too large. This can be
>>>>>>>>>                             set using:
>>>>>>>>>
>>>>>>>>>                               https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCGAMGSetThreshold.html
>>>>>>>>>
>>>>>>>>>                             There is some explanation of this
>>>>>>>>>                             effect on that page. Let us know
>>>>>>>>>                             if setting this does not correct
>>>>>>>>>                             the situation.
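>>>>>>>>>
>>>>>>>>>                             As a minimal sketch (the value
>>>>>>>>>                             0.05 is illustrative only, and pc
>>>>>>>>>                             is assumed to be your GAMG
>>>>>>>>>                             preconditioner):
>>>>>>>>>
>>>>>>>>>                               PetscReal th[1] = {0.05};
>>>>>>>>>                               PCGAMGSetThreshold(pc, th, 1);
>>>>>>>>>
>>>>>>>>>                             or, equivalently, on the command
>>>>>>>>>                             line: -pc_gamg_threshold 0.05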
>>>>>>>>>
>>>>>>>>>                               Thanks,
>>>>>>>>>
>>>>>>>>>                                  Matt
>>>>>>>>>                              
>>>>>>>>>
>>>>>>>>>                                 Let me know if you need
>>>>>>>>>                                 further information.
>>>>>>>>>
>>>>>>>>>                                 Best,
>>>>>>>>>
>>>>>>>>>                                 Myriam Peyrounette
>>>>>>>>>
>>>>>>>>>

-- 
Myriam Peyrounette
CNRS/IDRIS - HLST
--
