[petsc-users] Bad memory scaling with PETSc 3.10
Myriam Peyrounette
myriam.peyrounette at idris.fr
Tue Mar 26 10:16:32 CDT 2019
*SetFromOptions() was not called indeed... Thanks! The code performs
better now in terms of memory usage!
I still have to plot the memory scaling on bigger cases to see whether it
shows the same good behaviour as with the 3.6 version.
I'll let you know as soon as I have plotted it.
Thanks again
Myriam
On 03/26/19 at 14:30, Matthew Knepley wrote:
> On Tue, Mar 26, 2019 at 9:27 AM Myriam Peyrounette
> <myriam.peyrounette at idris.fr> wrote:
>
> I checked with -ksp_view (attached) but no prefix is associated
> with the matrix. Some prefixes are associated with the KSP and PC, but
> none with the Mat.
>
> Another thing that could prevent options being used is that
> *SetFromOptions() is not called for the object.
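>
> For example, a minimal sketch (the matrix name and sizes are
> illustrative):
>
>   Mat A;
>   PetscInt n = 363;
>   MatCreate(PETSC_COMM_WORLD, &A);
>   MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n);
>   MatSetType(A, MATMPIAIJ);
>   /* Without this call, command line options aimed at A
>      (e.g. -matptap_via) are never parsed */
>   MatSetFromOptions(A);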
>
> Thanks,
>
> Matt
>
>
> On 03/26/19 at 11:55, Dave May wrote:
>>
>>
>> On Tue, 26 Mar 2019 at 10:36, Myriam Peyrounette
>> <myriam.peyrounette at idris.fr> wrote:
>>
>> Oh you were right, the three options are unused
>> (-matptap_via scalable, -inner_offdiag_matmatmult_via
>> scalable and -inner_diag_matmatmult_via scalable). Does this
>> mean I am not using the associated PtAP functions?
>>
>>
>> No - not necessarily. All it means is the options were not parsed.
>>
>> If your matrices have an options prefix associated with them (e.g.
>> abc), then you need to provide the option as
>> -abc_matptap_via scalable
>>
>> If you are not sure whether your matrices have a prefix, look at the
>> result of -ksp_view (see below for an example).
>>
>> Mat Object: 2 MPI processes
>>   type: mpiaij
>>   rows=363, cols=363, bs=3
>>   total: nonzeros=8649, allocated nonzeros=8649
>>   total number of mallocs used during MatSetValues calls =0
>> Mat Object: (B_) 2 MPI processes
>>   type: mpiaij
>>   rows=363, cols=363, bs=3
>>   total: nonzeros=8649, allocated nonzeros=8649
>>   total number of mallocs used during MatSetValues calls =0
>>
>>
>> The first matrix has no options prefix, but the second does and
>> it's called "B_".
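>>
>> As a sketch, such a prefix would have been attached in the code with
>> MatSetOptionsPrefix() (names are illustrative):
>>
>>   Mat B;
>>   MatCreate(PETSC_COMM_WORLD, &B);
>>   /* every option aimed at B must now start with -B_ */
>>   MatSetOptionsPrefix(B, "B_");
>>   MatSetFromOptions(B);
>>
>> With that prefix in place, the option to give is
>> -B_matptap_via scalable.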
>>
>> Myriam
>>
>>
>> On 03/26/19 at 11:10, Dave May wrote:
>>>
>>> On Tue, 26 Mar 2019 at 09:52, Myriam Peyrounette via
>>> petsc-users <petsc-users at mcs.anl.gov> wrote:
>>>
>>> How can I be sure they are indeed used? Can I print this
>>> information in some log file?
>>>
>>> Yes. Re-run the job with the command line option
>>>
>>> -options_left true
>>>
>>> This will report all options parsed, and importantly, will
>>> also indicate if any options were unused.
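>>>
>>> For example (the executable name is illustrative):
>>>
>>>   mpiexec -n 4 ./ex42 -matptap_via scalable -options_left true
>>>
>>> Any option listed as unused at the end of the run was not parsed.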
>>>
>>>
>>> Thanks
>>> Dave
>>>
>>> Thanks in advance
>>>
>>> Myriam
>>>
>>>
>>> On 03/25/19 at 18:24, Matthew Knepley wrote:
>>>> On Mon, Mar 25, 2019 at 10:54 AM Myriam Peyrounette via
>>>> petsc-users <petsc-users at mcs.anl.gov> wrote:
>>>>
>>>> Hi,
>>>>
>>>> thanks for the explanations. I tried the latest PETSc
>>>> version (commit
>>>> fbc5705bc518d02a4999f188aad4ccff5f754cbf), which
>>>> includes the patch you talked about. But the memory
>>>> scaling shows no improvement (see scaling
>>>> attached), even when using the "scalable" options :(
>>>>
>>>> I had a look at the PETSc functions
>>>> MatPtAPNumeric_MPIAIJ_MPIAIJ and
>>>> MatPtAPSymbolic_MPIAIJ_MPIAIJ (especially at the
>>>> differences before and after the first "bad"
>>>> commit), but I can't find what induced this memory
>>>> issue.
>>>>
>>>> Are you sure that the option was used? It just looks
>>>> suspicious to me that they use exactly the same amount
>>>> of memory. It should be different, even if it does not
>>>> solve the problem.
>>>>
>>>> Thanks,
>>>>
>>>> Matt
>>>>
>>>> Myriam
>>>>
>>>>
>>>>
>>>>
>>>> On 03/20/19 at 17:38, Fande Kong wrote:
>>>>> Hi Myriam,
>>>>>
>>>>> There are three algorithms in PETSc to do PtAP
>>>>> (const char *algTypes[3] =
>>>>> {"scalable","nonscalable","hypre"};), which can be
>>>>> selected using the PETSc option -matptap_via xxxx.
>>>>>
>>>>> (1) -matptap_via hypre: This calls the hypre
>>>>> package to do the PtAP through an all-at-once
>>>>> triple product. In our experience, it is the most
>>>>> memory efficient, but can be slow.
>>>>>
>>>>> (2) -matptap_via scalable: This involves a
>>>>> row-wise algorithm plus an outer product. It
>>>>> uses more memory than hypre, but is much faster.
>>>>> This used to have a bug that could take all your
>>>>> memory, and I have a fix
>>>>> at https://bitbucket.org/petsc/petsc/pull-requests/1452/mpiptap-enable-large-scale-simulations/diff.
>>>>> When using this option, we may want to have extra
>>>>> options such as -inner_offdiag_matmatmult_via
>>>>> scalable -inner_diag_matmatmult_via scalable to
>>>>> select inner scalable algorithms.
>>>>>
>>>>> (3) -matptap_via nonscalable: Supposed to be even
>>>>> faster, but uses more memory. It does dense matrix
>>>>> operations.
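>>>>>
>>>>> For reference, the three variants would be
>>>>> selected like this (executable name
>>>>> illustrative):
>>>>>
>>>>>   ./ex42 -matptap_via hypre
>>>>>   ./ex42 -matptap_via nonscalable
>>>>>   ./ex42 -matptap_via scalable \
>>>>>     -inner_offdiag_matmatmult_via scalable \
>>>>>     -inner_diag_matmatmult_via scalable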
>>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Fande Kong
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Wed, Mar 20, 2019 at 10:06 AM Myriam
>>>>> Peyrounette via petsc-users
>>>>> <petsc-users at mcs.anl.gov> wrote:
>>>>>
>>>>> More precisely: something happens when
>>>>> upgrading the functions
>>>>> MatPtAPNumeric_MPIAIJ_MPIAIJ and/or
>>>>> MatPtAPSymbolic_MPIAIJ_MPIAIJ.
>>>>>
>>>>> Unfortunately, there are a lot of differences
>>>>> between the old and new versions of these
>>>>> functions. I keep investigating but if you
>>>>> have any idea, please let me know.
>>>>>
>>>>> Best,
>>>>>
>>>>> Myriam
>>>>>
>>>>>
>>>>> On 03/20/19 at 13:48, Myriam Peyrounette wrote:
>>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I used git bisect to determine when the
>>>>>> memory requirements increased. I found that the
>>>>>> first "bad" commit is
>>>>>> aa690a28a7284adb519c28cb44eae20a2c131c85.
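>>>>>>
>>>>>> For reference, the bisection went roughly
>>>>>> like this (a sketch; the exact good/bad
>>>>>> refs depend on your checkout):
>>>>>>
>>>>>>   git bisect start
>>>>>>   git bisect bad HEAD    # regression present
>>>>>>   git bisect good v3.6   # scaling still good
>>>>>>   # rebuild, rerun the test case, then mark
>>>>>>   # the commit with git bisect good/bad and
>>>>>>   # repeat until the first bad commit is found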
>>>>>>
>>>>>> Barry was right, this commit seems to be
>>>>>> about an evolution of
>>>>>> MatPtAPSymbolic_MPIAIJ_MPIAIJ. You mentioned
>>>>>> the option "-matptap_via scalable" but I
>>>>>> can't find any information about it. Can you
>>>>>> tell me more?
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> Myriam
>>>>>>
>>>>>>
>>>>>> On 03/11/19 at 14:40, Mark Adams wrote:
>>>>>>> Is there a difference in memory usage on
>>>>>>> your tiny problem? I assume no.
>>>>>>>
>>>>>>> I don't see anything that could come from
>>>>>>> GAMG other than the RAP stuff that you have
>>>>>>> discussed already.
>>>>>>>
>>>>>>> On Mon, Mar 11, 2019 at 9:32 AM Myriam
>>>>>>> Peyrounette <myriam.peyrounette at idris.fr> wrote:
>>>>>>>
>>>>>>> The code I am using here is example
>>>>>>> ex42 of PETSc
>>>>>>> (https://www.mcs.anl.gov/petsc/petsc-3.9/src/ksp/ksp/examples/tutorials/ex42.c.html).
>>>>>>> Indeed it solves the Stokes equation. I
>>>>>>> thought it was a good idea to use an
>>>>>>> example you might know (and didn't find
>>>>>>> any that uses GAMG functions). I just
>>>>>>> changed the PCMG setup so that the
>>>>>>> memory problem appears. And it appears
>>>>>>> when adding PCGAMG.
>>>>>>>
>>>>>>> I don't care about the performance or
>>>>>>> even the correctness of the result here, but
>>>>>>> only about the difference in memory use
>>>>>>> between 3.6 and 3.10. Do you think
>>>>>>> finding a more suitable example would help?
>>>>>>>
>>>>>>> I used the threshold of 0.1 only once,
>>>>>>> at the beginning, to test its influence.
>>>>>>> I used the default threshold (of 0, I
>>>>>>> guess) for all the other runs.
>>>>>>>
>>>>>>> Myriam
>>>>>>>
>>>>>>>
>>>>>>> On 03/11/19 at 13:52, Mark Adams wrote:
>>>>>>>> In looking at this larger scale run ...
>>>>>>>>
>>>>>>>> * Your eigen estimates are much lower
>>>>>>>> than your tiny test problem. But this
>>>>>>>> is Stokes apparently and it should not
>>>>>>>> work anyway. Maybe you have a small
>>>>>>>> time step that adds a lot of mass that
>>>>>>>> brings the eigen estimates down. And
>>>>>>>> your min eigenvalue (not used) is
>>>>>>>> positive. I would expect negative for
>>>>>>>> Stokes ...
>>>>>>>>
>>>>>>>> * You seem to be setting a threshold
>>>>>>>> value of 0.1 -- that is very high
>>>>>>>>
>>>>>>>> * v3.6 says "using nonzero initial
>>>>>>>> guess" but this is not in v3.10. Maybe
>>>>>>>> we just stopped printing that.
>>>>>>>>
>>>>>>>> * There were some changes to coarsening
>>>>>>>> parameters in going from v3.6 but it
>>>>>>>> does not look like your problem was
>>>>>>>> affected. (The coarsening algorithm is
>>>>>>>> non-deterministic by default and you
>>>>>>>> can see small differences on different runs)
>>>>>>>>
>>>>>>>> * We may have also added a "noisy" RHS
>>>>>>>> for eigen estimates by default since v3.6.
>>>>>>>>
>>>>>>>> * And for non-symmetric problems you can
>>>>>>>> try -pc_gamg_agg_nsmooths 0, but again
>>>>>>>> GAMG is not built for Stokes anyway.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Mar 5, 2019 at 11:53 AM Myriam
>>>>>>>> Peyrounette
>>>>>>>> <myriam.peyrounette at idris.fr> wrote:
>>>>>>>>
>>>>>>>> I used PCView to display the size
>>>>>>>> of the linear system in each level
>>>>>>>> of the MG. You'll find the outputs
>>>>>>>> attached to this mail (zip file)
>>>>>>>> for both the default threshold
>>>>>>>> value and a value of 0.1, and for
>>>>>>>> both 3.6 and 3.10 PETSc versions.
>>>>>>>>
>>>>>>>> For convenience, I summarized the
>>>>>>>> information in a graph, also
>>>>>>>> attached (png file).
>>>>>>>>
>>>>>>>> As you can see, there are slight
>>>>>>>> differences between the two
>>>>>>>> versions but none is critical, in
>>>>>>>> my opinion. Do you see anything
>>>>>>>> suspicious in the outputs?
>>>>>>>>
>>>>>>>> Also, I can't find the default
>>>>>>>> threshold value. Do you know where
>>>>>>>> I can find it?
>>>>>>>>
>>>>>>>> Thanks for the follow-up
>>>>>>>>
>>>>>>>> Myriam
>>>>>>>>
>>>>>>>>
>>>>>>>> On 03/05/19 at 14:06, Matthew Knepley wrote:
>>>>>>>>> On Tue, Mar 5, 2019 at 7:14 AM
>>>>>>>>> Myriam Peyrounette
>>>>>>>>> <myriam.peyrounette at idris.fr> wrote:
>>>>>>>>>
>>>>>>>>> Hi Matt,
>>>>>>>>>
>>>>>>>>> I plotted the memory scalings
>>>>>>>>> using different threshold
>>>>>>>>> values. The two scalings are
>>>>>>>>> slightly shifted (by -22
>>>>>>>>> to -88 MB) but this gain is
>>>>>>>>> negligible. The 3.6 scaling
>>>>>>>>> remains robust while the
>>>>>>>>> 3.10 scaling deteriorates.
>>>>>>>>>
>>>>>>>>> Do you have any other suggestion?
>>>>>>>>>
>>>>>>>>> Mark, what is the option she can
>>>>>>>>> give to output all the GAMG data?
>>>>>>>>>
>>>>>>>>> Also, run using -ksp_view. GAMG
>>>>>>>>> will report all the sizes of its
>>>>>>>>> grids, so it should be easy to see
>>>>>>>>> if the coarse grid sizes are
>>>>>>>>> increasing, and also what the
>>>>>>>>> effect of the threshold value is.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Matt
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>>
>>>>>>>>> Myriam
>>>>>>>>>
>>>>>>>>> On 03/02/19 at 02:27, Matthew Knepley wrote:
>>>>>>>>>> On Fri, Mar 1, 2019 at 10:53
>>>>>>>>>> AM Myriam Peyrounette via
>>>>>>>>>> petsc-users
>>>>>>>>>> <petsc-users at mcs.anl.gov> wrote:
>>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I used to run my code
>>>>>>>>>> with PETSc 3.6. Since I
>>>>>>>>>> upgraded the PETSc version
>>>>>>>>>> to 3.10, this code has a
>>>>>>>>>> bad memory scaling.
>>>>>>>>>>
>>>>>>>>>> To report this issue, I
>>>>>>>>>> took the PETSc script
>>>>>>>>>> ex42.c and slightly
>>>>>>>>>> modified it so that the
>>>>>>>>>> KSP and PC configurations
>>>>>>>>>> are the same as in my
>>>>>>>>>> code. In particular, I
>>>>>>>>>> use a "personalised"
>>>>>>>>>> multi-grid method. The
>>>>>>>>>> modifications are
>>>>>>>>>> indicated by the keyword
>>>>>>>>>> "TopBridge" in the attached
>>>>>>>>>> scripts.
>>>>>>>>>>
>>>>>>>>>> To plot the memory (weak)
>>>>>>>>>> scaling, I ran four
>>>>>>>>>> calculations for each
>>>>>>>>>> script with increasing
>>>>>>>>>> problem sizes and
>>>>>>>>>> compute cores:
>>>>>>>>>>
>>>>>>>>>> 1. 100,000 elts on 4 cores
>>>>>>>>>> 2. 1 million elts on 40 cores
>>>>>>>>>> 3. 10 million elts on 400 cores
>>>>>>>>>> 4. 100 million elts on 4,000 cores
>>>>>>>>>>
>>>>>>>>>> The resulting graph is
>>>>>>>>>> also attached. The
>>>>>>>>>> scaling using PETSc 3.10
>>>>>>>>>> clearly deteriorates for
>>>>>>>>>> large cases, while the
>>>>>>>>>> one using PETSc 3.6 is
>>>>>>>>>> robust.
>>>>>>>>>>
>>>>>>>>>> After a few tests, I
>>>>>>>>>> found that the scaling is
>>>>>>>>>> mostly sensitive to the
>>>>>>>>>> use of the AMG method for
>>>>>>>>>> the coarse grid (line 1780 in
>>>>>>>>>> main_ex42_petsc36.cc). In
>>>>>>>>>> particular, the
>>>>>>>>>> performance strongly
>>>>>>>>>> deteriorates when
>>>>>>>>>> commenting lines 1777 to
>>>>>>>>>> 1790 (in
>>>>>>>>>> main_ex42_petsc36.cc).
>>>>>>>>>>
>>>>>>>>>> Do you have any idea of
>>>>>>>>>> what changed between
>>>>>>>>>> version 3.6 and version
>>>>>>>>>> 3.10 that may imply such
>>>>>>>>>> degradation?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I believe the default values
>>>>>>>>>> for PCGAMG changed between
>>>>>>>>>> versions. It sounds like the
>>>>>>>>>> coarsening rate
>>>>>>>>>> is not great enough, so that
>>>>>>>>>> these grids are too large.
>>>>>>>>>> This can be set using:
>>>>>>>>>>
>>>>>>>>>> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCGAMGSetThreshold.html
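>>>>>>>>>>
>>>>>>>>>> A minimal sketch of setting it
>>>>>>>>>> in code (the value 0.05 is only
>>>>>>>>>> illustrative, and the signature
>>>>>>>>>> may differ between PETSc
>>>>>>>>>> versions):
>>>>>>>>>>
>>>>>>>>>>   PetscReal th = 0.05;
>>>>>>>>>>   /* one threshold value given */
>>>>>>>>>>   PCGAMGSetThreshold(pc, &th, 1);
>>>>>>>>>>
>>>>>>>>>> The equivalent command line
>>>>>>>>>> option is -pc_gamg_threshold 0.05.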
>>>>>>>>>>
>>>>>>>>>> There is some explanation of
>>>>>>>>>> this effect on that page. Let
>>>>>>>>>> us know if setting this does
>>>>>>>>>> not correct the situation.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>> Matt
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Let me know if you need
>>>>>>>>>> further information.
>>>>>>>>>>
>>>>>>>>>> Best,
>>>>>>>>>>
>>>>>>>>>> Myriam Peyrounette
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Myriam Peyrounette
>>>>>>>>>> CNRS/IDRIS - HLST
>>>>>>>>>> --
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> What most experimenters take
>>>>>>>>>> for granted before they begin
>>>>>>>>>> their experiments is
>>>>>>>>>> infinitely more interesting
>>>>>>>>>> than any results to which
>>>>>>>>>> their experiments lead.
>>>>>>>>>> -- Norbert Wiener
>>>>>>>>>>
>>>>>>>>>> https://www.cse.buffalo.edu/~knepley/
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Myriam Peyrounette
>>>>>>>>> CNRS/IDRIS - HLST
>>>>>>>>> --
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> What most experimenters take for
>>>>>>>>> granted before they begin their
>>>>>>>>> experiments is infinitely more
>>>>>>>>> interesting than any results to
>>>>>>>>> which their experiments lead.
>>>>>>>>> -- Norbert Wiener
>>>>>>>>>
>>>>>>>>> https://www.cse.buffalo.edu/~knepley/
>>>>>>>>
>>>>>>>> --
>>>>>>>> Myriam Peyrounette
>>>>>>>> CNRS/IDRIS - HLST
>>>>>>>> --
>>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Myriam Peyrounette
>>>>>>> CNRS/IDRIS - HLST
>>>>>>> --
>>>>>>>
>>>>>>
>>>>>> --
>>>>>> Myriam Peyrounette
>>>>>> CNRS/IDRIS - HLST
>>>>>> --
>>>>>
>>>>> --
>>>>> Myriam Peyrounette
>>>>> CNRS/IDRIS - HLST
>>>>> --
>>>>>
>>>>
>>>> --
>>>> Myriam Peyrounette
>>>> CNRS/IDRIS - HLST
>>>> --
>>>>
>>>>
>>>>
>>>> --
>>>> What most experimenters take for granted before they
>>>> begin their experiments is infinitely more interesting
>>>> than any results to which their experiments lead.
>>>> -- Norbert Wiener
>>>>
>>>> https://www.cse.buffalo.edu/~knepley/
>>>
>>> --
>>> Myriam Peyrounette
>>> CNRS/IDRIS - HLST
>>> --
>>>
>>
>> --
>> Myriam Peyrounette
>> CNRS/IDRIS - HLST
>> --
>>
>
> --
> Myriam Peyrounette
> CNRS/IDRIS - HLST
> --
>
>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which
> their experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
--
Myriam Peyrounette
CNRS/IDRIS - HLST
--