[petsc-users] Bad memory scaling with PETSc 3.10
Myriam Peyrounette
myriam.peyrounette at idris.fr
Tue Mar 26 04:52:37 CDT 2019
How can I be sure they are indeed used? Can I print this information in
some log file?
Thanks in advance
Myriam
On 03/25/19 at 18:24, Matthew Knepley wrote:
> On Mon, Mar 25, 2019 at 10:54 AM Myriam Peyrounette via petsc-users
> <petsc-users at mcs.anl.gov> wrote:
>
> Hi,
>
>     Thanks for the explanations. I tried the latest PETSc version
>     (commit fbc5705bc518d02a4999f188aad4ccff5f754cbf), which includes
>     the patch you mentioned. But the memory scaling shows no
>     improvement (see the attached scaling), even when using the
>     "scalable" options :(
>
> I had a look at the PETSc functions MatPtAPNumeric_MPIAIJ_MPIAIJ
> and MatPtAPSymbolic_MPIAIJ_MPIAIJ (especially at the differences
> before and after the first "bad" commit), but I can't find what
> induced this memory issue.
>
> Are you sure that the option was used? It just looks suspicious to me
> that they use exactly the same amount of memory. It should be
> different, even if it does not solve the problem.
>
> Thanks,
>
> Matt
>
> Myriam
>
>
>
>
> On 03/20/19 at 17:38, Fande Kong wrote:
>> Hi Myriam,
>>
>> There are three algorithms in PETSc to do PtAP (const char
>> *algTypes[3] = {"scalable","nonscalable","hypre"};), and they can
>> be selected with the PETSc option -matptap_via xxxx (see the
>> sketch after this list).
>>
>> (1) -matptap_via hypre: This calls the hypre package to do the
>> PtAP through an all-at-once triple product. In our experience, it
>> is the most memory-efficient, but it can be slow.
>>
>> (2) -matptap_via scalable: This uses a row-wise algorithm
>> plus an outer product. It uses more memory than hypre, but is
>> much faster. It used to have a bug that could exhaust all your
>> memory, and I have a fix
>> at https://bitbucket.org/petsc/petsc/pull-requests/1452/mpiptap-enable-large-scale-simulations/diff.
>> When using this option, you may also want extra options such
>> as -inner_offdiag_matmatmult_via scalable
>> -inner_diag_matmatmult_via scalable to select the scalable
>> inner algorithms.
>>
>> (3) -matptap_via nonscalable: Supposed to be even faster, but
>> uses more memory. It performs dense matrix operations.
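>>
>> A minimal sketch (the helper function below is only an
>> illustration, not part of ex42 or PETSc) of making the same choice
>> in code, equivalent to passing these flags on the command line:
>>
>>     #include <petscsys.h>
>>
>>     /* Put the PtAP options into the global options database so that */
>>     /* they are picked up when the preconditioner is built, exactly  */
>>     /* as if they had been given on the command line.                */
>>     PetscErrorCode SelectScalablePtAP(void)
>>     {
>>       PetscErrorCode ierr;
>>       ierr = PetscOptionsSetValue(NULL,"-matptap_via","scalable");CHKERRQ(ierr);
>>       ierr = PetscOptionsSetValue(NULL,"-inner_diag_matmatmult_via","scalable");CHKERRQ(ierr);
>>       ierr = PetscOptionsSetValue(NULL,"-inner_offdiag_matmatmult_via","scalable");CHKERRQ(ierr);
>>       return 0;
>>     }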
>>
>>
>> Thanks,
>>
>> Fande Kong
>>
>>
>>
>>
>> On Wed, Mar 20, 2019 at 10:06 AM Myriam Peyrounette via
>> petsc-users <petsc-users at mcs.anl.gov> wrote:
>>
>> More precisely: something happens when upgrading the
>> functions MatPtAPNumeric_MPIAIJ_MPIAIJ and/or
>> MatPtAPSymbolic_MPIAIJ_MPIAIJ.
>>
>> Unfortunately, there are a lot of differences between the old
>> and new versions of these functions. I keep investigating but
>> if you have any idea, please let me know.
>>
>> Best,
>>
>> Myriam
>>
>>
>> On 03/20/19 at 13:48, Myriam Peyrounette wrote:
>>>
>>> Hi all,
>>>
>>> I used git bisect to determine when the memory requirement
>>> increased. I found that the first "bad" commit is
>>> aa690a28a7284adb519c28cb44eae20a2c131c85.
>>>
>>> Barry was right: this commit seems to involve a change
>>> to MatPtAPSymbolic_MPIAIJ_MPIAIJ. You mentioned the option
>>> "-matptap_via scalable" but I can't find any information
>>> about it. Can you tell me more?
>>>
>>> Thanks
>>>
>>> Myriam
>>>
>>>
>>> On 03/11/19 at 14:40, Mark Adams wrote:
>>>> Is there a difference in memory usage on your tiny problem?
>>>> I assume no.
>>>>
>>>> I don't see anything that could come from GAMG other than
>>>> the RAP stuff that you have discussed already.
>>>>
>>>> On Mon, Mar 11, 2019 at 9:32 AM Myriam Peyrounette
>>>> <myriam.peyrounette at idris.fr> wrote:
>>>>
>>>> The code I am using here is PETSc example 42
>>>> (https://www.mcs.anl.gov/petsc/petsc-3.9/src/ksp/ksp/examples/tutorials/ex42.c.html),
>>>> which indeed solves the Stokes equations. I thought it was
>>>> a good idea to use an example you might know (and I
>>>> didn't find any that uses the GAMG functions). I only
>>>> changed the PCMG setup so that the memory problem
>>>> appears, and it appears when adding PCGAMG.
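>>>>
>>>> For readers without the attachment, the kind of change meant
>>>> here is sketched below. This is only an illustration under
>>>> assumed names (pc_mg, ierr) and setup, not the code of the
>>>> attached script:
>>>>
>>>>     /* In the PCMG setup: make the coarse level use GAMG       */
>>>>     /* instead of the default coarse-level solver.             */
>>>>     KSP coarse_ksp;
>>>>     PC  coarse_pc;
>>>>     ierr = PCMGGetCoarseSolve(pc_mg, &coarse_ksp);CHKERRQ(ierr);
>>>>     ierr = KSPGetPC(coarse_ksp, &coarse_pc);CHKERRQ(ierr);
>>>>     ierr = PCSetType(coarse_pc, PCGAMG);CHKERRQ(ierr);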
>>>>
>>>> I don't care about the performance or even the correctness
>>>> of the result here, but only about the difference in memory
>>>> use between 3.6 and 3.10. Do you think finding a more
>>>> suitable script would help?
>>>>
>>>> I used the threshold of 0.1 only once, at the
>>>> beginning, to test its influence. I used the default
>>>> threshold (of 0, I guess) for all the other runs.
>>>>
>>>> Myriam
>>>>
>>>>
>>>> On 03/11/19 at 13:52, Mark Adams wrote:
>>>>> In looking at this larger scale run ...
>>>>>
>>>>> * Your eigen estimates are much lower than on your tiny
>>>>> test problem. But this is apparently Stokes, and it
>>>>> should not work anyway. Maybe you have a small time
>>>>> step that adds a lot of mass and brings the eigen
>>>>> estimates down. And your minimum eigenvalue (not used) is
>>>>> positive; I would expect a negative one for Stokes ...
>>>>>
>>>>> * You seem to be setting a threshold value of 0.1 --
>>>>> that is very high
>>>>>
>>>>> * v3.6 says "using nonzero initial guess" but this is
>>>>> not in v3.10. Maybe we just stopped printing that.
>>>>>
>>>>> * There were some changes to coarsening parameters in
>>>>> going from v3.6, but it does not look like your problem
>>>>> was affected. (The coarsening algorithm is
>>>>> non-deterministic by default, so you can see small
>>>>> differences between runs.)
>>>>>
>>>>> * We may have also added a "noisy" RHS for the eigen
>>>>> estimates by default since v3.6.
>>>>>
>>>>> * And for non-symmetric problems you can try
>>>>> -pc_gamg_agg_nsmooths 0, but again GAMG is not built
>>>>> for Stokes anyway.
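>>>>>
>>>>> For reference, a sketch of the equivalent API call (pc is
>>>>> assumed here to be the GAMG PC object):
>>>>>
>>>>>     /* Use unsmoothed (plain) aggregation; this is the same as */
>>>>>     /* passing -pc_gamg_agg_nsmooths 0 on the command line.    */
>>>>>     ierr = PCGAMGSetNSmooths(pc, 0);CHKERRQ(ierr);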
>>>>>
>>>>>
>>>>> On Tue, Mar 5, 2019 at 11:53 AM Myriam Peyrounette
>>>>> <myriam.peyrounette at idris.fr> wrote:
>>>>>
>>>>> I used PCView to display the size of the linear
>>>>> system at each level of the MG. You'll find the
>>>>> outputs attached to this mail (zip file) for both
>>>>> the default threshold value and a value of 0.1,
>>>>> and for both PETSc versions 3.6 and 3.10.
>>>>>
>>>>> For convenience, I summarized the information in a
>>>>> graph, also attached (png file).
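>>>>>
>>>>> For reference, the outputs were produced with a call of the
>>>>> form below (a sketch only; the exact placement in the script
>>>>> is not shown here):
>>>>>
>>>>>     /* After the solver is set up, print the whole PC          */
>>>>>     /* hierarchy, including the operator size on each MG       */
>>>>>     /* level, to standard output.                              */
>>>>>     ierr = PCView(pc, PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr);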
>>>>>
>>>>> As you can see, there are slight differences
>>>>> between the two versions but none is critical, in
>>>>> my opinion. Do you see anything suspicious in the
>>>>> outputs?
>>>>>
>>>>> Also, I can't find the default threshold value. Do you
>>>>> know where I can find it?
>>>>>
>>>>> Thanks for the follow-up
>>>>>
>>>>> Myriam
>>>>>
>>>>>
>>>>> On 03/05/19 at 14:06, Matthew Knepley wrote:
>>>>>> On Tue, Mar 5, 2019 at 7:14 AM Myriam Peyrounette
>>>>>> <myriam.peyrounette at idris.fr> wrote:
>>>>>>
>>>>>> Hi Matt,
>>>>>>
>>>>>> I plotted the memory scalings using different
>>>>>> threshold values. The two scalings are
>>>>>> slightly shifted (by -22 to -88 MB) but
>>>>>> this gain is negligible. The 3.6 scaling
>>>>>> remains robust while the 3.10 scaling
>>>>>> deteriorates.
>>>>>>
>>>>>> Do you have any other suggestion?
>>>>>>
>>>>>> Mark, what is the option she can give to output
>>>>>> all the GAMG data?
>>>>>>
>>>>>> Also, run using -ksp_view. GAMG will report all
>>>>>> the sizes of its grids, so it should be easy to see
>>>>>> if the coarse grid sizes are increasing, and also
>>>>>> what the effect of the threshold value is.
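>>>>>>
>>>>>> A sketch of the programmatic equivalent, in case the
>>>>>> command-line flag is inconvenient (ksp is assumed to be the
>>>>>> outer solver object):
>>>>>>
>>>>>>     /* Print the full solver configuration after the solve,   */
>>>>>>     /* including the grid sizes of every GAMG level.          */
>>>>>>     ierr = KSPView(ksp, PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr);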
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Matt
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> Myriam
>>>>>>
>>>>>> On 03/02/19 at 02:27, Matthew Knepley wrote:
>>>>>>> On Fri, Mar 1, 2019 at 10:53 AM Myriam
>>>>>>> Peyrounette via petsc-users
>>>>>>> <petsc-users at mcs.anl.gov> wrote:
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I used to run my code with PETSc 3.6.
>>>>>>> Since I upgraded PETSc
>>>>>>> to 3.10, this code has shown bad memory scaling.
>>>>>>>
>>>>>>> To report this issue, I took the PETSc
>>>>>>> script ex42.c and slightly
>>>>>>> modified it so that the KSP and PC
>>>>>>> configurations are the same as in my
>>>>>>> code. In particular, I use a
>>>>>>> customized multigrid method. The
>>>>>>> modifications are indicated by the
>>>>>>> keyword "TopBridge" in the attached
>>>>>>> scripts.
>>>>>>>
>>>>>>> To plot the memory (weak) scaling, I ran
>>>>>>> four calculations for each
>>>>>>> script, with increasing problem sizes and
>>>>>>> numbers of compute cores:
>>>>>>>
>>>>>>> 1. 100,000 elements on 4 cores
>>>>>>> 2. 1 million elements on 40 cores
>>>>>>> 3. 10 million elements on 400 cores
>>>>>>> 4. 100 million elements on 4,000 cores
>>>>>>>
>>>>>>> The resulting graph is also attached.
>>>>>>> The scaling using PETSc 3.10
>>>>>>> clearly deteriorates for large cases,
>>>>>>> while the one using PETSc 3.6 is
>>>>>>> robust.
>>>>>>>
>>>>>>> After a few tests, I found that the
>>>>>>> scaling is mostly sensitive to the
>>>>>>> use of the AMG method for the coarse
>>>>>>> grid (line 1780 in
>>>>>>> main_ex42_petsc36.cc). In particular,
>>>>>>> the performance deteriorates strongly
>>>>>>> when lines 1777
>>>>>>> to 1790 are commented out (in main_ex42_petsc36.cc).
>>>>>>>
>>>>>>> Do you have any idea of what changed
>>>>>>> between version 3.6 and version
>>>>>>> 3.10 that might cause such a degradation?
>>>>>>>
>>>>>>>
>>>>>>> I believe the default values for PCGAMG
>>>>>>> changed between versions. It sounds like the
>>>>>>> coarsening rate
>>>>>>> is not high enough, so these grids are
>>>>>>> too large. This can be set using:
>>>>>>>
>>>>>>> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCGAMGSetThreshold.html
>>>>>>>
>>>>>>> There is some explanation of this effect on
>>>>>>> that page. Let us know if setting this does
>>>>>>> not correct the situation.
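>>>>>>>
>>>>>>> A sketch of what setting it could look like, assuming the
>>>>>>> current interface where a per-level array is passed (gamg_pc
>>>>>>> is assumed to be the GAMG PC object; -pc_gamg_threshold
>>>>>>> <value> on the command line is the equivalent, and 0.05 is
>>>>>>> only an example value):
>>>>>>>
>>>>>>>     /* Filter entries below this relative threshold from the */
>>>>>>>     /* graph used for aggregation; the man page above        */
>>>>>>>     /* explains how this changes the coarsening rate.        */
>>>>>>>     PetscReal thr[1] = {0.05};
>>>>>>>     ierr = PCGAMGSetThreshold(gamg_pc, thr, 1);CHKERRQ(ierr);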
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Matt
>>>>>>>
>>>>>>>
>>>>>>> Let me know if you need further information.
>>>>>>>
>>>>>>> Best,
>>>>>>>
>>>>>>> Myriam Peyrounette
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Myriam Peyrounette
>>>>>>> CNRS/IDRIS - HLST
>>>>>>> --
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> What most experimenters take for granted
>>>>>>> before they begin their experiments is
>>>>>>> infinitely more interesting than any results
>>>>>>> to which their experiments lead.
>>>>>>> -- Norbert Wiener
>>>>>>>
>>>>>>> https://www.cse.buffalo.edu/~knepley/
>>>>>>
>>>>>> --
>>>>>> Myriam Peyrounette
>>>>>> CNRS/IDRIS - HLST
>>>>>> --
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> What most experimenters take for granted before
>>>>>> they begin their experiments is infinitely more
>>>>>> interesting than any results to which their
>>>>>> experiments lead.
>>>>>> -- Norbert Wiener
>>>>>>
>>>>>> https://www.cse.buffalo.edu/~knepley/
>>>>>
>>>>> --
>>>>> Myriam Peyrounette
>>>>> CNRS/IDRIS - HLST
>>>>> --
>>>>>
>>>>
>>>> --
>>>> Myriam Peyrounette
>>>> CNRS/IDRIS - HLST
>>>> --
>>>>
>>>
>>> --
>>> Myriam Peyrounette
>>> CNRS/IDRIS - HLST
>>> --
>>
>> --
>> Myriam Peyrounette
>> CNRS/IDRIS - HLST
>> --
>>
>
> --
> Myriam Peyrounette
> CNRS/IDRIS - HLST
> --
>
>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which
> their experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
--
Myriam Peyrounette
CNRS/IDRIS - HLST
--