[petsc-users] Bad memory scaling with PETSc 3.10
Myriam Peyrounette
myriam.peyrounette at idris.fr
Tue Mar 26 05:35:56 CDT 2019
Oh, you were right: the three options are unused (-matptap_via scalable,
-inner_offdiag_matmatmult_via scalable and -inner_diag_matmatmult_via
scalable). Does this mean I am not using the associated PtAP functions?
Myriam
On 03/26/19 at 11:10, Dave May wrote:
>
> On Tue, 26 Mar 2019 at 09:52, Myriam Peyrounette via petsc-users
> <petsc-users at mcs.anl.gov> wrote:
>
> How can I be sure they are indeed used? Can I print this
> information in some log file?
>
> Yes. Re-run the job with the command line option
>
> -options_left true
>
> This will report all options parsed, and importantly, will also
> indicate if any options were unused.
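>
> For example (a minimal sketch; the executable name and process
> count are placeholders):
>
>     mpiexec -n 4 ./ex42 -matptap_via scalable -options_left true
>
> Unused options are reported when PETSc is finalized.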
>
>
> Thanks
> Dave
>
> Thanks in advance
>
> Myriam
>
>
>> On 03/25/19 at 18:24, Matthew Knepley wrote:
>> On Mon, Mar 25, 2019 at 10:54 AM Myriam Peyrounette via
>> petsc-users <petsc-users at mcs.anl.gov> wrote:
>>
>> Hi,
>>
>> Thanks for the explanations. I tried the latest PETSc version
>> (commit fbc5705bc518d02a4999f188aad4ccff5f754cbf), which
>> includes the patch you talked about. But the memory scaling
>> shows no improvement (see scaling attached), even when using
>> the "scalable" options :(
>>
>> I had a look at the PETSc functions
>> MatPtAPNumeric_MPIAIJ_MPIAIJ and
>> MatPtAPSymbolic_MPIAIJ_MPIAIJ (especially at the differences
>> before and after the first "bad" commit), but I can't find
>> what induced this memory issue.
>>
>> Are you sure that the option was used? It just looks suspicious
>> to me that they use exactly the same amount of memory. It should
>> be different, even if it does not solve the problem.
>>
>> Thanks,
>>
>> Matt
>>
>> Myriam
>>
>>
>>
>>
>> On 03/20/19 at 17:38, Fande Kong wrote:
>>> Hi Myriam,
>>>
>>> There are three algorithms in PETSc to do PtAP (const char
>>> *algTypes[3] = {"scalable","nonscalable","hypre"};), and they
>>> can be selected using the PETSc option: -matptap_via xxxx.
>>>
>>> (1) -matptap_via hypre: This calls the hypre package to do
>>> the PtAP through an all-at-once triple product. In our
>>> experience, it is the most memory-efficient, but it can be slow.
>>>
>>> (2) -matptap_via scalable: This involves a row-wise
>>> algorithm plus an outer product. It uses more memory
>>> than hypre, but is much faster. It used to have a bug that
>>> could consume all your memory, and I have a fix
>>> at https://bitbucket.org/petsc/petsc/pull-requests/1452/mpiptap-enable-large-scale-simulations/diff.
>>> When using this option, you may want to add extra options
>>> such as -inner_offdiag_matmatmult_via scalable and
>>> -inner_diag_matmatmult_via scalable to select scalable
>>> inner algorithms (see the sample command line below).
>>>
>>> (3) -matptap_via nonscalable: Supposed to be even faster,
>>> but it uses more memory. It performs dense matrix operations.
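>>>
>>> As a sample command line (a sketch; the executable name and
>>> process count are placeholders):
>>>
>>>     mpiexec -n 400 ./ex42 -matptap_via scalable \
>>>       -inner_diag_matmatmult_via scalable \
>>>       -inner_offdiag_matmatmult_via scalable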
>>>
>>>
>>> Thanks,
>>>
>>> Fande Kong
>>>
>>>
>>>
>>>
>>> On Wed, Mar 20, 2019 at 10:06 AM Myriam Peyrounette via
>>> petsc-users <petsc-users at mcs.anl.gov> wrote:
>>>
>>> More precisely: something happens when upgrading the
>>> functions MatPtAPNumeric_MPIAIJ_MPIAIJ and/or
>>> MatPtAPSymbolic_MPIAIJ_MPIAIJ.
>>>
>>> Unfortunately, there are a lot of differences between
>>> the old and new versions of these functions. I'll keep
>>> investigating, but if you have any idea, please let me know.
>>>
>>> Best,
>>>
>>> Myriam
>>>
>>>
>>> On 03/20/19 at 13:48, Myriam Peyrounette wrote:
>>>>
>>>> Hi all,
>>>>
>>>> I used git bisect to determine when the memory usage
>>>> increased. I found that the first "bad" commit is
>>>> aa690a28a7284adb519c28cb44eae20a2c131c85.
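>>>>
>>>> For reference, the bisect went roughly along these lines (a
>>>> sketch; the good/bad endpoints shown are assumptions):
>>>>
>>>>     git bisect start
>>>>     git bisect bad master
>>>>     git bisect good v3.6
>>>>     # at each step: rebuild, re-run the memory test, then mark
>>>>     git bisect good    # or: git bisect bad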
>>>>
>>>> Barry was right: this commit seems to involve a change
>>>> to MatPtAPSymbolic_MPIAIJ_MPIAIJ. You
>>>> mentioned the option "-matptap_via scalable" but I
>>>> can't find any information about it. Can you tell me more?
>>>>
>>>> Thanks
>>>>
>>>> Myriam
>>>>
>>>>
>>>> On 03/11/19 at 14:40, Mark Adams wrote:
>>>>> Is there a difference in memory usage on your tiny
>>>>> problem? I assume not.
>>>>>
>>>>> I don't see anything that could come from GAMG other
>>>>> than the RAP stuff that you have discussed already.
>>>>>
>>>>> On Mon, Mar 11, 2019 at 9:32 AM Myriam Peyrounette
>>>>> <myriam.peyrounette at idris.fr> wrote:
>>>>>
>>>>> The code I am using here is example 42 of PETSc
>>>>> (https://www.mcs.anl.gov/petsc/petsc-3.9/src/ksp/ksp/examples/tutorials/ex42.c.html).
>>>>> Indeed, it solves the Stokes equation. I thought it
>>>>> was a good idea to use an example you might know
>>>>> (and I didn't find any that uses GAMG functions). I
>>>>> just changed the PCMG setup so that the memory
>>>>> problem appears, and it appears when adding PCGAMG.
>>>>>
>>>>> I don't care about the performance or even the
>>>>> correctness of the result here, but only about the
>>>>> difference in memory use between 3.6 and 3.10. Do
>>>>> you think finding a more suitable script would help?
>>>>>
>>>>> I used a threshold of 0.1 only once, at the
>>>>> beginning, to test its influence. I used the
>>>>> default threshold (of 0, I guess) for all the
>>>>> other runs.
>>>>>
>>>>> Myriam
>>>>>
>>>>>
>>>>> On 03/11/19 at 13:52, Mark Adams wrote:
>>>>>> In looking at this larger scale run ...
>>>>>>
>>>>>> * Your eigen estimates are much lower than on your
>>>>>> tiny test problem. But this is Stokes apparently,
>>>>>> and it should not work anyway. Maybe you have a
>>>>>> small time step that adds a lot of mass and
>>>>>> brings the eigen estimates down. And your min
>>>>>> eigenvalue (not used) is positive; I would expect
>>>>>> negative for Stokes ...
>>>>>>
>>>>>> * You seem to be setting a threshold value of 0.1
>>>>>> -- that is very high
>>>>>>
>>>>>> * v3.6 says "using nonzero initial guess" but
>>>>>> this is not in v3.10. Maybe we just stopped
>>>>>> printing that.
>>>>>>
>>>>>> * There were some changes to coarsening parameters
>>>>>> in going from v3.6, but it does not look like your
>>>>>> problem was affected. (The coarsening algorithm is
>>>>>> non-deterministic by default, so you can see small
>>>>>> differences between runs.)
>>>>>>
>>>>>> * We may have also added a "noisy" RHS for eigen
>>>>>> estimates by default since v3.6.
>>>>>>
>>>>>> * And for non-symmetric problems you can try
>>>>>> -pc_gamg_agg_nsmooths 0, but again GAMG is not
>>>>>> built for Stokes anyway.
>>>>>>
>>>>>>
>>>>>> On Tue, Mar 5, 2019 at 11:53 AM Myriam
>>>>>> Peyrounette <myriam.peyrounette at idris.fr> wrote:
>>>>>>
>>>>>> I used PCView to display the size of the
>>>>>> linear system at each level of the MG. You'll
>>>>>> find the outputs attached to this mail (zip
>>>>>> file) for both the default threshold value
>>>>>> and a value of 0.1, and for both PETSc
>>>>>> versions 3.6 and 3.10.
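>>>>>>
>>>>>> (A minimal sketch of the call that produces this output,
>>>>>> assuming an existing KSP named ksp:)
>>>>>>
>>>>>>     PC pc;
>>>>>>     KSPGetPC(ksp, &pc);   /* extract the preconditioner */
>>>>>>     PCView(pc, PETSC_VIEWER_STDOUT_WORLD);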
>>>>>>
>>>>>> For convenience, I summarized the information
>>>>>> in a graph, also attached (png file).
>>>>>>
>>>>>> As you can see, there are slight differences
>>>>>> between the two versions but none is
>>>>>> critical, in my opinion. Do you see anything
>>>>>> suspicious in the outputs?
>>>>>>
>>>>>> Also, I can't find the default threshold value.
>>>>>> Do you know where I can find it?
>>>>>>
>>>>>> Thanks for the follow-up
>>>>>>
>>>>>> Myriam
>>>>>>
>>>>>>
>>>>>> On 03/05/19 at 14:06, Matthew Knepley wrote:
>>>>>>> On Tue, Mar 5, 2019 at 7:14 AM Myriam
>>>>>>> Peyrounette <myriam.peyrounette at idris.fr> wrote:
>>>>>>>
>>>>>>> Hi Matt,
>>>>>>>
>>>>>>> I plotted the memory scalings using
>>>>>>> different threshold values. The two
>>>>>>> scalings are slightly shifted (by -22
>>>>>>> to -88 MB), but this gain is
>>>>>>> negligible. The 3.6 scaling remains
>>>>>>> robust while the 3.10 scaling deteriorates.
>>>>>>>
>>>>>>> Do you have any other suggestion?
>>>>>>>
>>>>>>> Mark, what is the option she can give to
>>>>>>> output all the GAMG data?
>>>>>>>
>>>>>>> Also, run using -ksp_view. GAMG will report
>>>>>>> all the sizes of its grids, so it should be
>>>>>>> easy to see
>>>>>>> if the coarse grid sizes are increasing, and
>>>>>>> also what the effect of the threshold value is.
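>>>>>>>
>>>>>>> For example (the executable name is a placeholder):
>>>>>>>
>>>>>>>     mpiexec -n 4 ./ex42 -ksp_view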
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Matt
>>>>>>>
>>>>>>> Thanks
>>>>>>>
>>>>>>> Myriam
>>>>>>>
>>>>>>> On 03/02/19 at 02:27, Matthew Knepley wrote:
>>>>>>>> On Fri, Mar 1, 2019 at 10:53 AM Myriam
>>>>>>>> Peyrounette via petsc-users
>>>>>>>> <petsc-users at mcs.anl.gov> wrote:
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I used to run my code with PETSc
>>>>>>>> 3.6. Since I upgraded the PETSc version
>>>>>>>> to 3.10, this code has shown bad memory
>>>>>>>> scaling.
>>>>>>>>
>>>>>>>> To report this issue, I took the
>>>>>>>> PETSc script ex42.c and slightly
>>>>>>>> modified it so that the KSP and PC
>>>>>>>> configurations are the same as in my
>>>>>>>> code. In particular, I use a
>>>>>>>> "personalized" multi-grid method. The
>>>>>>>> modifications are indicated by the
>>>>>>>> keyword "TopBridge" in the attached
>>>>>>>> scripts.
>>>>>>>>
>>>>>>>> To plot the memory (weak) scaling,
>>>>>>>> I ran four calculations for each
>>>>>>>> script with increasing problem
>>>>>>>> sizes and core counts:
>>>>>>>>
>>>>>>>> 1. 100,000 elts on 4 cores
>>>>>>>> 2. 1 million elts on 40 cores
>>>>>>>> 3. 10 million elts on 400 cores
>>>>>>>> 4. 100 million elts on 4,000 cores
>>>>>>>>
>>>>>>>> The resulting graph is also
>>>>>>>> attached. The scaling using PETSc 3.10
>>>>>>>> clearly deteriorates for large
>>>>>>>> cases, while the one using PETSc 3.6 is
>>>>>>>> robust.
>>>>>>>>
>>>>>>>> After a few tests, I found that the
>>>>>>>> scaling is mostly sensitive to the
>>>>>>>> use of the AMG method for the
>>>>>>>> coarse grid (line 1780 in
>>>>>>>> main_ex42_petsc36.cc). In
>>>>>>>> particular, the performance strongly
>>>>>>>> deteriorates when commenting out lines
>>>>>>>> 1777 to 1790 (in main_ex42_petsc36.cc).
>>>>>>>>
>>>>>>>> Do you have any idea of what
>>>>>>>> changed between versions 3.6 and
>>>>>>>> 3.10 that may cause such degradation?
>>>>>>>>
>>>>>>>>
>>>>>>>> I believe the default values for PCGAMG
>>>>>>>> changed between versions. It sounds
>>>>>>>> like the coarsening rate
>>>>>>>> is not high enough, so these
>>>>>>>> grids are too large. This can be set using:
>>>>>>>>
>>>>>>>> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCGAMGSetThreshold.html
>>>>>>>>
>>>>>>>> There is some explanation of this
>>>>>>>> effect on that page. Let us know if
>>>>>>>> setting this does not correct the
>>>>>>>> situation.
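>>>>>>>>
>>>>>>>> For example, on the command line (the value 0.05 is
>>>>>>>> purely illustrative):
>>>>>>>>
>>>>>>>>     -pc_gamg_threshold 0.05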
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Matt
>>>>>>>>
>>>>>>>>
>>>>>>>> Let me know if you need further
>>>>>>>> information.
>>>>>>>>
>>>>>>>> Best,
>>>>>>>>
>>>>>>>> Myriam Peyrounette
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Myriam Peyrounette
>>>>>>>> CNRS/IDRIS - HLST
>>>>>>>> --
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> What most experimenters take for
>>>>>>>> granted before they begin their
>>>>>>>> experiments is infinitely more
>>>>>>>> interesting than any results to which
>>>>>>>> their experiments lead.
>>>>>>>> -- Norbert Wiener
>>>>>>>>
>>>>>>>> https://www.cse.buffalo.edu/~knepley/
>>>>>>>
>>>>>>> --
>>>>>>> Myriam Peyrounette
>>>>>>> CNRS/IDRIS - HLST
>>>>>>> --
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> What most experimenters take for granted
>>>>>>> before they begin their experiments is
>>>>>>> infinitely more interesting than any results
>>>>>>> to which their experiments lead.
>>>>>>> -- Norbert Wiener
>>>>>>>
>>>>>>> https://www.cse.buffalo.edu/~knepley/
>>>>>>
>>>>>> --
>>>>>> Myriam Peyrounette
>>>>>> CNRS/IDRIS - HLST
>>>>>> --
>>>>>>
>>>>>
>>>>> --
>>>>> Myriam Peyrounette
>>>>> CNRS/IDRIS - HLST
>>>>> --
>>>>>
>>>>
>>>> --
>>>> Myriam Peyrounette
>>>> CNRS/IDRIS - HLST
>>>> --
>>>
>>> --
>>> Myriam Peyrounette
>>> CNRS/IDRIS - HLST
>>> --
>>>
>>
>> --
>> Myriam Peyrounette
>> CNRS/IDRIS - HLST
>> --
>>
>>
>>
>> --
>> What most experimenters take for granted before they begin their
>> experiments is infinitely more interesting than any results to
>> which their experiments lead.
>> -- Norbert Wiener
>>
>> https://www.cse.buffalo.edu/~knepley/
>
> --
> Myriam Peyrounette
> CNRS/IDRIS - HLST
> --
>
--
Myriam Peyrounette
CNRS/IDRIS - HLST
--