[petsc-users] Bad memory scaling with PETSc 3.10

Tue Mar 26 08:30:27 CDT 2019

On Tue, Mar 26, 2019 at 9:27 AM Myriam Peyrounette <
myriam.peyrounette at idris.fr> wrote:

> I checked with -ksp_view (attached) but no prefix is associated with the
> matrix. Some are associated with the KSP and PC, but none with the Mat.
>
Another thing that could prevent options being used is that
*SetFromOptions() is not called for the object.
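
To make this concrete, here is a toy model in plain Python. It is not PETSc code, and OptionsDB and ToyMat are hypothetical names invented for the sketch. It assumes only what is said in this thread: an option takes effect when an object asks for it under its own prefix, and anything nobody asked for is reported as unused.

```python
# Toy model (not PETSc code): options are consumed only when an object
# explicitly asks for them, looking them up under its own prefix.
class OptionsDB:
    def __init__(self, options):
        self.options = dict(options)  # e.g. {"matptap_via": "scalable"}
        self.used = set()

    def get(self, prefix, name):
        """Look up prefix+name; remember the key as used if it is present."""
        key = prefix + name
        if key in self.options:
            self.used.add(key)
            return self.options[key]
        return None

    def left(self):
        """Options nobody asked for, similar in spirit to -options_left."""
        return sorted(set(self.options) - self.used)


class ToyMat:
    def __init__(self, prefix=""):
        self.prefix = prefix
        self.ptap_via = "default"

    def set_from_options(self, db):
        value = db.get(self.prefix, "matptap_via")
        if value is not None:
            self.ptap_via = value


db = OptionsDB({"matptap_via": "scalable"})

never_configured = ToyMat()      # set_from_options() is never called
print(db.left())                 # the option is still listed as unused

configured = ToyMat()
configured.set_from_options(db)  # now the option is consumed
print(configured.ptap_via, db.left())
```

In this model an option can stay unused for two reasons: the object never reads the database, or it reads it under a different prefix than the one given on the command line.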

  Thanks,

     Matt

> On 03/26/19 at 11:55, Dave May wrote:
>
>
>
> On Tue, 26 Mar 2019 at 10:36, Myriam Peyrounette <
> myriam.peyrounette at idris.fr> wrote:
>
>> Oh you were right, the three options are unused (-matptap_via scalable,
>> -inner_offdiag_matmatmult_via scalable and -inner_diag_matmatmult_via
>> scalable). Does this mean I am not using the associated PtAP functions?
>>
>
> No - not necessarily. All it means is the options were not parsed.
>
> If your matrices have an option prefix associated with them (e.g. abc),
> then you need to provide the option as
>   -abc_matptap_via scalable
>
> If you are not sure whether your matrices have a prefix, look at the result of
> -ksp_view (see below for an example)
>
>   Mat Object: 2 MPI processes
>     type: mpiaij
>     rows=363, cols=363, bs=3
>     total: nonzeros=8649, allocated nonzeros=8649
>     total number of mallocs used during MatSetValues calls =0
>
>   Mat Object: (B_) 2 MPI processes
>     type: mpiaij
>     rows=363, cols=363, bs=3
>     total: nonzeros=8649, allocated nonzeros=8649
>     total number of mallocs used during MatSetValues calls =0
>
> The first matrix has no options prefix, but the second does and it's
> called "B_".
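
For a long -ksp_view log, the same check can be scripted. This is a minimal sketch in plain Python, assuming the "Mat Object: (prefix) N MPI processes" line format shown in the sample above; mat_prefixes and prefixed_option are hypothetical helper names, and other PETSc versions may print these lines differently.

```python
import re

# Matches lines such as "Mat Object: (B_) 2 MPI processes" and
# "Mat Object: 2 MPI processes"; the parenthesised prefix is optional.
MAT_LINE = re.compile(r"^\s*Mat Object:\s*(?:\(([^)]*)\)\s*)?\d+ MPI process")


def mat_prefixes(ksp_view_text):
    """Return one entry per Mat in the output: its prefix, or "" if none."""
    prefixes = []
    for line in ksp_view_text.splitlines():
        match = MAT_LINE.match(line)
        if match:
            prefixes.append(match.group(1) or "")
    return prefixes


def prefixed_option(prefix, name, value):
    """Build the command-line form of an option for a given prefix."""
    return f"-{prefix}{name} {value}"


sample = """\
  Mat Object: 2 MPI processes
    type: mpiaij
  Mat Object: (B_) 2 MPI processes
    type: mpiaij
"""

for prefix in mat_prefixes(sample):
    print(prefixed_option(prefix, "matptap_via", "scalable"))
```

The prefix is concatenated as printed, so a prefix shown as "B_" already carries its trailing underscore.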
>
>
>
>
>
>> Myriam
>>
>> On 03/26/19 at 11:10, Dave May wrote:
>>
>>
>> On Tue, 26 Mar 2019 at 09:52, Myriam Peyrounette via petsc-users <
>> petsc-users at mcs.anl.gov> wrote:
>>
>>> How can I be sure they are indeed used? Can I print this information in
>>> some log file?
>>>
>> Yes. Re-run the job with the command line option
>>
>> -options_left true
>>
>> This will report all options parsed, and importantly, will also indicate
>> if any options were unused.
>>
>>
>> Thanks
>> Dave
>>
>> Thanks in advance
>>>
>>> Myriam
>>>
>>> On 03/25/19 at 18:24, Matthew Knepley wrote:
>>>
>>> On Mon, Mar 25, 2019 at 10:54 AM Myriam Peyrounette via petsc-users <
>>> petsc-users at mcs.anl.gov> wrote:
>>>
>>>> Hi,
>>>>
>>>> thanks for the explanations. I tried the last PETSc version (commit
>>>> fbc5705bc518d02a4999f188aad4ccff5f754cbf), which includes the patch you
>>>> talked about. But the memory scaling shows no improvement (see scaling
>>>> attached), even when using the "scalable" options :(
>>>>
>>>> I had a look at the PETSc functions MatPtAPNumeric_MPIAIJ_MPIAIJ and
>>>> MatPtAPSymbolic_MPIAIJ_MPIAIJ (especially at the differences before and
>>>> after the first "bad" commit), but I can't find what induced this memory
>>>> issue.
>>>>
>>> Are you sure that the option was used? It just looks suspicious to me
>>> that they use exactly the same amount of memory. It should be different,
>>> even if it does not solve the problem.
>>>
>>>    Thanks,
>>>
>>>      Matt
>>>
>>>> Myriam
>>>>
>>>>
>>>>
>>>>
>>>> On 03/20/19 at 17:38, Fande Kong wrote:
>>>>
>>>> Hi Myriam,
>>>>
>>>> There are three algorithms in PETSc to do PtAP ( const char
>>>> *algTypes[3] = {"scalable","nonscalable","hypre"};), and one can be selected
>>>> using the PETSc option -matptap_via xxxx.
>>>>
>>>> (1) -matptap_via hypre: This calls the hypre package to do the PtAP
>>>> through an all-at-once triple product. In our experience, it is the most
>>>> memory efficient, but can be slow.
>>>>
>>>> (2) -matptap_via scalable: This involves a row-wise algorithm plus an
>>>> outer product. It uses more memory than hypre, but is much faster. It
>>>> used to have a bug that could take all your memory, and I have a fix at
>>>> https://bitbucket.org/petsc/petsc/pull-requests/1452/mpiptap-enable-large-scale-simulations/diff.
>>>> When using this option, we may want to add extra options such as
>>>> -inner_offdiag_matmatmult_via scalable -inner_diag_matmatmult_via
>>>> scalable to select the inner scalable algorithms.
>>>>
>>>> (3) -matptap_via nonscalable: Supposed to be even faster, but uses more
>>>> memory. It does dense matrix operations.
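
When comparing the three choices across several runs, it can help to generate the option list rather than type it each time. This is a small sketch in plain Python; ptap_options is a hypothetical helper, the option names are the ones quoted in this thread, and no options prefix is applied (an assumption: if the matrices carry a prefix, the option names would need adjusting).

```python
# Algorithm names as quoted from algTypes in this thread.
PTAP_ALGORITHMS = ("scalable", "nonscalable", "hypre")


def ptap_options(algorithm, inner_scalable=False):
    """Return the extra command-line arguments for one PtAP algorithm.

    inner_scalable only has an effect together with "scalable", matching
    the suggestion to add the inner options when using that algorithm.
    """
    if algorithm not in PTAP_ALGORITHMS:
        raise ValueError(f"unknown PtAP algorithm: {algorithm!r}")
    args = ["-matptap_via", algorithm]
    if algorithm == "scalable" and inner_scalable:
        args += ["-inner_offdiag_matmatmult_via", "scalable",
                 "-inner_diag_matmatmult_via", "scalable"]
    # Ask PETSc to report unused options, so a typo or a missing prefix
    # does not go unnoticed.
    args += ["-options_left", "true"]
    return args


for name in PTAP_ALGORITHMS:
    print(" ".join(ptap_options(name, inner_scalable=True)))
```

The returned list can be appended to whatever launch command the job script already uses.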
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Fande Kong
>>>>
>>>>
>>>>
>>>>
>>>> On Wed, Mar 20, 2019 at 10:06 AM Myriam Peyrounette via petsc-users <
>>>> petsc-users at mcs.anl.gov> wrote:
>>>>
>>>>> More precisely: something happens when upgrading the functions
>>>>> MatPtAPNumeric_MPIAIJ_MPIAIJ and/or MatPtAPSymbolic_MPIAIJ_MPIAIJ.
>>>>>
>>>>> Unfortunately, there are a lot of differences between the old and new
>>>>> versions of these functions. I keep investigating but if you have any idea,
>>>>> please let me know.
>>>>>
>>>>> Best,
>>>>>
>>>>> Myriam
>>>>>
>>>>> On 03/20/19 at 13:48, Myriam Peyrounette wrote:
>>>>>
>>>>> Hi all,
>>>>>
>>>>> I used git bisect to determine when the memory requirement increased. I found
>>>>> that the first "bad" commit is aa690a28a7284adb519c28cb44eae20a2c131c85.
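
For readers who have not used git bisect: it is a binary search over the commit history between a known-good and a known-bad commit. The sketch below shows the idea in plain Python with made-up commit labels (c0 to c15) and a hypothetical is_bad test; it assumes that every commit after the first bad one is also bad, and it is not how git itself is implemented.

```python
def first_bad(commits, is_bad):
    """Find the first bad commit by binary search.

    commits is ordered oldest to newest, commits[0] is known good and
    commits[-1] is known bad.
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid    # the regression is at mid or earlier
        else:
            good = mid   # the regression is after mid
    return commits[bad]


# Hypothetical history of 16 commits where the regression enters at c11.
commits = [f"c{i}" for i in range(16)]
regression_at = 11
tested = []


def is_bad(commit):
    """Stand-in for 'build this commit, run the case, check memory use'."""
    tested.append(commit)
    return int(commit[1:]) >= regression_at


print(first_bad(commits, is_bad), "found after", len(tested), "tests")
```

The number of builds grows with the logarithm of the number of commits, which is what makes bisecting practical across several releases.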
>>>>>
>>>>> Barry was right, this commit seems to be about a change to MatPtAPSymbolic_MPIAIJ_MPIAIJ.
>>>>> You mentioned the option "-matptap_via scalable" but I can't find any
>>>>> information about it. Can you tell me more?
>>>>>
>>>>> Thanks
>>>>>
>>>>> Myriam
>>>>>
>>>>>
>>>>> On 03/11/19 at 14:40, Mark Adams wrote:
>>>>>
>>>>> Is there a difference in memory usage on your tiny problem? I assume
>>>>> no.
>>>>>
>>>>> I don't see anything that could come from GAMG other than the RAP
>>>>> stuff that you have discussed already.
>>>>>
>>>>> On Mon, Mar 11, 2019 at 9:32 AM Myriam Peyrounette <
>>>>> myriam.peyrounette at idris.fr> wrote:
>>>>>
>>>>>> The code I am using here is PETSc example 42 (
>>>>>> https://www.mcs.anl.gov/petsc/petsc-3.9/src/ksp/ksp/examples/tutorials/ex42.c.html).
>>>>>> Indeed it solves the Stokes equation. I thought it was a good idea to use
>>>>>> an example you might know (and I didn't find any that uses GAMG functions). I
>>>>>> just changed the PCMG setup so that the memory problem appears. And it
>>>>>> appears when adding PCGAMG.
>>>>>>
>>>>>> I don't care about the performance or even the correctness of the result here,
>>>>>> but only about the difference in memory use between 3.6 and 3.10. Do you
>>>>>> think finding a more suitable script would help?
>>>>>>
>>>>>> I used the threshold of 0.1 only once, at the beginning, to test its
>>>>>> influence. I used the default threshold (of 0, I guess) for all the other
>>>>>> runs.
>>>>>>
>>>>>> Myriam
>>>>>>
>>>>>> On 03/11/19 at 13:52, Mark Adams wrote:
>>>>>>
>>>>>> In looking at this larger scale run ...
>>>>>>
>>>>>> * Your eigen estimates are much lower than in your tiny test problem.
>>>>>> But this is Stokes apparently and it should not work anyway. Maybe you have
>>>>>> a small time step that adds a lot of mass that brings the eigen estimates
>>>>>> down. And your min eigenvalue (not used) is positive. I would expect
>>>>>> negative for Stokes ...
>>>>>>
>>>>>> * You seem to be setting a threshold value of 0.1 -- that is very high.
>>>>>>
>>>>>> * v3.6 says "using nonzero initial guess" but this is not in v3.10.
>>>>>> Maybe we just stopped printing that.
>>>>>>
>>>>>> * There were some changes to coarsening parameters in going from v3.6
>>>>>> but it does not look like your problem was affected. (The coarsening algo
>>>>>> is non-deterministic by default and you can see small differences on
>>>>>> different runs)
>>>>>>
>>>>>> * We may have also added a "noisy" RHS for eigen estimates by default
>>>>>> from v3.6.
>>>>>>
>>>>>> * And for non-symmetric problems you can try -pc_gamg_agg_nsmooths 0,
>>>>>> but again GAMG is not built for Stokes anyway.
>>>>>>
>>>>>>
>>>>>> On Tue, Mar 5, 2019 at 11:53 AM Myriam Peyrounette <
>>>>>> myriam.peyrounette at idris.fr> wrote:
>>>>>>
>>>>>>> I used PCView to display the size of the linear system in each level
>>>>>>> of the MG. You'll find the outputs attached to this mail (zip file) for
>>>>>>> both the default threshold value and a value of 0.1, and for both 3.6 and
>>>>>>> 3.10 PETSc versions.
>>>>>>>
>>>>>>> For convenience, I summarized the information in a graph, also
>>>>>>> attached (png file).
>>>>>>>
>>>>>>> As you can see, there are slight differences between the two
>>>>>>> versions but none is critical, in my opinion. Do you see anything
>>>>>>> suspicious in the outputs?
>>>>>>>
>>>>>>> + I can't find the default threshold value. Do you know where I can
>>>>>>> find it?
>>>>>>>
>>>>>>> Thanks for the follow-up
>>>>>>>
>>>>>>> Myriam
>>>>>>>
>>>>>>> On 03/05/19 at 14:06, Matthew Knepley wrote:
>>>>>>>
>>>>>>> On Tue, Mar 5, 2019 at 7:14 AM Myriam Peyrounette <
>>>>>>> myriam.peyrounette at idris.fr> wrote:
>>>>>>>
>>>>>>>> Hi Matt,
>>>>>>>>
>>>>>>>> I plotted the memory scalings using different threshold values. The
>>>>>>>> two scalings are slightly shifted (by -22 to -88 MB) but this gain is
>>>>>>>> negligible. The 3.6 scaling remains robust while the 3.10 scaling
>>>>>>>> deteriorates.
>>>>>>>>
>>>>>>>> Do you have any other suggestion?
>>>>>>>>
>>>>>>> Mark, what is the option she can give to output all the GAMG data?
>>>>>>>
>>>>>>> Also, run using -ksp_view. GAMG will report all the sizes of its
>>>>>>> grids, so it should be easy to see
>>>>>>> if the coarse grid sizes are increasing, and also what the effect of
>>>>>>> the threshold value is.
>>>>>>>
>>>>>>>   Thanks,
>>>>>>>
>>>>>>>      Matt
>>>>>>>
>>>>>>>> Thanks
>>>>>>>> Myriam
>>>>>>>>
>>>>>>>> On 03/02/19 at 02:27, Matthew Knepley wrote:
>>>>>>>>
>>>>>>>> On Fri, Mar 1, 2019 at 10:53 AM Myriam Peyrounette via petsc-users <
>>>>>>>> petsc-users at mcs.anl.gov> wrote:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I used to run my code with PETSc 3.6. Since I upgraded PETSc
>>>>>>>>> to version 3.10, this code has shown bad memory scaling.
>>>>>>>>>
>>>>>>>>> To report this issue, I took the PETSc script ex42.c and slightly
>>>>>>>>> modified it so that the KSP and PC configurations are the same as
>>>>>>>>> in my code. In particular, I use a "personalised" multigrid method.
>>>>>>>>> The modifications are indicated by the keyword "TopBridge" in the
>>>>>>>>> attached scripts.
>>>>>>>>>
>>>>>>>>> To plot the memory (weak) scaling, I ran four calculations for each
>>>>>>>>> script with increasing problem sizes and numbers of cores:
>>>>>>>>>
>>>>>>>>> 1. 100,000 elts on 4 cores
>>>>>>>>> 2. 1 million elts on 40 cores
>>>>>>>>> 3. 10 million elts on 400 cores
>>>>>>>>> 4. 100 million elts on 4,000 cores
>>>>>>>>>
>>>>>>>>> The resulting graph is also attached. The scaling using PETSc 3.10
>>>>>>>>> clearly deteriorates for large cases, while the one using PETSc 3.6
>>>>>>>>> is robust.
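
As a quick check that this is a weak-scaling series, the four runs listed above can be reduced to elements per core. The sketch below is plain Python; memory_growth is a hypothetical helper for normalising measured per-process memory and is deliberately not fed any numbers here, since the measurements are only in the attached graph.

```python
# (elements, cores) for the four runs listed in the original report.
runs = [
    (100_000, 4),
    (1_000_000, 40),
    (10_000_000, 400),
    (100_000_000, 4_000),
]

for elements, cores in runs:
    per_core = elements // cores
    print(f"{elements:>11,d} elts on {cores:>5,d} cores -> {per_core:,d} elts per core")


def memory_growth(per_process_memory):
    """Ratio of each run's per-process memory to that of the smallest run.

    With ideal weak scaling every ratio stays close to 1.0, because the
    work per core is constant across the series.
    """
    base = per_process_memory[0]
    return [value / base for value in per_process_memory]
```

Every run carries the same load per core, so any growth in per-process memory across the series points at the solver setup rather than at the local problem size.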
>>>>>>>>>
>>>>>>>>> After a few tests, I found that the scaling is mostly sensitive to
>>>>>>>>> the use of the AMG method for the coarse grid (line 1780 in
>>>>>>>>> main_ex42_petsc36.cc). In particular, the performance strongly
>>>>>>>>> deteriorates when commenting out lines 1777 to 1790 (in
>>>>>>>>> main_ex42_petsc36.cc).
>>>>>>>>>
>>>>>>>>> Do you have any idea of what changed between version 3.6 and
>>>>>>>>> version 3.10 that could cause such degradation?
>>>>>>>>>
>>>>>>>>
>>>>>>>> I believe the default values for PCGAMG changed between versions.
>>>>>>>> It sounds like the coarsening rate
>>>>>>>> is not great enough, so that these grids are too large. This can be
>>>>>>>> set using:
>>>>>>>>
>>>>>>>>
>>>>>>>> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCGAMGSetThreshold.html
>>>>>>>>
>>>>>>>> There is some explanation of this effect on that page. Let us know
>>>>>>>> if setting this does not correct the situation.
>>>>>>>>
>>>>>>>>   Thanks,
>>>>>>>>
>>>>>>>>      Matt
>>>>>>>>
>>>>>>>>
>>>>>>>>> Let me know if you need further information.
>>>>>>>>>
>>>>>>>>> Best,
>>>>>>>>>
>>>>>>>>> Myriam Peyrounette
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Myriam Peyrounette
>>>>>>>>> CNRS/IDRIS - HLST
>>>>>>>>> --
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> What most experimenters take for granted before they begin their
>>>>>>>> experiments is infinitely more interesting than any results to which their
>>>>>>>> experiments lead.
>>>>>>>> -- Norbert Wiener
>>>>>>>>
>>>>>>>> https://www.cse.buffalo.edu/~knepley/
>>>>>>>> <http://www.cse.buffalo.edu/%7Eknepley/>
>>>>>>>>
>>>>>>>>

-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/ <http://www.cse.buffalo.edu/~knepley/>