[petsc-users] Bad memory scaling with PETSc 3.10
Fande Kong
fdkong.jd at gmail.com
Fri May 3 09:45:20 CDT 2019
On Fri, May 3, 2019 at 8:11 AM Myriam Peyrounette <
myriam.peyrounette at idris.fr> wrote:
> Hi,
>
> I plotted new scalings (memory and time) using the new algorithms. I used
> the option *-options_left true* to make sure that the options are
> effectively used. They are.
>
> I don't have access to the platform I used to run my computations on, so I
> ran them on a different one. In particular, I can't reach problem size =
> 1e8 and the values might be different from the previous scalings I sent
> you. But the comparison of the PETSc versions and options is still
> relevant.
>
> I plotted the scalings of reference: the "good" one (PETSc 3.6.4) in
> green, the "bad" one (PETSc 3.10.2) in blue.
>
> I used the commit d330a26 (3.11.1) for all the other scalings, adding
> different sets of options:
>
> *Light blue* -> -matptap_via allatonce -mat_freeintermediatedatastructures 1
> *Orange* -> -matptap_via *allatonce_merged* -mat_freeintermediatedatastructures 1
>
As I said earlier, you should use only these two combinations.
> *Purple* -> -matptap_via allatonce -mat_freeintermediatedatastructures 1
> *-inner_diag_matmatmult_via scalable -inner_offdiag_matmatmult_via scalable*
>
Do not use this combination since it does not make sense. The new algorithm does
not need -inner_diag_matmatmult_via scalable or -inner_offdiag_matmatmult_via
scalable.
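If you want to double-check that they are ignored, -options_left (which you
already pass) should report them as unused at the end of the run. A rough
sketch, with a placeholder executable name and process count:

    mpiexec -n 4 ./your_app -matptap_via allatonce \
        -mat_freeintermediatedatastructures 1 \
        -inner_diag_matmatmult_via scalable -options_left

If nothing else queries -inner_diag_matmatmult_via, PETSc prints a warning at
the end of the run listing it as an option that was set but never used.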
> *Yellow* -> -matptap_via *allatonce_merged* -mat_freeintermediatedatastructures 1
> *-inner_diag_matmatmult_via scalable -inner_offdiag_matmatmult_via scalable*
>
Do not use these.
> Conclusion: with regard to memory, the two algorithms yield a similarly
> good improvement in scaling. The use of the
> -inner_(off)diag_matmatmult_via options also looks very interesting.
>
The use of the -inner_(off)diag_matmatmult_via options should not change
anything, since I do not need these options at all in "allatonce" and
"allatonce_merged".
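To be concrete, the only two invocations I suggest comparing are the
following (executable name and process count are placeholders):

    # Algorithm 1: all-at-once
    mpiexec -n 1024 ./your_app -matptap_via allatonce \
        -mat_freeintermediatedatastructures 1 -snes_view -options_left

    # Algorithm 2: all-at-once, merged
    mpiexec -n 1024 ./your_app -matptap_via allatonce_merged \
        -mat_freeintermediatedatastructures 1 -snes_view -options_left

-snes_view and -options_left are there only to confirm that the options are
actually picked up.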
Thanks,
Fande,
> The scaling is still not as good as 3.6.4 though.
> With regard to time, I noted a real improvement in execution time! I used
> to spend 200-300s on these executions.
>
> Now they take 10-15s.
>
That is interesting. I observed similar behavior for mat/ex96.c when the
problem size is small. Their performance will be very close when the
problem is large.
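If you want to reproduce that comparison outside your application, ex96 is a
small standalone MatPtAP test. A sketch, assuming a petsc-master checkout
where the test lives under src/mat/examples/tests (the exact path and the
default problem size may differ in your version):

    cd $PETSC_DIR/src/mat/examples/tests
    make ex96
    mpiexec -n 4 ./ex96 -matptap_via allatonce \
        -mat_freeintermediatedatastructures 1 -log_view

-log_view gives a time and memory summary that you can compare against a run
with -matptap_via allatonce_merged or with the default algorithm.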
Thanks,
Fande,
> Besides that, the "_merged" versions are more efficient. And the
> -inner_(off)diag_matmatmult_via options are slightly more expensive, but it is
> not critical.
>
> What do you think? Is it possible to match the scaling of PETSc
> 3.6.4 again? Is it worth investigating further?
>
> Myriam
>
>
> On 04/30/19 at 17:00, Fande Kong wrote:
>
> Hi Myriam,
>
> We are interested in how the new algorithms perform. There are two new
> algorithms you could try.
>
> Algorithm 1:
>
> -matptap_via allatonce -mat_freeintermediatedatastructures 1
>
> Algorithm 2:
>
> -matptap_via allatonce_merged -mat_freeintermediatedatastructures 1
>
>
> Note that you need to use the current petsc-master, and also please put
> "-snes_view" in your script so that we can confirm these options actually
> get set.
>
> Thanks,
>
> Fande,
>
>
> On Tue, Apr 30, 2019 at 2:26 AM Myriam Peyrounette via petsc-users <
> petsc-users at mcs.anl.gov> wrote:
>
>> Hi,
>>
>> that's really good news for us, thanks! I will plot the memory
>> scaling again using these new options and let you know, hopefully next week.
>>
>> Before that, I just need to clarify the situation. Throughout our
>> discussions, we mentioned a number of options concerning scalability:
>>
>> -matptap_via scalable
>> -inner_diag_matmatmult_via scalable
>> -inner_offdiag_matmatmult_via scalable
>> -mat_freeintermediatedatastructures
>> -matptap_via allatonce
>> -matptap_via allatonce_merged
>>
>> Which of them are compatible? Should I use all of them at the same
>> time? Is there redundancy?
>>
>> Thanks,
>>
>> Myriam
>>
>> On 04/25/19 at 21:47, Zhang, Hong wrote:
>>
>> Myriam:
>> Checking MatPtAP() in petsc-3.6.4, I realized that it uses a different
>> algorithm than petsc-3.10 and later versions. petsc-3.6 uses an outer product
>> for C = P^T * A * P, while petsc-3.10 uses a local transpose of P. petsc-3.10
>> accelerates data access, but doubles the memory of P.
>>
>> Fande added two new implementations of MatPtAP() to petsc-master which
>> use much less memory and scale better, with slightly higher computing time
>> (still faster than hypre). You may use these new implementations if you
>> are concerned about memory scalability. The options for these new
>> implementations are:
>> -matptap_via allatonce
>> -matptap_via allatonce_merged
>>
>> Hong
>>
>> On Mon, Apr 15, 2019 at 12:10 PM hzhang at mcs.anl.gov <hzhang at mcs.anl.gov>
>> wrote:
>>
>>> Myriam:
>>> Thank you very much for providing these results!
>>> I have put effort into accelerating execution time and avoiding the use of
>>> global sizes in PtAP, for which the algorithm that transposes P_local and
>>> P_other likely doubles the memory usage. I'll try to investigate why it
>>> becomes unscalable.
>>> Hong
>>>
>>>> Hi,
>>>>
>>>> you'll find the new scaling attached (green line). I used version
>>>> 3.11 and the four scalability options:
>>>> -matptap_via scalable
>>>> -inner_diag_matmatmult_via scalable
>>>> -inner_offdiag_matmatmult_via scalable
>>>> -mat_freeintermediatedatastructures
>>>>
>>>> The scaling is much better! The code even uses less memory for the
>>>> smallest cases. There is still an increase for the larger one.
>>>>
>>>> With regard to the time scaling, I used KSPView and LogView on the two
>>>> previous scalings (blue and yellow lines) but not on the last one (green
>>>> line). So we can't really compare them, am I right? However, we can see
>>>> that the new time scaling looks quite good. It slightly increases from ~8s
>>>> to ~27s.
>>>>
>>>> Unfortunately, the computations are expensive, so I would like to avoid
>>>> re-running them if possible. How relevant would a proper time scaling be for
>>>> you?
>>>>
>>>> Myriam
>>>>
>>>> On 04/12/19 at 18:18, Zhang, Hong wrote:
>>>>
>>>> Myriam :
>>>> Thanks for your effort. It will help us improve PETSc.
>>>> Hong
>>>>
>>>> Hi all,
>>>>>
>>>>> I used the wrong script, that's why it diverged... Sorry about that.
>>>>> I tried again with the right script applied to a tiny problem (~200
>>>>> elements). I can see a small difference in memory usage (a gain of ~1 MB)
>>>>> when adding the -mat_freeintermediatedatastructures option. I still have to
>>>>> run larger cases to plot the scaling. The supercomputer I usually
>>>>> run my jobs on is really busy at the moment, so it takes a while. I hope
>>>>> I'll send you the results on Monday.
>>>>>
>>>>> Thanks everyone,
>>>>>
>>>>> Myriam
>>>>>
>>>>>
>>>>> On 04/11/19 at 06:01, Jed Brown wrote:
>>>>> > "Zhang, Hong" <hzhang at mcs.anl.gov> writes:
>>>>> >
>>>>> >> Jed:
>>>>> >>>> Myriam,
>>>>> >>>> Thanks for the plot. '-mat_freeintermediatedatastructures' should
>>>>> >>>> not affect the solution. It releases almost half of the memory in
>>>>> >>>> C=PtAP if C is not reused.
>>>>> >>> And yet if turning it on causes divergence, that would imply a bug.
>>>>> >>> Hong, are you able to reproduce the experiment to see the memory
>>>>> >>> scaling?
>>>>> >> I would like to test her code on an ALCF machine, but my hands are
>>>>> >> full now. I'll try it as soon as I find time, hopefully next week.
>>>>> > I have now compiled and run her code locally.
>>>>> >
>>>>> > Myriam, thanks for your last mail adding configuration and removing the
>>>>> > MemManager.h dependency. I ran with and without
>>>>> > -mat_freeintermediatedatastructures and don't see a difference in
>>>>> > convergence. What commands did you run to observe that difference?
>>>>>
>>>>> --
>>>>> Myriam Peyrounette
>>>>> CNRS/IDRIS - HLST
>>>>> --
>>>>>
>>>>>
>>>>>
>>>> --
>>>> Myriam Peyrounette
>>>> CNRS/IDRIS - HLST
>>>> --
>>>>
>>>>
>> --
>> Myriam Peyrounette
>> CNRS/IDRIS - HLST
>> --
>>
>>
> --
> Myriam Peyrounette
> CNRS/IDRIS - HLST
> --
>
>