[petsc-users] Bad memory scaling with PETSc 3.10
Myriam Peyrounette
myriam.peyrounette at idris.fr
Fri May 3 09:14:23 CDT 2019
And the attached files... Sorry
On 05/03/19 at 16:11, Myriam Peyrounette wrote:
>
> Hi,
>
> I plotted new scalings (memory and time) using the new algorithms. I
> used the option -options_left true to make sure that the options are
> actually used. They are.
>
> I don't have access to the platform I used to run my computations on,
> so I ran them on a different one. In particular, I can't reach problem
> size = 1e8, and the values may differ from the previous scalings I
> sent you. But the comparison of the PETSc versions and options is
> still relevant.
>
> I plotted the reference scalings: the "good" one (PETSc 3.6.4) in
> green, the "bad" one (PETSc 3.10.2) in blue.
>
> I used the commit d330a26 (3.11.1) for all the other scalings, adding
> different sets of options:
>
> Light blue: -matptap_via allatonce
>             -mat_freeintermediatedatastructures 1
> Orange:     -matptap_via allatonce_merged
>             -mat_freeintermediatedatastructures 1
> Purple:     -matptap_via allatonce
>             -mat_freeintermediatedatastructures 1
>             -inner_diag_matmatmult_via scalable
>             -inner_offdiag_matmatmult_via scalable
> Yellow:     -matptap_via allatonce_merged
>             -mat_freeintermediatedatastructures 1
>             -inner_diag_matmatmult_via scalable
>             -inner_offdiag_matmatmult_via scalable
>
> Conclusion: with regard to memory, both algorithms yield a similarly
> good improvement in scaling. The use of the
> -inner_(off)diag_matmatmult_via options is also very interesting. The
> scaling is still not as good as with 3.6.4, though.
> With regard to time, I noted a real improvement in execution time!
> These runs used to take 200-300s; now they take 10-15s. Besides that,
> the "_merged" versions are more efficient. And the
> -inner_(off)diag_matmatmult_via options add a slight overhead, but it
> is not critical.
>
> What do you think? Is it possible to match the scaling of PETSc 3.6.4
> again? Is it worth investigating further?
>
> Myriam
>
>
> On 04/30/19 at 17:00, Fande Kong wrote:
>> Hi Myriam,
>>
>> We are interested in how the new algorithms perform. There are two
>> new algorithms you could try.
>>
>> Algorithm 1:
>>
>> -matptap_via allatonce -mat_freeintermediatedatastructures 1
>>
>> Algorithm 2:
>>
>> -matptap_via allatonce_merged -mat_freeintermediatedatastructures 1
>>
>>
>> Note that you need to use the current petsc-master, and please also
>> put "-snes_view" in your script so that we can confirm these options
>> actually get set.
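>>
>> For reference, here is a minimal standalone sketch of where these
>> options take effect: as I understand it, -matptap_via is read inside
>> MatPtAP() itself, so any code path that forms a PtAP product will
>> pick it up. This is my own illustration, not your setup; the file
>> names "A.dat" and "P.dat" are placeholders, and error checking is
>> omitted for brevity.
>>
>>     #include <petscmat.h>
>>
>>     int main(int argc, char **argv)
>>     {
>>       Mat         A, P, C;
>>       PetscViewer viewer;
>>
>>       PetscInitialize(&argc, &argv, NULL, NULL);
>>
>>       /* Load A and P from binary files (placeholder names) */
>>       MatCreate(PETSC_COMM_WORLD, &A);
>>       PetscViewerBinaryOpen(PETSC_COMM_WORLD, "A.dat",
>>                             FILE_MODE_READ, &viewer);
>>       MatLoad(A, viewer);
>>       PetscViewerDestroy(&viewer);
>>
>>       MatCreate(PETSC_COMM_WORLD, &P);
>>       PetscViewerBinaryOpen(PETSC_COMM_WORLD, "P.dat",
>>                             FILE_MODE_READ, &viewer);
>>       MatLoad(P, viewer);
>>       PetscViewerDestroy(&viewer);
>>
>>       /* -matptap_via allatonce / allatonce_merged selects the
>>          algorithm used here to form C = P^T * A * P */
>>       MatPtAP(A, P, MAT_INITIAL_MATRIX, PETSC_DEFAULT, &C);
>>
>>       MatDestroy(&A); MatDestroy(&P); MatDestroy(&C);
>>       PetscFinalize();
>>       return 0;
>>     }
>>
>> Running such a program with and without the option sets, together
>> with -log_view, should expose the memory difference directly.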
>>
>> Thanks,
>>
>> Fande,
>>
>>
>> On Tue, Apr 30, 2019 at 2:26 AM Myriam Peyrounette via petsc-users
>> <petsc-users at mcs.anl.gov> wrote:
>>
>> Hi,
>>
>> that's really good news for us, thanks! I will plot the memory
>> scaling again using these new options and let you know, hopefully
>> next week.
>>
>> Before that, I just need to clarify the situation. Throughout our
>> discussions, we mentioned a number of options concerning
>> scalability:
>>
>> -matptap_via scalable
>> -inner_diag_matmatmult_via scalable
>> -inner_offdiag_matmatmult_via scalable
>> -mat_freeintermediatedatastructures
>> -matptap_via allatonce
>> -matptap_via allatonce_merged
>>
>> Which of these options are compatible? Should I use all of them at
>> the same time? Is there any redundancy?
>>
>> Thanks,
>>
>> Myriam
>>
>>
>> On 04/25/19 at 21:47, Zhang, Hong wrote:
>>> Myriam:
>>> Checking MatPtAP() in petsc-3.6.4, I realized that it uses a
>>> different algorithm than petsc-3.10 and later versions. petsc-3.6
>>> uses an outer product for C = P^T * A * P, while petsc-3.10 uses a
>>> local transpose of P. petsc-3.10 accelerates data access, but
>>> doubles the memory for P.
>>>
>>> Fande added two new implementations of MatPtAP() to petsc-master
>>> which use much less memory and scale well, at slightly higher
>>> computing time (still faster than hypre, though). You may use these
>>> new implementations if you have concerns about memory scalability.
>>> The options for these new implementations are:
>>> -matptap_via allatonce
>>> -matptap_via allatonce_merged
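>>>
>>> If it is more convenient, the same options can also be set from
>>> code; a minimal sketch (assuming only that it runs before the PtAP
>>> product is formed):
>>>
>>>     /* Equivalent to passing the options on the command line */
>>>     PetscOptionsSetValue(NULL, "-matptap_via", "allatonce");
>>>     PetscOptionsSetValue(NULL,
>>>         "-mat_freeintermediatedatastructures", "1");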
>>>
>>> Hong
>>>
>>> On Mon, Apr 15, 2019 at 12:10 PM hzhang at mcs.anl.gov wrote:
>>>
>>> Myriam:
>>> Thank you very much for providing these results!
>>> I have put effort into accelerating execution time and avoiding
>>> the use of global sizes in PtAP, for which the algorithm of
>>> transposing P_local and P_other likely doubles the memory
>>> usage. I'll try to investigate why it becomes unscalable.
>>> Hong
>>>
>>> Hi,
>>>
>>> you'll find the new scaling attached (green line). I
>>> used version 3.11 and the four scalability options:
>>> -matptap_via scalable
>>> -inner_diag_matmatmult_via scalable
>>> -inner_offdiag_matmatmult_via scalable
>>> -mat_freeintermediatedatastructures
>>>
>>> The scaling is much better! The code even uses less
>>> memory for the smallest cases. There is still an
>>> increase for the larger ones.
>>>
>>> With regard to the time scaling, I used KSPView and
>>> LogView on the two previous scalings (blue and yellow
>>> lines) but not on the last one (green line), so we
>>> can't really compare them, am I right? However, we can
>>> see that the new time scaling looks quite good: it
>>> increases only from ~8s to ~27s.
>>>
>>> Unfortunately, the computations are expensive, so I
>>> would like to avoid re-running them if possible. How
>>> relevant would a proper time scaling be for you?
>>>
>>> Myriam
>>>
>>>
>>>> On 04/12/19 at 18:18, Zhang, Hong wrote:
>>>> Myriam:
>>>> Thanks for your effort. It will help us improve PETSc.
>>>> Hong
>>>>
>>>> Hi all,
>>>>
>>>> I used the wrong script; that's why it diverged...
>>>> Sorry about that.
>>>> I tried again with the right script applied to a
>>>> tiny problem (~200 elements). I can see a small
>>>> difference in memory usage (gain ~1 MB) when adding
>>>> the -mat_freeintermediatedatastructures option. I
>>>> still have to run larger cases to plot the scaling.
>>>> The supercomputer I usually run my jobs on is really
>>>> busy at the moment, so it takes a while. I hope to
>>>> send you the results on Monday.
>>>>
>>>> Thanks everyone,
>>>>
>>>> Myriam
>>>>
>>>>
>>>> On 04/11/19 at 06:01, Jed Brown wrote:
>>>> > "Zhang, Hong" <hzhang at mcs.anl.gov> writes:
>>>> >
>>>> >> Jed:
>>>> >>>> Myriam,
>>>> >>>> Thanks for the plot.
>>>> >>>> '-mat_freeintermediatedatastructures' should not
>>>> >>>> affect the solution. It releases almost half of
>>>> >>>> the memory in C=PtAP if C is not reused.
>>>> >>> And yet if turning it on causes divergence,
>>>> >>> that would imply a bug.
>>>> >>> Hong, are you able to reproduce the experiment
>>>> >>> to see the memory scaling?
>>>> >> I would like to test her code on an ALCF machine,
>>>> >> but my hands are full now. I'll try it as soon as
>>>> >> I find time, hopefully next week.
>>>> > I have now compiled and run her code locally.
>>>> >
>>>> > Myriam, thanks for your last mail adding
>>>> > configuration and removing the MemManager.h
>>>> > dependency. I ran with and without
>>>> > -mat_freeintermediatedatastructures and don't see a
>>>> > difference in convergence. What commands did you run
>>>> > to observe that difference?
>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>
--
Myriam Peyrounette
CNRS/IDRIS - HLST
--
-------------- next part --------------
Attachment: ex42_mem_scaling_ada.png (image/png, 48984 bytes)
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20190503/1514baf4/attachment-0002.png>
-------------- next part --------------
Attachment: ex42_time_scaling_ada.png (image/png, 36796 bytes)
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20190503/1514baf4/attachment-0003.png>