[petsc-users] Bad memory scaling with PETSc 3.10

Myriam Peyrounette myriam.peyrounette at idris.fr
Tue Apr 30 03:25:41 CDT 2019


Hi,

that's really good news for us, thanks! I will plot the memory scaling
again using these new options and let you know. Next week, I hope.

Before that, I just need to clarify the situation. Throughout our
discussions, we mentioned a number of options concerning scalability:

-matptap_via scalable
-inner_diag_matmatmult_via scalable
-inner_offdiag_matmatmult_via scalable
-mat_freeintermediatedatastructures
-matptap_via allatonce
-matptap_via allatonce_merged

Which of these are compatible? Should I use all of them at the same
time? Is there any redundancy?
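
For context, here are the options as I would pass them on a single run
line (a sketch only: the executable name and process count are
placeholders, and I am assuming the allatonce variants would replace the
"scalable" value of -matptap_via rather than be added on top of it;
please correct me if that is wrong):

    # placeholders: executable name and process count
    mpiexec -n 1024 ./my_solver \
        -matptap_via scalable \
        -inner_diag_matmatmult_via scalable \
        -inner_offdiag_matmatmult_via scalable \
        -mat_freeintermediatedatastructures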

Thanks,

Myriam


On 04/25/19 at 21:47, Zhang, Hong wrote:
> Myriam:
> Checking MatPtAP() in petsc-3.6.4, I realized that it uses a different
> algorithm than petsc-3.10 and later versions. petsc-3.6 uses an outer
> product for C = P^T * A * P, while petsc-3.10 uses a local transpose of
> P. petsc-3.10 accelerates data access, but doubles the memory of P.
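>
> For reference, MatPtAP() is called roughly like this (a minimal sketch;
> the wrapper function, fill estimate, and error handling are
> illustrative, not your actual code):
>
>     #include <petscmat.h>
>
>     /* Form the Galerkin product C = P^T * A * P.  The internal
>        algorithm is the one selected at run time with -matptap_via. */
>     PetscErrorCode FormCoarseOperator(Mat A, Mat P, Mat *C)
>     {
>       PetscErrorCode ierr;
>
>       /* First call: symbolic and numeric product; intermediate data
>          structures are kept for possible reuse.  The fill estimate
>          (2.0) is an illustrative placeholder. */
>       ierr = MatPtAP(A, P, MAT_INITIAL_MATRIX, 2.0, C);CHKERRQ(ierr);
>       /* If C is never recomputed with MAT_REUSE_MATRIX, the option
>          -mat_freeintermediatedatastructures tells PETSc to release
>          that intermediate data and saves memory. */
>       return 0;
>     }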
>
> Fande added two new implementations of MatPtAP() to petsc-master
> which use much less memory and scale better, at the cost of slightly
> higher computing time (still faster than hypre, though). You may use
> these new implementations if you are concerned about memory
> scalability. The options for these new implementations are:
> -matptap_via allatonce
> -matptap_via allatonce_merged
>
> Hong
>
> On Mon, Apr 15, 2019 at 12:10 PM hzhang at mcs.anl.gov wrote:
>
>     Myriam:
>     Thank you very much for providing these results!
>     I have put effort into accelerating execution time and avoiding the
>     use of global sizes in PtAP; there, the algorithm that transposes
>     P_local and P_other likely doubles the memory usage. I'll try to
>     investigate why it becomes unscalable.
>     Hong
>
>         Hi,
>
>         you'll find the new scaling attached (green line). I used
>         version 3.11 and the four scalability options:
>         -matptap_via scalable
>         -inner_diag_matmatmult_via scalable
>         -inner_offdiag_matmatmult_via scalable
>         -mat_freeintermediatedatastructures
>
>         The scaling is much better! The code even uses less memory for
>         the smallest cases. There is still an increase for the largest
>         case.
>
>         With regard to the time scaling, I used KSPView and LogView on
>         the two previous scalings (blue and yellow lines) but not on
>         the last one (green line). So we can't really compare them, am
>         I right? However, we can see that the new time scaling looks
>         quite good. It slightly increases from ~8s to ~27s.
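>
>         (For completeness: by KSPView and LogView I mean the standard
>         command-line options appended to the run line, i.e.
>
>             -ksp_view -log_view
>
>         and, if I remember the option name correctly, -memory_view can
>         also be added to get a memory summary at the end of the run.)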
>
>         Unfortunately, the computations are expensive, so I would like
>         to avoid re-running them if possible. How relevant would a
>         proper time scaling be for you?
>
>         Myriam
>
>
>         On 04/12/19 at 18:18, Zhang, Hong wrote:
>>         Myriam :
>>         Thanks for your effort. It will help us improve PETSc.
>>         Hong
>>
>>             Hi all,
>>
>>             I used the wrong script; that's why it diverged... Sorry
>>             about that.
>>             I tried again with the right script on a tiny problem
>>             (~200 elements). I can see a small difference in memory
>>             usage (gain ~ 1 MB) when adding the
>>             -mat_freeintermediatedatastructures option. I still have
>>             to execute larger cases to plot the scaling. The
>>             supercomputer I usually run my jobs on is really busy at
>>             the moment, so it takes a while. I hope to send you the
>>             results on Monday.
>>
>>             Thanks everyone,
>>
>>             Myriam
>>
>>
>>             On 04/11/19 at 06:01, Jed Brown wrote:
>>             > "Zhang, Hong" <hzhang at mcs.anl.gov> writes:
>>             >
>>             >> Jed:
>>             >>>> Myriam,
>>             >>>> Thanks for the plot.
>>             '-mat_freeintermediatedatastructures' should not affect
>>             the solution. It releases almost half of the memory in
>>             C=PtAP if C is not reused.
>>             >>> And yet if turning it on causes divergence, that
>>             would imply a bug.
>>             >>> Hong, are you able to reproduce the experiment to see
>>             the memory
>>             >>> scaling?
>>             >> I would like to test her code on an ALCF machine, but my
>>             hands are full now. I'll try it as soon as I find time,
>>             hopefully next week.
>>             > I have now compiled and run her code locally.
>>             >
>>             > Myriam, thanks for your last mail adding configuration
>>             and removing the
>>             > MemManager.h dependency.  I ran with and without
>>             > -mat_freeintermediatedatastructures and don't see a
>>             difference in
>>             > convergence.  What commands did you run to observe that
>>             difference?
>>
>>             -- 
>>             Myriam Peyrounette
>>             CNRS/IDRIS - HLST
>>             --
>>
>>
>
>         -- 
>         Myriam Peyrounette
>         CNRS/IDRIS - HLST
>         --
>

-- 
Myriam Peyrounette
CNRS/IDRIS - HLST
--
