[petsc-dev] [petsc-users] Bad memory scaling with PETSc 3.10

Fande Kong fdkong.jd at gmail.com
Thu Mar 21 12:08:36 CDT 2019

Hi Mark,

Thanks for your email.

On Thu, Mar 21, 2019 at 6:39 AM Mark Adams via petsc-dev <
petsc-dev at mcs.anl.gov> wrote:

> I'm probably screwing up some sort of history by jumping into dev, but
> this is a dev comment ...
>> (1) -matptap_via hypre: This calls the hypre package to do the PtAP through
>> an all-at-once triple product. In our experience, it is the most memory
>> efficient, but could be slow.
> FYI,
> I visited LLNL in about 1997 and told them how I did RAP. Simple 4 nested
> loops. They were very interested. Clearly they did it this way after I
> talked to them. This approach came up here a while back (e.g., we should
> offer this as an option).
> Anecdotally, I don't see a noticeable difference in performance on my 3D
> elasticity problems between my old code (still used by the bone modeling
> people) and ex56 ...

You may not see differences when the problem is small.  What I observed is
that the HYPRE PtAP is ten times slower than the PETSc scalable PtAP on a
problem with 3 billion unknowns running on 10,000 processor cores.

> My kernel is an unrolled dense matrix triple product. I doubt Hypre did
> this. It ran at about 2x+ the flop rate of the mat-vec at scale on the SP3
> in 2004.

Could you explain this in more detail, perhaps with a small example?

 I am profiling the current PETSc algorithms on some real simulations. If
the PETSc PtAP still takes more memory than desired with my fix (
https://bitbucket.org/petsc/petsc/pull-requests/1452), I am going to
implement the all-at-once triple product, dropping all intermediate
data.  If you have any documents (besides the code you posted before), they
would be a great help.


> Mark
