[petsc-users] petsc4py - Spike in memory usage when loading a matrix in parallel
Michael Werner
michael.werner at dlr.de
Thu Oct 7 10:35:44 CDT 2021
Currently I'm using psutil to query every process for its memory usage
and sum the results. However, the spike was only visible in top: I had a
call to psutil right before and after A.load(viewer), and both reported
only 50 GB of RAM usage. That's why I thought it might be directly tied
to loading the matrix. I also had the problem that a computation crashed
due to running out of memory while loading a matrix that should in
theory fit into memory. In that case I would expect the OS to free
unused memory immediately, right?
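For reference, the measurement is essentially the following sketch (a
minimal version of what I do, not the exact code; the path is a
placeholder and summed_rss is just a helper name I use here):

import psutil
from mpi4py import MPI
from petsc4py import PETSc

def summed_rss():
    # resident set size (RSS) of this rank, summed over all ranks, in bytes
    return MPI.COMM_WORLD.allreduce(psutil.Process().memory_info().rss,
                                    op=MPI.SUM)

viewer = PETSc.Viewer().createBinary("<path/to/existing/matrix>", "r",
                                     comm=PETSc.COMM_WORLD)
A = PETSc.Mat().create(comm=PETSc.COMM_WORLD)

before = summed_rss()
A.load(viewer)
after = summed_rss()
if MPI.COMM_WORLD.rank == 0:
    print(f"summed RSS before/after A.load: "
          f"{before / 2**30:.1f} GiB / {after / 2**30:.1f} GiB")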
Concerning Barry's questions: the matrix is a sparse matrix, originally
created sequentially as SEQAIJ. However, it is then loaded as MPIAIJ,
and if I look at the memory usage of the various processes, they fill up
one after another, just as described. Is the origin of the matrix
somehow preserved in the binary file? I was under the impression that
the binary format is agnostic to the number of processes. I also varied
the number of processes between 1 and 60; as soon as I use more than one
process I can observe the spike (and it's always twice the memory, no
matter how many processes I'm using).
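For completeness, the variant with explicit type, sizes and
preallocation that I tried looks roughly like this (dim and nnz stand
for the values I take from the external program, the path is a
placeholder; it behaves the same as the plain create/load shown in my
original mail below):

from petsc4py import PETSc

viewer = PETSc.Viewer().createBinary("<path/to/existing/matrix>", "r",
                                     comm=PETSc.COMM_WORLD)
A = PETSc.Mat().create(comm=PETSc.COMM_WORLD)
A.setType(PETSc.Mat.Type.MPIAIJ)  # with more than one rank the loaded matrix is MPIAIJ anyway
A.setSizes(dim)                   # dim: global sizes (placeholder)
A.setPreallocationNNZ(nnz)        # nnz: nonzeros per row (placeholder)
A.load(viewer)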
I also tried running Valgrind with the --tool=massif option. However, I
don't know what to look for. I can send you the output file separately,
if it helps.
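In case it's useful, this is roughly how I invoked it (the script name
is a placeholder; when run under mpirun, each rank writes its own
massif.out.<pid> file):

mpirun -np 2 valgrind --tool=massif python3 load_matrix.py
ms_print massif.out.<pid>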
Best regards,
Michael
On 07.10.21 16:09, Matthew Knepley wrote:
> On Thu, Oct 7, 2021 at 10:03 AM Barry Smith <bsmith at petsc.dev> wrote:
>
>
> How many ranks are you using? Is it a sparse matrix with MPIAIJ?
>
> The intention is that for parallel runs the first rank reads in
> its own part of the matrix, then reads in the part of the next
> rank and sends it, then reads the part of the third rank and sends
> it etc. So there should not be too much of a blip in memory usage.
> You can run valgrind with the option for tracking memory usage to
> see exactly where in the code the blip occurs; it could be that a
> regression in the code has made it require more memory. But
> internal MPI buffers might explain some of the blip.
>
>
> Is it possible that we free the memory, but the OS has just not given
> back that memory for use yet? How are you measuring memory usage?
>
> Thanks,
>
> Matt
>
>
> Barry
>
>
> > On Oct 7, 2021, at 9:50 AM, Michael Werner <michael.werner at dlr.de> wrote:
> >
> > Hello,
> >
> > I noticed that there is a peak in memory consumption when I load an
> > existing matrix into PETSc. The matrix was previously created by an
> > external program and saved in the PETSc binary format.
> > The code I'm using in petsc4py is simple:
> >
> > viewer = PETSc.Viewer().createBinary(<path/to/existing/matrix>, "r",
> > comm=PETSc.COMM_WORLD)
> > A = PETSc.Mat().create(comm=PETSc.COMM_WORLD)
> > A.load(viewer)
> >
> > When I run this code in serial, the memory consumption of the process
> > is about 50 GB RAM, similar to the file size of the saved matrix.
> > However, if I run the code in parallel, for a few seconds the memory
> > consumption of the process doubles to around 100 GB RAM, before
> > dropping back down to around 50 GB RAM. So it seems as if, for some
> > reason, the matrix is copied after it is read into memory. Is there a
> > way to avoid this behaviour? Currently, it is a clear bottleneck in my
> > code.
> >
> > I tried setting the size of the matrix and explicitly preallocating
> > the necessary NNZ (with A.setSizes(dim) and A.setPreallocationNNZ(nnz),
> > respectively) before loading, but that didn't help.
> >
> > As mentioned above, I'm using petsc4py together with PETSc-3.16 on a
> > Linux workstation.
> >
> > Best regards,
> > Michael Werner
> >
> > --
> >
> > ____________________________________________________
> >
> > Deutsches Zentrum für Luft- und Raumfahrt e.V. (DLR)
> > Institut für Aerodynamik und Strömungstechnik | Bunsenstr. 10 | 37073 Göttingen
> >
> > Michael Werner
> > Telefon 0551 709-2627 | Telefax 0551 709-2811 | Michael.Werner at dlr.de
> > DLR.de
> >
>
>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which
> their experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/