[petsc-users] petsc4py - Spike in memory usage when loading a matrix in parallel
Michael Werner
michael.werner at dlr.de
Thu Oct 7 10:59:57 CDT 2021
It's twice the memory of the entire matrix (when stored on one process).
I also just sent you the valgrind results, both for a serial run and a
parallel run. The size on disk of the matrix I used is 20 GB.
In the serial run, valgrind shows a peak memory usage of 21 GB, while in
the parallel run (with 4 processes) each process shows a peak memory
usage of 10.8 GB.
Best regards,
Michael
On 07.10.21 17:55, Barry Smith wrote:
>
>
>> On Oct 7, 2021, at 11:35 AM, Michael Werner <michael.werner at dlr.de> wrote:
>>
>> Currently I'm using psutil to query every process for its memory
>> usage and sum it up. However, the spike was only visible in top (I
>> had a call to psutil right before and after A.load(viewer), and both
>> reported only 50 GB of RAM usage). That's why I thought it might be
>> directly tied to loading the matrix. However, I also had the problem
>> that the computation crashed due to running out of memory while
>> loading a matrix that should in theory fit into memory. In that case
>> I would expect the OS to free unused memory immediately, right?
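>>
>> (Roughly, the check looks like the sketch below; it's a minimal
>> illustration rather than my exact script, and the file name, the
>> helper, and the mpi4py reduction are just placeholders for how one
>> could sum the per-process numbers.)
>>
>> import os
>> import psutil
>> from mpi4py import MPI
>> from petsc4py import PETSc
>>
>> def total_rss_gb(comm=MPI.COMM_WORLD):
>>     # resident set size of this MPI process, in bytes
>>     rss = psutil.Process(os.getpid()).memory_info().rss
>>     # sum over all ranks; only rank 0 receives the total
>>     total = comm.reduce(rss, op=MPI.SUM, root=0)
>>     return None if total is None else total / 1024**3
>>
>> viewer = PETSc.Viewer().createBinary("matrix.dat", "r",
>>                                      comm=PETSc.COMM_WORLD)
>> A = PETSc.Mat().create(comm=PETSc.COMM_WORLD)
>> before = total_rss_gb()
>> A.load(viewer)
>> after = total_rss_gb()
>> if PETSc.COMM_WORLD.getRank() == 0:
>>     print(f"RSS before load: {before:.1f} GB, after: {after:.1f} GB")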
>>
>> Concerning Barry's questions: the matrix is a sparse matrix and is
>> originally created sequentially as SEQAIJ. However, it is then loaded
>> as MPIAIJ, and if I look at the memory usage of the various
>> processes, they fill up one after another, just as described. Is the
>> origin of the matrix somehow preserved in the binary file? I was
>> under the impression that the binary format was agnostic to the
>> number of processes?
>
> The file format is independent of the number of processes that
> created it.
>
>> I also varied the number of processes between 1 and 60; as soon as I
>> use more than one process I can observe the spike (and it's always
>> twice the memory, no matter how many processes I'm using).
>
> Twice the size of the entire matrix (when stored on one process) or
> twice the size of the resulting matrix stored on the first rank? The
> latter is exactly as expected, since rank 0 has to load the part of
> the matrix destined for the next rank and hence for a short time
> contains its own part of the matrix and the part of one other rank.
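>
> (As a concrete example: for a 20 GB matrix on 4 ranks each rank's
> share is roughly 5 GB, so rank 0 would briefly hold about 10 GB while
> staging the next rank's portion, even though the total across all
> ranks stays near 20 GB once loading is finished.)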
>
> Barry
>
>>
>> I also tried running Valgrind with the --tool=massif option. However,
>> I don't know what to look for. I can send you the output file
>> separately, if it helps.
>>
>> Best regards,
>> Michael
>>
>> On 07.10.21 16:09, Matthew Knepley wrote:
>>> On Thu, Oct 7, 2021 at 10:03 AM Barry Smith <bsmith at petsc.dev> wrote:
>>>
>>>
>>> How many ranks are you using? Is it a sparse matrix with MPIAIJ?
>>>
>>> The intention is that for parallel runs the first rank reads
>>> in its own part of the matrix, then reads in the part of the
>>> next rank and sends it, then reads the part of the third rank
>>> and sends it etc. So there should not be too much of a blip in
>>> memory usage. You can run valgrind with the option for tracking
>>> memory usage to see exactly where in the code the blip occurs;
>>> it could be that a regression in the code has made it require
>>> more memory. But internal MPI buffers might also explain some of the blip.
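>>>
>>> (As a rough sketch of how that could be run; load_matrix.py stands
>>> for the loading script, and the massif output file names depend on
>>> the process IDs, so treat this as an illustration:)
>>>
>>> # one massif heap profile per MPI rank
>>> mpiexec -n 4 valgrind --tool=massif python load_matrix.py
>>> # inspect the snapshots (including the peak) of one rank's profile
>>> ms_print massif.out.<pid>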
>>>
>>>
>>> Is it possible that we free the memory, but the OS has just not
>>> given back that memory for use yet? How are you measuring memory usage?
>>>
>>> Thanks,
>>>
>>> Matt
>>>
>>>
>>> Barry
>>>
>>>
>>> > On Oct 7, 2021, at 9:50 AM, Michael Werner <michael.werner at dlr.de> wrote:
>>> >
>>> > Hello,
>>> >
>>> > I noticed that there is a peak in memory consumption when I load an
>>> > existing matrix into PETSc. The matrix is previously created by an
>>> > external program and saved in the PETSc binary format.
>>> > The code I'm using in petsc4py is simple:
>>> >
>>> > viewer = PETSc.Viewer().createBinary(<path/to/existing/matrix>, "r",
>>> >                                      comm=PETSc.COMM_WORLD)
>>> > A = PETSc.Mat().create(comm=PETSc.COMM_WORLD)
>>> > A.load(viewer)
>>> >
>>> > When I run this code in serial, the memory consumption of the process is
>>> > about 50 GB RAM, similar to the file size of the saved matrix. However,
>>> > if I run the code in parallel, for a few seconds the memory consumption
>>> > of the process doubles to around 100 GB RAM, before dropping back down to
>>> > around 50 GB RAM. So it seems as if, for some reason, the matrix is
>>> > copied after it is read into memory. Is there a way to avoid this
>>> > behaviour? Currently, it is a clear bottleneck in my code.
>>> >
>>> > I tried setting the size of the matrix and explicitly preallocating the
>>> > necessary NNZ (with A.setSizes(dim) and A.setPreallocationNNZ(nnz),
>>> > respectively) before loading, but that didn't help.
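>>> >
>>> > (Roughly what I tried, as a sketch; dim and nnz stand for the known
>>> > global size and nonzero counts, and the explicit setType call is
>>> > just one way of spelling it out:)
>>> >
>>> > A = PETSc.Mat().create(comm=PETSc.COMM_WORLD)
>>> > A.setType(PETSc.Mat.Type.MPIAIJ)
>>> > A.setSizes(dim)
>>> > A.setPreallocationNNZ(nnz)
>>> > A.load(viewer)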
>>> >
>>> > As mentioned above, I'm using petsc4py together with PETSc-3.16 on a
>>> > Linux workstation.
>>> >
>>> > Best regards,
>>> > Michael Werner
>>> >
>>> > --
>>> >
>>> > ____________________________________________________
>>> >
>>> > Deutsches Zentrum für Luft- und Raumfahrt e.V. (DLR)
>>> > Institut für Aerodynamik und Strömungstechnik | Bunsenstr. 10 | 37073 Göttingen
>>> >
>>> > Michael Werner
>>> > Telefon 0551 709-2627 | Telefax 0551 709-2811 | Michael.Werner at dlr.de
>>> > DLR.de
>>> >
>>>
>>>
>>>
>>> --
>>> What most experimenters take for granted before they begin their
>>> experiments is infinitely more interesting than any results to which
>>> their experiments lead.
>>> -- Norbert Wiener
>>>
>>> https://www.cse.buffalo.edu/~knepley/
>>
>