[petsc-users] petsc4py - Spike in memory usage when loading a matrix in parallel

Thu Oct 7 08:50:12 CDT 2021

Hello,

I noticed a spike in memory consumption when I load an
existing matrix into PETSc. The matrix was created beforehand by an
external program and saved in the PETSc binary format.
The code I'm using in petsc4py is simple:

from petsc4py import PETSc

# Open the existing PETSc binary file read-only on all ranks
viewer = PETSc.Viewer().createBinary(<path/to/existing/matrix>, "r",
                                     comm=PETSc.COMM_WORLD)
A = PETSc.Mat().create(comm=PETSc.COMM_WORLD)
A.load(viewer)

When I run this code in serial, the memory consumption of the process is
about 50 GB of RAM, similar to the file size of the saved matrix. However,
if I run the code in parallel, the memory consumption doubles to around
100 GB of RAM for a few seconds, before dropping back down to around
50 GB. So it seems as if, for some reason, the matrix is copied after it
is read into memory. Is there a way to avoid this behaviour? Currently,
it is a clear bottleneck in my code.
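
One way to pin down the transient spike on each rank is to read the process's high-water mark for resident memory before and after the load. The following is a minimal sketch using only the Python standard library, assuming a Linux workstation (where `ru_maxrss` is reported in KiB). The helper names `peak_rss_gb` and `report_peak` are hypothetical and not part of petsc4py, and the `bytearray` allocation is only a stand-in workload.

```python
import resource
import sys


def peak_rss_gb():
    """Return the peak resident set size of this process so far, in GB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is in KiB on Linux and in bytes on macOS.
    scale = 1 if sys.platform == "darwin" else 1024
    return peak * scale / 1e9


def report_peak(label, func, *args, **kwargs):
    """Call func and print the peak RSS before and after the call."""
    before = peak_rss_gb()
    result = func(*args, **kwargs)
    after = peak_rss_gb()
    print(f"{label}: peak RSS {before:.2f} GB -> {after:.2f} GB")
    return result, before, after


# Stand-in workload (hypothetical): allocate roughly 50 MB.
result, before, after = report_peak("allocate", bytearray, 50_000_000)
```

In the actual script the stand-in would be replaced by something like `report_peak("A.load", A.load, viewer)`. Because `ru_maxrss` is a high-water mark, it should capture a short-lived peak even if the memory has already been released by the time the call returns.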

I tried setting the size of the matrix and explicitly preallocating the
necessary nonzeros (with A.setSizes(dim) and A.setPreallocationNNZ(nnz),
respectively) before loading, but that didn't help.
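
For reference, the row count and number of nonzeros also give a rough estimate of how much storage a single copy of the matrix needs, which helps judge whether a peak corresponds to about two copies. The sketch below assumes plain CSR storage with 4-byte indices and 8-byte real scalars. It ignores any additional bookkeeping arrays PETSc keeps, and the index width depends on how PETSc was configured. The function name `csr_bytes` and the example numbers are hypothetical, not the dimensions of the matrix discussed here.

```python
def csr_bytes(n_rows, nnz, index_bytes=4, scalar_bytes=8):
    """Rough storage estimate for one CSR copy of a sparse matrix."""
    values = nnz * scalar_bytes                # nonzero values
    col_indices = nnz * index_bytes            # one column index per nonzero
    row_pointers = (n_rows + 1) * index_bytes  # row offsets
    return values + col_indices + row_pointers


# Hypothetical example: 1e6 rows with 1e8 nonzeros in total.
estimate = csr_bytes(1_000_000, 100_000_000)
print(f"one copy: {estimate / 1e9:.2f} GB, two copies: {2 * estimate / 1e9:.2f} GB")
```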

As mentioned above, I'm using petsc4py together with PETSc-3.16 on a
Linux workstation.

Best regards,
Michael Werner

Deutsches Zentrum für Luft- und Raumfahrt e.V. (DLR)
Institut für Aerodynamik und Strömungstechnik, Göttingen