[petsc-users] KSPSolve + MUMPS memory growth issues
Matthew Knepley
knepley at gmail.com
Thu Sep 5 12:55:47 CDT 2024
On Thu, Sep 5, 2024 at 1:40 PM Corbijn van Willenswaard, Lars (UT) via
petsc-users <petsc-users at mcs.anl.gov> wrote:
> Dear PETSc,
>
> For the last months I’ve struggled with a solver that I wrote for a FEM
> eigenvalue problem running out of memory. I’ve traced it to KSPSolve +
> MUMPS being the issue, but I'm getting stuck on digging deeper.
>
> The reason I suspect the KSPSolve/MUMPS is that when commenting out the
> KSPSolve the memory stays constant while running the rest of the algorithm.
> Of course, the algorithm also converges to a different result in this
> setup. When changing the KSP statement to
> for(int i = 0; i < 100000000; i++) KSPSolve(A_, vec1_, vec2_);
> the memory grows faster than when running the full algorithm. Logging shows
> that the program never reaches the terminating i=100M. Measuring the memory growth
> using ps (started debugging before I knew of PETSc's features) I see a
> growth in the RSS on a single compute node of up to 300MB/min for this
> artificial case. Real cases grow more like 60MB/min/node, which causes a
> kill due to memory exhaustion after about 2-3 days.
>
> Locally (Mac) I've been able to reproduce this both with 6 MPI processes
> and with a single one. Instrumenting the code to show differences in
> PetscMemoryGetCurrentUsage (full code below) shows that the memory
> increases at every step at the start, but also at some later iterations (a
> small excerpt from the output):
> rank step memory (increase since prev step)
> 0 6544 current 39469056( 8192)
> 0 7086 current 39477248( 8192)
> 0 7735 current 39497728( 20480)
> 0 9029 current 39501824( 4096)
> A similar output is visible in a run with 6 ranks, where there does not
> seem to be a pattern as to which of the ranks increases at which step.
> (Note I've checked PetscMallocGetCurrentUsage, but that is constant)
>
> Switching to PETSc's own solver on a single rank does not show
> a memory increase after the first solve. Changing the solve to overwrite
> the input vector, i.e. VecCopy(vec2_, vec1_); KSPSolve(A_, vec1_, vec1_);,
> results in a few increases after the first solve, but these do not seem
> to repeat.
>
> Does anyone have an idea on how to further dig into this problem?
>
I think the best way is to construct the simplest code that reproduces your
problem. For example, we could save your matrix in a binary file
-ksp_view_mat binary:mat.bin
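(this can simply be appended to the options you are already running with, e.g.
-ksp_type preonly -pc_type lu -pc_factor_mat_solver_type mumps -ksp_view_mat binary:mat.bin)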
and then use a very simple code:
#include <petsc.h>

int main(int argc, char **argv)
{
  PetscViewer viewer;
  Mat         A;
  Vec         b, x;
  KSP         ksp;

  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
  /* Load the matrix you dumped with -ksp_view_mat binary:mat.bin */
  PetscCall(PetscViewerBinaryOpen(PETSC_COMM_WORLD, "mat.bin", FILE_MODE_READ, &viewer));
  PetscCall(MatCreate(PETSC_COMM_WORLD, &A));
  PetscCall(MatLoad(A, viewer));
  PetscCall(PetscViewerDestroy(&viewer));
  PetscCall(MatCreateVecs(A, &x, &b));
  PetscCall(VecSet(b, 1.));
  PetscCall(KSPCreate(PETSC_COMM_WORLD, &ksp));
  PetscCall(KSPSetOperators(ksp, A, A));
  PetscCall(KSPSetFromOptions(ksp));
  /* Solve repeatedly with the same right-hand side and watch the memory */
  for (PetscInt i = 0; i < 100000; ++i) PetscCall(KSPSolve(ksp, b, x));
  PetscCall(KSPDestroy(&ksp));
  PetscCall(MatDestroy(&A));
  PetscCall(VecDestroy(&b));
  PetscCall(VecDestroy(&x));
  PetscCall(PetscFinalize());
  return 0;
}
and see if you get memory increase.
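You would then run it with the same solver options as your application, something
like (the executable name and process count are of course up to you):

  mpiexec -n 6 ./reproducer -ksp_type preonly -pc_type lu -pc_factor_mat_solver_type mumps -memory_view

and watch the RSS with ps or -memory_view as you did before.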
Thanks,
Matt
> Kind regards,
> Lars Corbijn
>
>
> Instrumentation:
>
> PetscLogDouble lastCurrent, current;
> int rank;
> MPI_Comm_rank(PETSC_COMM_WORLD, &rank);
> for(int i = 0; i < 100000000; ++i) {
>     PetscMemoryGetCurrentUsage(&lastCurrent);
>     KSPSolve(A_, vec1_, vec2_);
>     PetscMemoryGetCurrentUsage(&current);
>     if(current != lastCurrent) {
>         std::cout << std::setw(2) << rank << " " << std::setw(6) << i
>                   << " current " << std::setw(8) << (int) current << std::right
>                   << "(" << std::setw(6) << (int)(current - lastCurrent) << ")"
>                   << std::endl;
>     }
>     lastCurrent = current;
> }
>
>
> Matrix details
> The matrix A_ in question is created from a complex-valued matrix C_ (type
> MATAIJ) using the following code (modulo renames). Theoretically it should
> be a Laplacian with phase-shift periodic boundary conditions:
> MatHermitianTranspose(C_, MAT_INITIAL_MATRIX, &Y_);
> MatProductCreate(C_, Y_, NULL, &A_);
> MatProductSetType(A_, MATPRODUCT_AB);
> MatProductSetFromOptions(A_);
> MatProductSymbolic(A_);
> MatProductNumeric(A_);
>
> PETSc arguments: -log_view_memory -log_view :petsc.log -ksp_type preonly
> -pc_type lu -pc_factor_mat_solver_type mumps -bv_matmult vecs -memory_view
>
>
--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener
https://www.cse.buffalo.edu/~knepley/