[petsc-users] MatAXPY: appearance of memory leak with DIFFERENT_NONZERO_PATTERN

Antoine De Blois antoine.deblois at aero.bombardier.com
Tue Apr 26 14:25:44 CDT 2016


Hi everyone,

I am using petsc-3.6.2 and I get an Out-of-memory error, from what I believe to be a leak inside PETSc. I am doing successive calls to MatAXPY during an optimization loop. I use MatAXPY to blend two matrices together; a linear combination of my operator and its first-order PC:
B = w*A + (1-w)*B

The matrix operator and the PC do not have the same nonzero pattern, hence the DIFFERENT_NONZERO_PATTERN option. In fact, the nonzeros of the PC are a subset of the nonzeros of A, but I found out about SUBSET_NONZERO_PATTERN too late...
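
For concreteness, here is a minimal sketch of the blend as I intend it (A, B and w are placeholder names here; error checking omitted):

// B = w*A + (1-w)*B, where A is the operator and B the first-order PC
ierr = MatScale(B, 1.0 - w);                          // B <- (1-w)*B
ierr = MatAXPY(B, w, A, DIFFERENT_NONZERO_PATTERN);   // B <- B + w*A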

My matrices are pretty big (about 90 million unknowns, with 33 nonzeros per row), so after the second call to MatAXPY I already blow up the memory (on 256 cores). My code snippet looks like:

PETSC_Mat_create(&IMP_Mat_dR2dW,    5, 0, 1, 33, 22);
PETSC_Mat_create(&IMP_Mat_dR1dW_PC, 5, 0, 1, 19, 5);  // notice the different nonzero patterns

while (1)
{
  MatZeroEntries(IMP_Mat_dR2dW);
  MatZeroEntries(IMP_Mat_dR1dW_PC);

  // fill the dR2dW matrix ...
  ierr = MatSetValuesBlocked(IMP_Mat_dR2dW, nnz_W, dRdW_AD_blocked_cols, 1, &dRdW_AD_blocked_row, dRdW_AD_values, ADD_VALUES); // the dRdW_AD_values already contain the w factor
  // ...
  MatAssemblyBegin(IMP_Mat_dR2dW, MAT_FINAL_ASSEMBLY);
  MatAssemblyEnd(IMP_Mat_dR2dW, MAT_FINAL_ASSEMBLY);

  // fill the dR1dW_PC matrix ...
  ierr = MatSetValuesBlocked(IMP_Mat_dR1dW_PC, nnz_W, dRdW_AD_blocked_cols, 1, &dRdW_AD_blocked_row, dRdW_AD_values, ADD_VALUES);
  // ...
  MatAssemblyBegin(IMP_Mat_dR1dW_PC, MAT_FINAL_ASSEMBLY);
  MatAssemblyEnd(IMP_Mat_dR1dW_PC, MAT_FINAL_ASSEMBLY);

  // blend the matrices -- the Out-of-memory error appears here
  MatAXPY(IMP_Mat_dR1dW_PC, 1 - w, IMP_Mat_dR2dW, DIFFERENT_NONZERO_PATTERN);

  // KSPSetOperators
  // KSPSolve
}

I looked at .../petsc-3.6.2/src/mat/impls/baij/mpi/mpibaij.c at line 2120. It appears that the temporary matrix B is allocated but never freed. Note that I have not run Valgrind to confirm this leak.
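
To confirm that the resident memory really grows at each MatAXPY call, I plan to bracket the call with something like this (a rough, untested sketch):

PetscLogDouble rss_before, rss_after;
ierr = PetscMemoryGetCurrentUsage(&rss_before);CHKERRQ(ierr);
ierr = MatAXPY(IMP_Mat_dR1dW_PC, 1.0 - w, IMP_Mat_dR2dW, DIFFERENT_NONZERO_PATTERN);CHKERRQ(ierr);
ierr = PetscMemoryGetCurrentUsage(&rss_after);CHKERRQ(ierr);
ierr = PetscPrintf(PETSC_COMM_WORLD, "MatAXPY grew RSS on rank 0 by %g bytes\n", (double)(rss_after - rss_before));CHKERRQ(ierr);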
For now, I will make the nonzero pattern of the PC the same as the operator's and use SAME_NONZERO_PATTERN (rough sketch below).
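
Something along these lines (untested; it assumes IMP_Mat_dR2dW has already been assembled once, so its nonzero pattern is final):

// give the PC the operator's nonzero pattern up front so that the
// in-place MatAXPY path applies
ierr = MatDuplicate(IMP_Mat_dR2dW, MAT_DO_NOT_COPY_VALUES, &IMP_Mat_dR1dW_PC);CHKERRQ(ierr);
// ... fill and assemble both matrices as before ...
ierr = MatAXPY(IMP_Mat_dR1dW_PC, 1.0 - w, IMP_Mat_dR2dW, SAME_NONZERO_PATTERN);CHKERRQ(ierr);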

Thank you for your time and efforts,
Antoine

PETSc error below:
[117]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
[117]PETSC ERROR: Out of memory. This could be due to allocating
[117]PETSC ERROR: too large an object or bleeding by not properly
[117]PETSC ERROR: destroying unneeded objects.
[117]PETSC ERROR: Memory allocated 0 Memory used by process 8423399424
[117]PETSC ERROR: Try running with -malloc_dump or -malloc_log for info.
[117]PETSC ERROR: Memory requested 704476168
[117]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
[117]PETSC ERROR: Petsc Release Version 3.6.2, Oct, 02, 2015
[117]PETSC ERROR: /home/ad007804/CODES/FANSC_REPO_after_merge/fansc13.2.1_argus on a ARGUS_impi_opt named node1064 by ad007804 Tue Apr 26 09:34:52 2016
[117]PETSC ERROR: Configure options --CFLAGS=-axAVX --CXXFLAGS="-std=c++11  -DMPICH_IGNORE_CXX_SEEK   -DMPICH_IGNORE_CXX_SEEK" --FC=mpiifort --FFLAGS=-axAVX --download-hypre=/home/b0528796/libs/petsc_extpacks/hypre-2.10.1.tar.gz --FC=mpiifort --FFLAGS=-axAVX --download-metis=/home/b0528796/libs/petsc_extpacks/metis-5.1.0-p1.tar.gz --download-ml=/home/b0528796/libs/petsc_extpacks/ml-6.2-win.tar.gz --download-parmetis=/home/b0528796/libs/petsc_extpacks/parmetis-4.0.3-p1.tar.gz --download-suitesparse=/home/b0528796/libs/petsc_extpacks/SuiteSparse-4.4.3.tar.gz --download-superlu_dist --with-blas-lapack-lib=-mkl --with-cc=mpiicc --with-cxx=mpiicpc --with-debugging=no --download-mumps=/home/b0528796/libs/petsc_extpacks/MUMPS_5.0.0-p1.tar.gz --with-scalapack-lib="-L/gpfs/fs1/intel/mkl/lib/intel64 -lmkl_scalapack_lp64 -lmkl_blacs_intelmpi_lp64" --with-scalapack-include=/gpfs/fs1/intel/mkl/include
[117]PETSC ERROR: #1 MatAXPY_BasicWithPreallocation() line 117 in /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.6.2/src/mat/utils/axpy.c
[117]PETSC ERROR: #2 MatAXPY_BasicWithPreallocation() line 117 in /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.6.2/src/mat/utils/axpy.c
[117]PETSC ERROR: #3 MatAXPY_MPIBAIJ() line 2120 in /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.6.2/src/mat/impls/baij/mpi/mpibaij.c
[117]PETSC ERROR: #4 MatAXPY() line 39 in /gpfs/fs2/aero/SOFTWARE/FLOW_SOLVERS/FANSC/EXT_LIB/petsc-3.6.2/src/mat/utils/axpy.c

Antoine DeBlois
Spécialiste ingénierie, MDO lead / Engineering Specialist, MDO lead
Aéronautique / Aerospace
514-855-5001, x 50862
antoine.deblois at aero.bombardier.com<mailto:antoine.deblois at aero.bombardier.com>

2351 Blvd Alfred-Nobel
Montreal, Qc
H4S 1A9



