[petsc-users] Out of memory during MatAssemblyBegin
Barry Smith
bsmith at mcs.anl.gov
Mon Jan 24 15:23:37 CST 2011
On Jan 24, 2011, at 3:08 PM, Raeth, Peter wrote:
> I am running out of memory while using MatAssemblyBegin on a dense matrix that spans several processors. My calculations show that the matrices I am using do not require more than 25% of available memory.
>
> What is different about this matrix, compared to the others, is that the program runs out of memory after the matrix has been populated by a single process rather than by multiple processes. I used MatSetValues. Since the values are held in a cache until MatAssemblyEnd is called (as I understand things), is it possible that using one process to populate the entire matrix is causing this problem?
Yes, absolutely, this is a terrible non-scalable way of filling a parallel matrix. You can fake it by calling MatAssemblyBegin/End() repeatedly with the flag MAT_FLUSH_ASSEMBLY to keep the stash from getting too big. But you really need a much better way of setting values into the matrix. How are these "brought in row by row" matrix entries generated?
Barry
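The flush pattern Barry describes can be sketched as below. This fragment is illustrative, not from the thread: it assumes an already-created MPIDENSE matrix A of global size M x N, a `rank` obtained from MPI_Comm_rank, arrays `cols`/`rowvals` holding one row, and an arbitrary chunk size of 64 rows. Note that MatAssemblyBegin/End are collective, so every rank must make the flush calls in the same order even though only rank 0 sets values.

```c
/* Sketch: rank 0 populates the whole matrix, but the stash is flushed
 * collectively every 64 rows so unassembled off-process values never
 * accumulate.  The chunk size is an arbitrary placeholder. */
for (i = 0; i < M; i++) {
  if (!rank) {                      /* only rank 0 reads and sets row i */
    /* ... read row i into rowvals ... */
    ierr = MatSetValues(A, 1, &i, N, cols, rowvals, INSERT_VALUES);CHKERRQ(ierr);
  }
  if ((i + 1) % 64 == 0) {          /* collective flush keeps the stash small */
    ierr = MatAssemblyBegin(A, MAT_FLUSH_ASSEMBLY);CHKERRQ(ierr);
    ierr = MatAssemblyEnd(A, MAT_FLUSH_ASSEMBLY);CHKERRQ(ierr);
  }
}
/* Final assembly communicates any remaining stashed values. */
ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
```

The scalable alternative Barry alludes to is for each rank to set only the rows it owns, obtained via MatGetOwnershipRange(A, &rstart, &rend); then nothing enters the stash and no intermediate flushes are needed.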
> The data is brought in row by row during the population process. All buffer memory is cleared before the call to MatAssemblyBegin.
>
> The error dump contains:
>
> mpirun -prefix [%g] -np 256 Peter.x
> [0] [0]PETSC ERROR: --------------------- Error Message ------------------------------------
> [0] [0]PETSC ERROR: Out of memory. This could be due to allocating
> [0] [0]PETSC ERROR: too large an object or bleeding by not properly
> [0] [0]PETSC ERROR: destroying unneeded objects.
> [0] [0]PETSC ERROR: Memory allocated 1372407920 Memory used by process -122585088
> [0] [0]PETSC ERROR: Try running with -malloc_dump or -malloc_log for info.
> [0] [0]PETSC ERROR: Memory requested 18446744071829395456!
> [0] [0]PETSC ERROR: ------------------------------------------------------------------------
> [0] [0]PETSC ERROR: Petsc Release Version 3.1.0, Patch 6, Tue Nov 16 17:02:32 CST 2010
> [0] [0]PETSC ERROR: See docs/changes/index.html for recent updates.
> [0] [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
> [0] [0]PETSC ERROR: See docs/index.html for manual pages.
> [0] [0]PETSC ERROR: ------------------------------------------------------------------------
> [0] [0]PETSC ERROR: Peter.x on a linux-int named hawk-6 by praeth Mon Jan 24 15:44:28 2011
> [0] [0]PETSC ERROR: Libraries linked from /default/praeth/MATH/petsc-3.1-p6/linux-intel-g/lib
> [0] [0]PETSC ERROR: Configure run at Tue Dec 21 08:45:25 2010
> [0] [0]PETSC ERROR: Configure options --download-superlu=1 --download-parmetis=1 --download-superlu_dist=1 --with-debugging=1 --with-error-checking=1 -PETSC_ARCH=linux-intel-g --with-fc="ifort -lmpi" --with-cc="icc -lmpi" --with-gnu-compilers=false
> [0] [0]PETSC ERROR: ------------------------------------------------------------------------
> [0] [0]PETSC ERROR: PetscMallocAlign() line 49 in src/sys/memory/mal.c
> [0] [0]PETSC ERROR: PetscTrMallocDefault() line 192 in src/sys/memory/mtr.c
> [0] [0]PETSC ERROR: MatStashScatterBegin_Private() line 510 in src/mat/utils/matstash.c
> [0] [0]PETSC ERROR: MatAssemblyBegin_MPIDense() line 286 in src/mat/impls/dense/mpi/mpidense.c
> [0] [0]PETSC ERROR: MatAssemblyBegin() line 4564 in src/mat/interface/matrix.c
> [0] [0]PETSC ERROR: User provided function() line 195 in "unknowndirectory/"Peter.c
> [-1] MPI: MPI_COMM_WORLD rank 0 has terminated without calling MPI_Finalize()
> [-1] MPI: aborting job
> exit
>
> I had tried the suggestion to employ -malloc_dump or -malloc_log, but do not see any output from the batch run.
>
> Thank you all for any insights you can offer.
>
>
> Best,
>
> Peter.
>