Out of Memory Error.

Ryan Yan vyan2000 at gmail.com
Sat Sep 19 14:12:17 CDT 2009


Hi All,
My application code reads PETSc binary files to obtain the information
about a linear system and then solves it in parallel.
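
For reference, the binary-load-and-solve path is essentially the standard one; the sketch below follows the PETSc 3.0-era calling sequence (the release shown in the log), with "system.dat" as a placeholder file name, and is not the actual application code:

/* Sketch: read a matrix and right-hand side from a PETSc binary file
   and solve in parallel (PETSc 3.0-style API). */
#include "petscksp.h"

int main(int argc, char **argv)
{
  Mat            A;
  Vec            b, x;
  KSP            ksp;
  PetscViewer    viewer;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, PETSC_NULL, PETSC_NULL);CHKERRQ(ierr);

  /* Read the matrix and right-hand side from the binary file. */
  ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD, "system.dat",
                               FILE_MODE_READ, &viewer);CHKERRQ(ierr);
  ierr = MatLoad(viewer, MATMPIAIJ, &A);CHKERRQ(ierr);
  ierr = VecLoad(viewer, VECMPI, &b);CHKERRQ(ierr);
  ierr = PetscViewerDestroy(viewer);CHKERRQ(ierr);

  ierr = VecDuplicate(b, &x);CHKERRQ(ierr);

  /* Solve A x = b, with solver/preconditioner options taken from the
     command line. */
  ierr = KSPCreate(PETSC_COMM_WORLD, &ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp, A, A, SAME_NONZERO_PATTERN);CHKERRQ(ierr);
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);
  ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);

  /* Destroy objects so -malloc_dump does not report them as leaks. */
  ierr = KSPDestroy(ksp);CHKERRQ(ierr);
  ierr = VecDestroy(x);CHKERRQ(ierr);
  ierr = VecDestroy(b);CHKERRQ(ierr);
  ierr = MatDestroy(A);CHKERRQ(ierr);
  ierr = PetscFinalize();CHKERRQ(ierr);
  return 0;
}

In the real run the solver options come from the command line (fieldsplit with FGMRES, as the log below shows).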

The code works well for medium-size problems. Now I am testing the largest
case requested by our customer on *one* processor, and I got the following errors.

It looks like the error happened when PETSc requested a malloc of
size "[0]PETSC ERROR: Memory requested 44784088!", but I did see that there are
PETSc routines that use even more memory than "44784088",
for instance "[0] 46 5520000 ISGetIndices_Stride()". So can I conclude that the error
is caused by a hardware memory limitation?
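
If it helps, I can also query PETSc's own memory counters right before the solve, along the lines of the sketch below (PetscMemoryGetCurrentUsage() and PetscMallocGetCurrentUsage() are the standard calls for this, assuming the installed release provides both; the placement is only illustrative):

/* Sketch: report memory in use just before KSPSolve(). */
PetscLogDouble rss, malloced;
ierr = PetscMemoryGetCurrentUsage(&rss);CHKERRQ(ierr);      /* resident set size   */
ierr = PetscMallocGetCurrentUsage(&malloced);CHKERRQ(ierr); /* PetscMalloc()ed bytes */
ierr = PetscPrintf(PETSC_COMM_WORLD,
                   "before solve: process %g bytes, PetscMalloc %g bytes\n",
                   rss, malloced);CHKERRQ(ierr);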


The code was running on an MPIS machine with 6 CPUs on each node.
The code broke for 1 node with 1 process,
               for 1 node with 2 processes,
               for 1 node with 6 processes.

But the code succeeded for 2 nodes with 2 processes
                    and for 2 nodes with 4 processes.
The code also succeeded whenever more than 2 nodes were used.

Is this another indicator of the hardware limitation?

Thanks a lot,

Yan


$ srun -p sci-comp -N 1 -n 1 ./rpisolve_25_field -ksp_monitor_true_residual
-log_summary  -malloc_dump -malloc_log >& out.rpisolve.N1.n1
$ cat out.rpisolve.N1.n1





breakpoint 1
breakpoint 2
breakpoint 750000
[0]PETSC ERROR: --------------------- Error Message
------------------------------------
[0]PETSC ERROR: Out of memory. This could be due to allocating
[0]PETSC ERROR: too large an object or bleeding by not properly
[0]PETSC ERROR: destroying unneeded objects.
[0] Maximum memory PetscMalloc()ed 3172769832 maximum size of entire process
0
[0] Memory usage sorted by function
[0] 2 3216 ClassPerfLogCreate()
[0] 2 1616 ClassRegLogCreate()
[0] 2 6416 EventPerfLogCreate()
[0] 1 12800 EventPerfLogEnsureSize()
[0] 2 1616 EventRegLogCreate()
[0] 1 3200 EventRegLogRegister()
[0] 92 11960 ISCreateBlock()
[0] 292 36792 ISCreateStride()
[0] 46 5520000 ISGetIndices_Stride()
[0] 78 21632 KSPCreate()
[0] 1 200 KSPCreate_FGMRES()
[0] 26 416 KSPDefaultConvergedCreate()
[0] 6 17600 KSPSetUp_FGMRES()
[0] 475 180880 MatCreate()
[0] 24 3648 MatCreate_MPIAIJ()
[0] 71 22152 MatCreate_SeqAIJ()
[0] 1 1504 MatGetRow_MPIAIJ()
[0] 23 368 MatGetSubMatrices_MPIAIJ()
[0] 690 140770488 MatGetSubMatrices_MPIAIJ_Local()
[0] 22 5280176 MatGetSubMatrix_MPIAIJ()
[0] 7 1497800024 MatLoad_MPIAIJ()
[0] 68 13920000 MatMarkDiagonal_SeqAIJ()
[0] 138 1236969200 MatSeqAIJSetPreallocation_SeqAIJ()
[0] 23 184 MatSetUpMultiply_MPIAIJ()
[0] 24 192 MatStashCreate_Private()
[0] 138 1288 MatStashScatterBegin_Private()
[0] 23 184 Mat_CheckCompressedRow()
[0] 45 8280360 Mat_CheckInode()
[0] 78 14768 PCCreate()
[0] 1 120 PCCreate_FieldSplit()
[0] 2 208 PCFieldSplitSetDefaults()
[0] 50 2400 PCFieldSplitSetFields_FieldSplit()
[0] 1 104 PCSetFromOptions_FieldSplit()
[0] 1 200 PCSetUp_FieldSplit()
[0] 3 24 PetscCommDuplicate()
[0] 1768 84864 PetscFListAdd()
[0] 46 368 PetscGatherNumberOfMessages()
[0] 237 1896 PetscMapSetUp()
[0] 4 32 PetscMaxSum()
[0] 22 5984 PetscOListAdd()
[0] 75 4800 PetscOptionsCreate_Private()
[0] 4 96 PetscOptionsGetEList()
[0] 6 384000 PetscOptionsInsertFile()
[0] 75 600 PetscOptionsInt()
[0] 92 736 PetscPostIrecvInt()
[0] 46 368 PetscPostIrecvScalar()
[0] 0 32 PetscPushSignalHandler()
[0] 4570 130832 PetscStrallocpy()
[0] 69 16924048 PetscTableCreate()
[0] 1 16 PetscViewerASCIIMonitorCreate()
[0] 1 16 PetscViewerASCIIOpen()
[0] 12 1952 PetscViewerCreate()
[0] 1 56 PetscViewerCreate_ASCII()
[0] 3 192 PetscViewerCreate_Binary()
[0] 2 528 StackCreate()
[0] 2 1008 StageLogCreate()
[0] 2 16 VecAssemblyBegin_MPI()
[0] 236 74104 VecCreate()
[0] 49 78003168 VecCreate_MPI_Private()
[0] 23 552 VecCreate_Seq_Private()
[0] 2 80 VecDuplicateVecs_Default()
[0] 92 11224 VecScatterCreate()
[0] 72 576 VecStashCreate_Private()
[0] 28 1056 VecStashScatterBegin_Private()
[0]PETSC ERROR: Memory requested 44784088!
[0]PETSC ERROR:
------------------------------------------------------------------------
[0]PETSC ERROR: Petsc Release Version 3.0.0, Patch 5, Mon Apr 13 09:15:37
CDT 2009
[0]PETSC ERROR: See docs/changes/index.html for recent updates.
[0]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
[0]PETSC ERROR: See docs/index.html for manual pages.
[0]PETSC ERROR:
------------------------------------------------------------------------
[0]PETSC ERROR:
/tmp/lustre/home/yy2250/local/PETSc/petsc-3.0.0-p5/src/ksp/ksp/examples/tutorials/ttt_5fld/./rpisolve_25_field
on a O-hypre-n named sci-m0n0.scsystem by yy2250 Sat Sep 19 14:37:43 2009
[0]PETSC ERROR: Libraries linked from
/home/yy2250/local/PETSc/petsc-test-3-p5/O-hypre-nodebug/lib
[0]PETSC ERROR: Configure run at Tue Jul 21 15:19:41 2009
[0]PETSC ERROR: Configure options --with-cc=mpicc --with-fc=mpif77
--with-mpiexec=srun --with-debugging=0 --with-fortran-kernels=generic
--with-shared=0
[0]PETSC ERROR:
------------------------------------------------------------------------
[0]PETSC ERROR: PetscMallocAlign() line 61 in src/sys/memory/mal.c
[0]PETSC ERROR: PetscTrMallocDefault() line 194 in src/sys/memory/mtr.c
[0]PETSC ERROR: MatSeqAIJSetPreallocation_SeqAIJ() line 2986 in
src/mat/impls/aij/seq/aij.c
[0]PETSC ERROR: MatSeqAIJSetPreallocation() line 2928 in
src/mat/impls/aij/seq/aij.c
[0]PETSC ERROR: MatGetSubMatrices_MPIAIJ_Local() line 1267 in
src/mat/impls/aij/mpi/mpiov.c
[0]PETSC ERROR: MatGetSubMatrices_MPIAIJ() line 787 in
src/mat/impls/aij/mpi/mpiov.c
[0]PETSC ERROR: MatGetSubMatrices() line 5524 in src/mat/interface/matrix.c
[0]PETSC ERROR: MatGetSubMatrix_MPIAIJ() line 3069 in
src/mat/impls/aij/mpi/mpiaij.c
[0]PETSC ERROR: MatGetSubMatrix() line 6212 in src/mat/interface/matrix.c
[0]PETSC ERROR: PCSetUp_FieldSplit() line 285 in
src/ksp/pc/impls/fieldsplit/fieldsplit.c
[0]PETSC ERROR: PCSetUp() line 794 in src/ksp/pc/interface/precon.c
[0]PETSC ERROR: KSPSetUp() line 237 in src/ksp/ksp/interface/itfunc.c
[0]PETSC ERROR: KSPSolve() line 353 in src/ksp/ksp/interface/itfunc.c
[0]PETSC ERROR: main() line 246 in
src/ksp/ksp/examples/tutorials/rpisolve_25_field.c
application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0
In: PMI_Abort(1, application called MPI_Abort(MPI_COMM_WORLD, 1) - process
0)
srun: error: task 0: Exited with exit code 1

