Out of Memory Error.

Matthew Knepley knepley at gmail.com
Sat Sep 19 14:24:03 CDT 2009


On Sat, Sep 19, 2009 at 2:12 PM, Ryan Yan <vyan2000 at gmail.com> wrote:

> Hi All,
> My application code reads PETSc binary files to obtain the information
> about a linear system and then solves it in parallel.
>
> The code works well for medium-size problems. Now I am testing the largest
> case requested by our customer on *one* processor, and I got the following errors.
>
> It looks like the error happened when PETSc requested a malloc of size
> "[0]PETSC ERROR: Memory requested 44784088!", but I can see that there are
> PETSc routines that used even more memory than 44784088, for instance
> "[0] 46 5520000 ISGetIndices_Stride()". So can I guess that the error
> is caused by a hardware memory limitation?
>

You are running out of memory. The failing request of about 44 MB is not large
by itself; the log shows the process had already PetscMalloc()ed about 3.2 GB,
so the per-process memory is exhausted and any further request can fail.
If you want to run bigger problems, you will have to use more nodes. A rough
estimate of where the memory goes is sketched after the quoted log below.
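
In case it is useful, here is a minimal sketch of how one can print per-process
memory between stages (loading the matrix, KSPSetUp(), etc.) to see where the
memory goes. This is an illustration added here, not code from the thread, and
it is written against the current PETSc documentation (PetscCall(),
PetscMallocGetCurrentUsage(), ...), so the names may differ slightly in
petsc-3.0.0:

/* report_memory.c: print per-process memory usage at a given stage.
 * Written against current PETSc; names may differ in petsc-3.0.0. */
#include <petscsys.h>

static PetscErrorCode ReportMemory(MPI_Comm comm, const char *label)
{
  PetscLogDouble rss, malloced;

  PetscFunctionBeginUser;
  PetscCall(PetscMemoryGetCurrentUsage(&rss));       /* resident set size, in bytes */
  PetscCall(PetscMallocGetCurrentUsage(&malloced));  /* bytes obtained via PetscMalloc() */
  PetscCall(PetscSynchronizedPrintf(comm, "[%s] rss %.0f bytes, PetscMalloc'ed %.0f bytes\n",
                                    label, rss, malloced));
  PetscCall(PetscSynchronizedFlush(comm, PETSC_STDOUT));
  PetscFunctionReturn(PETSC_SUCCESS);
}

int main(int argc, char **argv)
{
  PetscFunctionBeginUser;
  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
  PetscCall(ReportMemory(PETSC_COMM_WORLD, "after PetscInitialize"));
  /* ... MatLoad(), KSPSetUp(), etc., with ReportMemory() calls in between ... */
  PetscCall(PetscFinalize());
  return 0;
}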

  Matt


> The code was running on an MPIS machine with 6 CPUs on each node.
> The code broke for 1 node with 1 process,
>                    for 1 node with 2 processes, and
>                    for 1 node with 6 processes.
>
> But the code succeeded for 2 nodes with 2 processes and
>                         for 2 nodes with 4 processes.
> The code also succeeds whenever more than 2 nodes are used.
>
> Is this another indicator of a hardware memory limitation?
>
> Thanks a lot,
>
> Yan
>
>
> $ srun -p sci-comp -N 1 -n 1 ./rpisolve_25_field -ksp_monitor_true_residual
> -log_summary  -malloc_dump -malloc_log >& out.rpisolve.N1.n1
> $ cat out.rpisolve.N1.n1
>
>
> breakpoint 1
> breakpoint 2
> breakpoint 750000
> [0]PETSC ERROR: --------------------- Error Message
> ------------------------------------
> [0]PETSC ERROR: Out of memory. This could be due to allocating
> [0]PETSC ERROR: too large an object or bleeding by not properly
> [0]PETSC ERROR: destroying unneeded objects.
> [0] Maximum memory PetscMalloc()ed 3172769832 maximum size of entire
> process 0
> [0] Memory usage sorted by function
> [0] 2 3216 ClassPerfLogCreate()
> [0] 2 1616 ClassRegLogCreate()
> [0] 2 6416 EventPerfLogCreate()
> [0] 1 12800 EventPerfLogEnsureSize()
> [0] 2 1616 EventRegLogCreate()
> [0] 1 3200 EventRegLogRegister()
> [0] 92 11960 ISCreateBlock()
> [0] 292 36792 ISCreateStride()
> [0] 46 5520000 ISGetIndices_Stride()
> [0] 78 21632 KSPCreate()
> [0] 1 200 KSPCreate_FGMRES()
> [0] 26 416 KSPDefaultConvergedCreate()
> [0] 6 17600 KSPSetUp_FGMRES()
> [0] 475 180880 MatCreate()
> [0] 24 3648 MatCreate_MPIAIJ()
> [0] 71 22152 MatCreate_SeqAIJ()
> [0] 1 1504 MatGetRow_MPIAIJ()
> [0] 23 368 MatGetSubMatrices_MPIAIJ()
> [0] 690 140770488 MatGetSubMatrices_MPIAIJ_Local()
> [0] 22 5280176 MatGetSubMatrix_MPIAIJ()
> [0] 7 1497800024 MatLoad_MPIAIJ()
> [0] 68 13920000 MatMarkDiagonal_SeqAIJ()
> [0] 138 1236969200 MatSeqAIJSetPreallocation_SeqAIJ()
> [0] 23 184 MatSetUpMultiply_MPIAIJ()
> [0] 24 192 MatStashCreate_Private()
> [0] 138 1288 MatStashScatterBegin_Private()
> [0] 23 184 Mat_CheckCompressedRow()
> [0] 45 8280360 Mat_CheckInode()
> [0] 78 14768 PCCreate()
> [0] 1 120 PCCreate_FieldSplit()
> [0] 2 208 PCFieldSplitSetDefaults()
> [0] 50 2400 PCFieldSplitSetFields_FieldSplit()
> [0] 1 104 PCSetFromOptions_FieldSplit()
> [0] 1 200 PCSetUp_FieldSplit()
> [0] 3 24 PetscCommDuplicate()
> [0] 1768 84864 PetscFListAdd()
> [0] 46 368 PetscGatherNumberOfMessages()
> [0] 237 1896 PetscMapSetUp()
> [0] 4 32 PetscMaxSum()
> [0] 22 5984 PetscOListAdd()
> [0] 75 4800 PetscOptionsCreate_Private()
> [0] 4 96 PetscOptionsGetEList()
> [0] 6 384000 PetscOptionsInsertFile()
> [0] 75 600 PetscOptionsInt()
> [0] 92 736 PetscPostIrecvInt()
> [0] 46 368 PetscPostIrecvScalar()
> [0] 0 32 PetscPushSignalHandler()
> [0] 4570 130832 PetscStrallocpy()
> [0] 69 16924048 PetscTableCreate()
> [0] 1 16 PetscViewerASCIIMonitorCreate()
> [0] 1 16 PetscViewerASCIIOpen()
> [0] 12 1952 PetscViewerCreate()
> [0] 1 56 PetscViewerCreate_ASCII()
> [0] 3 192 PetscViewerCreate_Binary()
> [0] 2 528 StackCreate()
> [0] 2 1008 StageLogCreate()
> [0] 2 16 VecAssemblyBegin_MPI()
> [0] 236 74104 VecCreate()
> [0] 49 78003168 VecCreate_MPI_Private()
> [0] 23 552 VecCreate_Seq_Private()
> [0] 2 80 VecDuplicateVecs_Default()
> [0] 92 11224 VecScatterCreate()
> [0] 72 576 VecStashCreate_Private()
> [0] 28 1056 VecStashScatterBegin_Private()
> [0]PETSC ERROR: Memory requested 44784088!
> [0]PETSC ERROR:
> ------------------------------------------------------------------------
> [0]PETSC ERROR: Petsc Release Version 3.0.0, Patch 5, Mon Apr 13 09:15:37
> CDT 2009
> [0]PETSC ERROR: See docs/changes/index.html for recent updates.
> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
> [0]PETSC ERROR: See docs/index.html for manual pages.
> [0]PETSC ERROR:
> ------------------------------------------------------------------------
> [0]PETSC ERROR:
> /tmp/lustre/home/yy2250/local/PETSc/petsc-3.0.0-p5/src/ksp/ksp/examples/tutorials/ttt_5fld/./rpisolve_25_field
> on a O-hypre-n named sci-m0n0.scsystem by yy2250 Sat Sep 19 14:37:43 2009
> [0]PETSC ERROR: Libraries linked from
> /home/yy2250/local/PETSc/petsc-test-3-p5/O-hypre-nodebug/lib
> [0]PETSC ERROR: Configure run at Tue Jul 21 15:19:41 2009
> [0]PETSC ERROR: Configure options --with-cc=mpicc --with-fc=mpif77
> --with-mpiexec=srun --with-debugging=0 --with-fortran-kernels=generic
> --with-shared=0
> [0]PETSC ERROR:
> ------------------------------------------------------------------------
> [0]PETSC ERROR: PetscMallocAlign() line 61 in src/sys/memory/mal.c
> [0]PETSC ERROR: PetscTrMallocDefault() line 194 in src/sys/memory/mtr.c
> [0]PETSC ERROR: MatSeqAIJSetPreallocation_SeqAIJ() line 2986 in
> src/mat/impls/aij/seq/aij.c
> [0]PETSC ERROR: MatSeqAIJSetPreallocation() line 2928 in
> src/mat/impls/aij/seq/aij.c
> [0]PETSC ERROR: MatGetSubMatrices_MPIAIJ_Local() line 1267 in
> src/mat/impls/aij/mpi/mpiov.c
> [0]PETSC ERROR: MatGetSubMatrices_MPIAIJ() line 787 in
> src/mat/impls/aij/mpi/mpiov.c
> [0]PETSC ERROR: MatGetSubMatrices() line 5524 in src/mat/interface/matrix.c
> [0]PETSC ERROR: MatGetSubMatrix_MPIAIJ() line 3069 in
> src/mat/impls/aij/mpi/mpiaij.c
> [0]PETSC ERROR: MatGetSubMatrix() line 6212 in src/mat/interface/matrix.c
> [0]PETSC ERROR: PCSetUp_FieldSplit() line 285 in
> src/ksp/pc/impls/fieldsplit/fieldsplit.c
> [0]PETSC ERROR: PCSetUp() line 794 in src/ksp/pc/interface/precon.c
> [0]PETSC ERROR: KSPSetUp() line 237 in src/ksp/ksp/interface/itfunc.c
> [0]PETSC ERROR: KSPSolve() line 353 in src/ksp/ksp/interface/itfunc.c
> [0]PETSC ERROR: main() line 246 in
> src/ksp/ksp/examples/tutorials/rpisolve_25_field.c
> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0
> In: PMI_Abort(1, application called MPI_Abort(MPI_COMM_WORLD, 1) - process
> 0)
> srun: error: task 0: Exited with exit code 1
>
>
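
For what it is worth, here is a back-of-the-envelope estimate (added for
illustration; the row and nonzero counts below are hypothetical, chosen only so
that one copy of the matrix comes out near the ~1.5 GB that MatLoad_MPIAIJ()
reports above). A SeqAIJ matrix stores roughly one PetscScalar (8 bytes) and
one PetscInt (4 bytes) per nonzero plus one PetscInt per row, and the traceback
shows PCSetUp_FieldSplit() extracting the field blocks with MatGetSubMatrix(),
which builds them as separate AIJ matrices, so a single process ends up holding
close to a second copy of the matrix. That is consistent with the ~1.2 GB
charged to MatSeqAIJSetPreallocation_SeqAIJ() in the log, and with the job
fitting once it is spread over two or more nodes.

/* aij_estimate.c: rough SeqAIJ storage estimate (illustration only).
 * The nrows and nnz values are hypothetical, not taken from the thread. */
#include <stdio.h>

int main(void)
{
  long long nrows = 750000;      /* hypothetical global row count */
  long long nnz   = 125000000;   /* hypothetical number of nonzeros */

  /* values (8 bytes) + column indices (4 bytes) per nonzero, plus row offsets */
  long long bytes = nnz * (8 + 4) + (nrows + 1) * 4;

  printf("one AIJ copy: about %.2f GB\n", bytes / 1e9);
  printf("with the fieldsplit submatrices: about %.2f GB\n", 2.0 * bytes / 1e9);
  return 0;
}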


-- 
What most experimenters take for granted before they begin their experiments
is infinitely more interesting than any results to which their experiments
lead.
-- Norbert Wiener