[petsc-users] SNESComputeJacobianDefaultColor use too much memory

Rongliang Chen rongliang.chan at gmail.com
Thu Feb 18 20:07:45 CST 2016


Hi Matt,

Thanks for your reply. The job was killed in the function 
DMPlexInterpolateFaces_Internal, and the administrator told me that the 
reason was out of memory. The error messages follow:

---------------------------
[0]PETSC ERROR: 
------------------------------------------------------------------------
[0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, 
probably memory access out of range
[0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
[0]PETSC ERROR: or see 
http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
[0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to 
find memory corruption errors
[0]PETSC ERROR: likely location of problem given in stack below
[0]PETSC ERROR: ---------------------  Stack Frames 
------------------------------------
[0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available,
[0]PETSC ERROR:       INSTEAD the line number of the start of the function
[0]PETSC ERROR:       is given.
[0]PETSC ERROR: [0] PetscHashIJKLPut line 1261 
/home/rlchen/soft/petsc-3.5.2/include/../src/sys/utils/hash.h
[0]PETSC ERROR: [0] DMPlexInterpolateFaces_Internal line 154 
/home/rlchen/soft/petsc-3.5.2/src/dm/impls/plex/plexinterpolate.c
[0]PETSC ERROR: [0] DMPlexInterpolate line 333 
/home/rlchen/soft/petsc-3.5.2/src/dm/impls/plex/plexinterpolate.c
[0]PETSC ERROR: [0] DMPlexCreateExodusNwtun line 98 
/home/rlchen/soft/3D_fluid/FiniteVolumeMethod/PETScCodes/codefor3.5/src/plexexodusii.c
[0]PETSC ERROR: [0] DMPlexCreateExodusFromFileNwtun line 44 
/home/rlchen/soft/3D_fluid/FiniteVolumeMethod/PETScCodes/codefor3.5/src/plexexodusii.c
[0]PETSC ERROR: [0] CreateMesh line 19 
/home/rlchen/soft/3D_fluid/FiniteVolumeMethod/PETScCodes/codefor3.5/SetupMeshes.c
[0]PETSC ERROR: --------------------- Error Message 
--------------------------------------------------------------
[0]PETSC ERROR: Signal received
[0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html 
for trouble shooting.
[0]PETSC ERROR: Petsc Release Version 3.5.2, unknown
[0]PETSC ERROR: ./Nwtun on a 64bit-debug named leszek-ThinkStation-C20 
by rlchen Thu Feb 18 23:07:32 2016
[0]PETSC ERROR: Configure options --download-fblaslapack 
--download-blacs --download-scalapack --download-metis 
--download-parmetis --download-exodusii --download-netcdf 
--download-hdf5 --with-64-bit-indices --with-c2html=0 --with-mpi=1 
--with-debugging=1 --with-shared-libraries=0
[0]PETSC ERROR: #1 User provided function() line 0 in  unknown file
application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0

=====================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   EXIT CODE: 15104
=   CLEANING UP REMAINING PROCESSES
=   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
=====================================================================================
----------------------------------
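
For reference, a minimal sketch of the call path that ends up in 
DMPlexInterpolateFaces_Internal (the Nwtun routines in the stack above 
are custom variants of the standard Exodus reader, but they reach 
DMPlexInterpolate the same way; treat the function below as an 
illustrative assumption, not a copy of my code):

    #include <petscdmplex.h>

    PetscErrorCode CreateMeshSketch(MPI_Comm comm, const char *filename, DM *dm)
    {
      PetscErrorCode ierr;

      PetscFunctionBegin;
      /* interpolate = PETSC_TRUE asks Plex to build the intermediate
         (face/edge) mesh entities; the hash table filled by
         PetscHashIJKLPut during this step is what grows with the mesh */
      ierr = DMPlexCreateExodusFromFile(comm, filename, PETSC_TRUE, dm);CHKERRQ(ierr);
      PetscFunctionReturn(0);
    }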

Best,
Rongliang

On 02/19/2016 09:58 AM, Matthew Knepley wrote:
> On Thu, Feb 18, 2016 at 7:37 PM, Rongliang Chen 
> <rongliang.chan at gmail.com> wrote:
>
>     Hi Barry,
>
>     When I increase the size of the mesh, another memory issue comes
>     up: DMPlexInterpolateFaces_Internal takes too much memory. The
>     massif output is attached. Any suggestions for this? Thanks.
>
>
> It is creating parts of the mesh. I do not believe it allocates much 
> temp storage. Is that what you see?
>
>   Matt
>
>     Best,
>     Rongliang
>
>     On 02/18/2016 04:11 AM, Barry Smith wrote:
>
>            Hmm, Matt, this is nuts. This
>
>           if (size > 1) {
>              /* create a sequential iscoloring on all processors */
>              ierr = MatGetSeqNonzeroStructure(mat,&mat_seq);CHKERRQ(ierr);
>            }
>
>            It sequentializes the graph.
>
>             It looks like the only parallel coloring for Jacobians is
>         MATCOLORINGGREEDY? Try that.
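>
>             From the command line that is -mat_coloring_type greedy; in
>             code, a minimal sketch (assuming J is the preallocated
>             Jacobian matrix; SNESComputeJacobianDefaultColor builds its
>             coloring through the same interface internally):
>
>                 MatColoring mc;
>                 ISColoring  iscoloring;
>                 ierr = MatColoringCreate(J,&mc);CHKERRQ(ierr);
>                 ierr = MatColoringSetType(mc,MATCOLORINGGREEDY);CHKERRQ(ierr);
>                 ierr = MatColoringSetDistance(mc,2);CHKERRQ(ierr);  /* Jacobians need distance-2 coloring */
>                 ierr = MatColoringSetFromOptions(mc);CHKERRQ(ierr); /* honor -mat_coloring_type etc. */
>                 ierr = MatColoringApply(mc,&iscoloring);CHKERRQ(ierr);
>                 ierr = MatColoringDestroy(&mc);CHKERRQ(ierr);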
>
>             Maybe that should be the default?
>
>             Barry
>
>
>             On Feb 17, 2016, at 1:49 AM, Rongliang Chen
>             <rongliang.chan at gmail.com> wrote:
>
>             Dear Barry,
>
>             The massif output for the large case is attached. It shows
>             that the job was killed in the function
>             MatGetSubMatrix_MPIAIJ_All(). Any suggestions?
>
>             --------------------
>             [8]PETSC ERROR: #1 MatGetSubMatrix_MPIAIJ_All() line 622
>             in /home/rlchen/soft/petsc-3.5.2/src/mat/impls/aij/mpi/mpiov.c
>             [8]PETSC ERROR: #2 MatGetSeqNonzeroStructure_MPIAIJ() line
>             3010 in
>             /home/rlchen/soft/petsc-3.5.2/src/mat/impls/aij/mpi/mpiaij.c
>             [8]PETSC ERROR: #3 MatGetSeqNonzeroStructure() line 6487
>             in /home/rlchen/soft/petsc-3.5.2/src/mat/interface/matrix.c
>             [8]PETSC ERROR: #4 MatColoringApply_SL() line 78 in
>             /home/rlchen/soft/petsc-3.5.2/src/mat/color/impls/minpack/color.c
>             [8]PETSC ERROR: #5 MatColoringApply() line 379 in
>             /home/rlchen/soft/petsc-3.5.2/src/mat/color/interface/matcoloring.c
>             [8]PETSC ERROR: #6 SNESComputeJacobianDefaultColor() line
>             71 in
>             /home/rlchen/soft/petsc-3.5.2/src/snes/interface/snesj2.c
>             [8]PETSC ERROR: #7 FormJacobian() line 58 in
>             /home/rlchen/soft/3D_fluid/FiniteVolumeMethod/PETScCodes/codefor3.5/SetupJacobian.c
>             [8]PETSC ERROR: #8 SNESComputeJacobian() line 2193 in
>             /home/rlchen/soft/petsc-3.5.2/src/snes/interface/snes.c
>             [8]PETSC ERROR: #9 SNESSolve_NEWTONLS() line 230 in
>             /home/rlchen/soft/petsc-3.5.2/src/snes/impls/ls/ls.c
>             [8]PETSC ERROR: #10 SNESSolve() line 3743 in
>             /home/rlchen/soft/petsc-3.5.2/src/snes/interface/snes.c
>             [8]PETSC ERROR: #11 SolveTimeDependent() line 758 in
>             /home/rlchen/soft/3D_fluid/FiniteVolumeMethod/PETScCodes/codefor3.5/Nwtun.c
>             [8]PETSC ERROR: #12 main() line 417 in
>             /home/rlchen/soft/3D_fluid/FiniteVolumeMethod/PETScCodes/codefor3.5/Nwtun.c
>             -------------------
>
>             Best,
>             Rongliang
>
>             On 02/17/2016 02:09 PM, Barry Smith wrote:
>
>                    Yes, this is the type of output I was expecting.
>                 Now you need to produce it for a large case.
>
>                    Barry
>
>                     On Feb 16, 2016, at 11:49 PM, Rongliang Chen
>                     <rongliang.chan at gmail.com> wrote:
>
>                     Hi Barry,
>
>                     I ran the code with valgrind on a workstation for
>                     a smaller case and it produced some ASCII
>                     information (see attached). Is this helpful?
>
>                     Best,
>                     Rongliang
>
>                     On 02/17/2016 01:30 PM, Barry Smith wrote:
>
>                            Hmm, something didn't work right with
>                         massif. It should give a bunch of ASCII
>                         information about how much memory is used at
>                         different times in the code. You may need to
>                         play around with the massif options and google
>                         the documentation on how to get it to provide
>                         useful information; once it does, it will be
>                         very helpful.
>
>                         time=0
>                         mem_heap_B=0
>                         mem_heap_extra_B=0
>                         mem_stacks_B=0
>                         heap_tree=empty
>
>
>                             On Feb 16, 2016, at 9:44 PM, Rongliang
>                             Chen <rongliang.chan at gmail.com> wrote:
>
>                             Dear Barry,
>
>                             Many thanks for your reply.
>
>                             I ran the code under valgrind and did not
>                             obtain any output (massif.out.<pid>)
>                             because the job was killed before it
>                             reached the end.
>
>                             Then I switched to a smaller case, which
>                             works well, and one of the outputs is
>                             attached (I did not find any useful
>                             information in it). The output with the
>                             option -mat_coloring_view follows; it
>                             shows that the number of colors is 65.
>                             Any ideas about this?
>
>                             MatColoring Object: 480 MPI processes
>                                type: sl
>                                Weight type: RANDOM
>                                Distance 2, Max. Colors 65535
>                                Number of colors 65
>                                Number of total columns 1637350
>
>                             Best regards,
>                             Rongliang
>
>                             On 02/17/2016 01:13 AM, Barry Smith wrote:
>
>                                    How many colors are needed?
>
>                                    You need to produce a breakdown of
>                                 where all the memory is being used.
>                                 For example valgrind with the
>                                 http://valgrind.org/docs/manual/ms-manual.html
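>
>                                    A minimal sketch of a massif run
>                                 (assuming the usual mpiexec launcher;
>                                 each MPI rank writes its own
>                                 massif.out.<pid> file, which ms_print
>                                 turns into the ASCII breakdown):
>
>                                     mpiexec -n <nproc> valgrind --tool=massif ./Nwtun <options>
>                                     ms_print massif.out.<pid>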
>
>
>                                    Barry
>
>
>                                     On Feb 16, 2016, at 6:45 AM,
>                                     Rongliang Chen
>                                     <rongliang.chan at gmail.com> wrote:
>
>                                     Dear all,
>
>                                     I am using DMPlex to solve a PDE
>                                     on an unstructured mesh, and I use
>                                     SNESComputeJacobianDefaultColor to
>                                     compute the Jacobian matrix.
>
>                                     My code works well for small
>                                     problems (such as a problem with
>                                     3.3x10^5 cells using 120 cores),
>                                     but when I increase the number of
>                                     cells (2.6x10^6 cells using 1920
>                                     cores), it runs out of memory in
>                                     the function MatColoringApply. One
>                                     of the cores uses over 11 GB of
>                                     memory, which seems unreasonable.
>                                     Do you have any suggestions for
>                                     debugging this problem?
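>
>                                     For reference, a minimal sketch of
>                                     the usual way this is hooked up (an
>                                     assumption, not my exact code;
>                                     snes is the SNES and J is the
>                                     preallocated Jacobian matrix; with
>                                     a NULL context, PETSc builds the
>                                     coloring and the MatFDColoring
>                                     internally on first use):
>
>                                         ierr = SNESSetJacobian(snes,J,J,SNESComputeJacobianDefaultColor,NULL);CHKERRQ(ierr);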
>
>                                     Best regards,
>                                     Rongliang
>
>                             <massif.out.12562>
>
>                     <massif.out.26358>
>
>             <massif.out.26539>
>
>
>
>
>
> -- 
> What most experimenters take for granted before they begin their 
> experiments is infinitely more interesting than any results to which 
> their experiments lead.
> -- Norbert Wiener
