[petsc-users] Debug AOCreateBasic

Jed Brown jedbrown at mcs.anl.gov
Fri Nov 1 00:35:25 CDT 2013


Rongliang Chen <rongliang.chan at gmail.com> writes:

> Hi there,
>
> My code died in AOCreateBasic; the error messages follow below.
> Do you have any suggestions for debugging this?

1. Make sure your code is valgrind-clean for small sizes (to provide
more evidence that it is getting the right answer for the right reason).

2. Try MPICH instead of Open MPI to see if the error occurs in the same place.

3. Set up your system to dump core on selected ranks.
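
As a rough illustration of suggestion 3, one possible approach (an
assumption, not something prescribed here) is to adjust the core-file
resource limit per process early in the program, using the POSIX
getrlimit/setrlimit calls. The helper names rank_is_selected and
set_core_limit_for_rank are hypothetical; the rank would normally come
from MPI_Comm_rank, but it is passed in as a plain int so the sketch
compiles without MPI. Whether a core file is actually written also
depends on the batch system and kernel settings, which this does not
control.

```c
#include <stdio.h>
#include <sys/resource.h>

/* Hypothetical helper: return 1 if `rank` appears in selected[0..n-1], else 0. */
static int rank_is_selected(int rank, const int *selected, int n)
{
  for (int i = 0; i < n; i++) {
    if (selected[i] == rank) return 1;
  }
  return 0;
}

/* Hypothetical helper: on selected ranks, raise the soft core-file limit to
   the hard limit; on all other ranks, set the soft limit to 0 so they do not
   dump core. The hard limit is left unchanged.
   Returns 0 on success, -1 on failure (as getrlimit/setrlimit do). */
static int set_core_limit_for_rank(int rank, const int *selected, int n)
{
  struct rlimit rl;
  if (getrlimit(RLIMIT_CORE, &rl) != 0) return -1;
  rl.rlim_cur = rank_is_selected(rank, selected, n) ? rl.rlim_max : 0;
  return setrlimit(RLIMIT_CORE, &rl);
}
```

In an MPI code this would be called once per process right after
MPI_Init, with the rank from MPI_Comm_rank and a short list such as
{0, 1}, so that only those ranks can leave core files behind when the
job is killed.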

> Notes:
> 1. This case has an unstructured mesh of size about 30,000,000 and uses 96
> processors (I also tried 1024 processors and got the same problem).
> 2. My code works well for a smaller case (an unstructured mesh of size
> about 25,000,000).
> 3. I also checked the memory usage for this case and it is very small
> because the solution stage has not started yet.
>
> Best,
> Rongliang
>
> --------------------------------------------------------------------------
> MPI_ABORT was invoked on rank 1 in communicator MPI_COMM_WORLD
> with errorcode 59.
>
> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
> You may or may not see output from other processes, depending on
> exactly when Open MPI kills them.
> --------------------------------------------------------------------------
> [0]PETSC ERROR: 
> ------------------------------------------------------------------------
> [0]PETSC ERROR: Caught signal number 15 Terminate: Somet process (or the 
> batch system) has told this process to end
> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
> [0]PETSC ERROR: or see 
> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC 
> ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to 
> find memory corruption errors
> [0]PETSC ERROR: likely location of problem given in stack below
> [0]PETSC ERROR: ---------------------  Stack Frames 
> ------------------------------------
> [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available,
> [0]PETSC ERROR:       INSTEAD the line number of the start of the function
> [0]PETSC ERROR:       is given.
> [0]PETSC ERROR: [0] AOCreate_Basic line 203 
> src/vec/is/ao/impls/basic/aobasic.c
> [0]PETSC ERROR: [0] AOSetType line 35 src/vec/is/ao/interface/aoreg.c
> [0]PETSC ERROR: [0] AOCreateBasicIS line 380 
> src/vec/is/ao/impls/basic/aobasic.c
> [0]PETSC ERROR: [0] AOCreateBasic line 335 
> src/vec/is/ao/impls/basic/aobasic.c
> [0]PETSC ERROR: [0] DataPartitionVertices_Block line 1634 
> /projects/ronglian/soft/3Dfluid_new/3DWindturbine/WindturbineFor3.4/codes/readbinary3d.c
> [0]PETSC ERROR: [0] ReadBinary line 184 
> /projects/ronglian/soft/3Dfluid_new/3DWindturbine/WindturbineFor3.4/codes/readbinary3d.c
> [0]PETSC ERROR: [0] LoadGrid line 720 
> /projects/ronglian/soft/3Dfluid_new/3DWindturbine/WindturbineFor3.4/codes/loadgrid3d.c
> [0]PETSC ERROR: --------------------- Error Message 
> ------------------------------------
> [0]PETSC ERROR: Signal received!
> [0]PETSC ERROR: 
> ------------------------------------------------------------------------
> [0]PETSC ERROR: Petsc Development GIT revision: 
> ee17fca9fd6ac48e6579ef235144daafbb22b801  GIT Date: 2013-10-23 14:21:20 
> -0500
> [0]PETSC ERROR: See docs/changes/index.html for recent updates.
> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
> [0]PETSC ERROR: See docs/index.html for manual pages.
> [0]PETSC ERROR: 
> ------------------------------------------------------------------------
> [0]PETSC ERROR: ./fsi3d on a Janus-debug-64bit named node0880 by 
> ronglian Thu Oct 31 21:58:20 2013
> [0]PETSC ERROR: Libraries linked from 
> /projects/ronglian/soft/petsc-dev-latest/Janus-debug-64bit/lib
> [0]PETSC ERROR: Configure run at Thu Oct 24 21:24:31 2013
> [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 
> --known-level1-dcache-linesize=64 --known-level1-dcache-assoc=8 
> --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 
> --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 
> --known-sizeof-long-long=8 --known-sizeof-float=4 
> --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 
> --known-sizeof-MPI_Comm=8 --known-sizeof-MPI_Fint=4 
> --known-mpi-long-double=1 --known-mpi-c-double-complex=0 
> --download-blacs=1 --download-f-blas-lapack=1 --download-metis=1 
> --download-parmetis=1 --download-scalapack=1 --download-superlu_dist=1 
> --known-mpi-shared-libraries=0 --with-64-bit-indices --with-batch=1 
> --with-mpi-shared=1 --download-exodusii=1 --download-hdf5=1 
> --download-netcdf=1 --known-64-bit-blas-indices --with-debugging=1 
> COPTFLAGS="-O0 -g"
> [0]PETSC ERROR: 
> ------------------------------------------------------------------------
> [0]PETSC ERROR: User provided function() line 0 in  unknown file
> [1]PETSC ERROR: 
> ------------------------------------------------------------------------
> [1]PETSC ERROR: Caught signal number 15 Terminate: Somet process (or the 
> batch system) has told this process to end
> [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
> [1]PETSC ERROR: or see 
> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[1]PETSC 
> ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to 
> find memory corruption errors
> [1]PETSC ERROR: likely location of problem given in stack below