[petsc-users] parallel IO messages

Barry Smith bsmith at mcs.anl.gov
Fri Nov 27 14:20:18 CST 2015


  Edit PETSC_ARCH/include/petscconf.h and add

#if !defined(PETSC_MISSING_SIGTRAP)
#define PETSC_MISSING_SIGTRAP
#endif

then do

make gnumake

It is possible that the system you are using uses SIGTRAP in managing the IO; by making the change above you are telling PETSc not to catch SIGTRAP, so the signal is left to the system. Let us know how this works out.
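
As a rough illustration of why the define matters, here is a minimal sketch of the guard pattern, assuming PETSc wraps its SIGTRAP handler registration in a check of this macro. This is not PETSc's actual source; install_demo_handlers and demo_handler are hypothetical names used only for this example.

```c
#include <signal.h>

/* Same guard as the one added to petscconf.h above. */
#if !defined(PETSC_MISSING_SIGTRAP)
#define PETSC_MISSING_SIGTRAP
#endif

/* Hypothetical handler: just records which signal arrived. */
static volatile sig_atomic_t caught_signal = 0;

static void demo_handler(int sig)
{
  caught_signal = sig;
}

/* Hypothetical installer: registers handlers and returns how many
   were installed. The SIGTRAP registration is compiled out whenever
   PETSC_MISSING_SIGTRAP is defined, so SIGTRAP keeps whatever
   disposition the system or the IO layer has set up. */
int install_demo_handlers(void)
{
  int installed = 0;

  if (signal(SIGTERM, demo_handler) != SIG_ERR) installed++;

#if !defined(PETSC_MISSING_SIGTRAP) && defined(SIGTRAP)
  if (signal(SIGTRAP, demo_handler) != SIG_ERR) installed++;
#endif

  return installed;
}
```

With the macro defined, only the SIGTERM handler is registered in this sketch; removing the define would let the SIGTRAP branch compile as well.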

   Barry


> On Nov 27, 2015, at 1:05 PM, Fande Kong <fdkong.jd at gmail.com> wrote:
> 
> Hi all,
> 
> I implemented a parallel IO loader, based on Vec and IS, that uses HDF5. I am testing this loader on a supercomputer, and I occasionally (not always) encounter the following errors (using 8192 cores):
> 
> [7689]PETSC ERROR: ------------------------------------------------------------------------
> [7689]PETSC ERROR: Caught signal number 5 TRAP
> [7689]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
> [7689]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
> [7689]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
> [7689]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run 
> [7689]PETSC ERROR: to get more information on the crash.
> [7689]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
> [7689]PETSC ERROR: Signal received
> [7689]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> [7689]PETSC ERROR: Petsc Release Version 3.6.2, unknown 
> [7689]PETSC ERROR: ./fsi on a arch-linux2-cxx-opt named ys6103 by fandek Fri Nov 27 11:26:30 2015
> [7689]PETSC ERROR: Configure options --with-clanguage=cxx --with-shared-libraries=1 --download-fblaslapack=1 --with-mpi=1 --download-parmetis=1 --download-metis=1 --with-netcdf=1 --download-exodusii=1 --with-hdf5-dir=/glade/apps/opt/hdf5-mpi/1.8.12/intel/12.1.5 --with-debugging=no --with-c2html=0 --with-64-bit-indices=1
> [7689]PETSC ERROR: #1 User provided function() line 0 in  unknown file
> Abort(59) on node 7689 (rank 7689 in comm 1140850688): application called MPI_Abort(MPI_COMM_WORLD, 59) - process 7689
> ERROR: 0031-300  Forcing all remote tasks to exit due to exit code 1 in task 7689 
> 
> Make and configure logs are attached.
> 
> Thanks,
> 
> Fande Kong,
> 
> <configure_log><make_log>


