[petsc-users] parallel IO messages
Dave May
dave.mayhem23 at gmail.com
Fri Nov 27 13:08:48 CST 2015
There is little information in this stack trace.
You would get more information if you used a debug build of PETSc,
e.g. one configured with --with-debugging=yes.
It is recommended to always debug problems with a debug build of PETSc and
a debug build of your application.
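
For example, something like the following (just a sketch; keep the rest of
your existing configure options, and use a separate PETSC_ARCH so the
optimized build is not overwritten):

  ./configure PETSC_ARCH=arch-linux2-cxx-debug --with-debugging=yes \
    --with-clanguage=cxx --with-64-bit-indices=1 \
    --with-hdf5-dir=/glade/apps/opt/hdf5-mpi/1.8.12/intel/12.1.5 \
    <rest of your options>

Then rebuild and link your application against that debug PETSC_ARCH and
rerun the failing case.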
Thanks,
Dave
On 27 November 2015 at 20:05, Fande Kong <fdkong.jd at gmail.com> wrote:
> Hi all,
>
> I implemented parallel I/O for Vec and IS on top of HDF5, and I am
> testing this loader on a supercomputer. I occasionally (not always)
> encounter the following errors when running on 8192 cores (the load
> path is sketched below, after the trace):
>
> [7689]PETSC ERROR:
> ------------------------------------------------------------------------
> [7689]PETSC ERROR: Caught signal number 5 TRAP
> [7689]PETSC ERROR: Try option -start_in_debugger or
> -on_error_attach_debugger
> [7689]PETSC ERROR: or see
> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
> [7689]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac
> OS X to find memory corruption errors
> [7689]PETSC ERROR: configure using --with-debugging=yes, recompile, link,
> and run
> [7689]PETSC ERROR: to get more information on the crash.
> [7689]PETSC ERROR: --------------------- Error Message
> --------------------------------------------------------------
> [7689]PETSC ERROR: Signal received
> [7689]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html
> for trouble shooting.
> [7689]PETSC ERROR: Petsc Release Version 3.6.2, unknown
> [7689]PETSC ERROR: ./fsi on a arch-linux2-cxx-opt named ys6103 by fandek
> Fri Nov 27 11:26:30 2015
> [7689]PETSC ERROR: Configure options --with-clanguage=cxx
> --with-shared-libraries=1 --download-fblaslapack=1 --with-mpi=1
> --download-parmetis=1 --download-metis=1 --with-netcdf=1
> --download-exodusii=1
> --with-hdf5-dir=/glade/apps/opt/hdf5-mpi/1.8.12/intel/12.1.5
> --with-debugging=no --with-c2html=0 --with-64-bit-indices=1
> [7689]PETSC ERROR: #1 User provided function() line 0 in unknown file
> Abort(59) on node 7689 (rank 7689 in comm 1140850688): application called
> MPI_Abort(MPI_COMM_WORLD, 59) - process 7689
> ERROR: 0031-300 Forcing all remote tasks to exit due to exit code 1 in
> task 7689
>
> Make and configure logs are attached.
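> 
> For reference, the load path boils down to the usual
> PetscViewerHDF5Open / VecLoad / ISLoad sequence, roughly as below (a
> minimal sketch only: "data.h5", "solution", and "indices" are placeholder
> names, and it assumes ISLoad accepts an HDF5 viewer in this PETSc build):
> 
>   #include <petscvec.h>
>   #include <petscis.h>
>   #include <petscviewerhdf5.h>
> 
>   /* Load one Vec and one IS in parallel from a single HDF5 file.
>      The object names must match the dataset names in the file. */
>   PetscErrorCode LoadFromHDF5(MPI_Comm comm, Vec *x, IS *is)
>   {
>     PetscViewer    viewer;
>     PetscErrorCode ierr;
> 
>     PetscFunctionBeginUser;
>     ierr = PetscViewerHDF5Open(comm, "data.h5", FILE_MODE_READ, &viewer);CHKERRQ(ierr);
> 
>     /* Create an empty Vec, name it after the dataset, then load it */
>     ierr = VecCreate(comm, x);CHKERRQ(ierr);
>     ierr = PetscObjectSetName((PetscObject)*x, "solution");CHKERRQ(ierr);
>     ierr = VecLoad(*x, viewer);CHKERRQ(ierr);
> 
>     /* Same pattern for the index set */
>     ierr = ISCreate(comm, is);CHKERRQ(ierr);
>     ierr = PetscObjectSetName((PetscObject)*is, "indices");CHKERRQ(ierr);
>     ierr = ISLoad(*is, viewer);CHKERRQ(ierr);
> 
>     ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);
>     PetscFunctionReturn(0);
>   }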
>
> Thanks,
>
> Fande Kong,
>
>