[petsc-users] signal received error; MatNullSpaceTest; Stokes flow solver with pc fieldsplit and schur complement

Jed Brown jedbrown at mcs.anl.gov
Thu Oct 17 09:39:20 CDT 2013


Bishesh Khanal <bisheshkh at gmail.com> writes:

> I tried running on the cluster with one core per node on 4 nodes, and I
> got the following errors (note: using valgrind, and the cluster's openmpi)
> at the very end, after the many usual "conditional jump ..." errors, which
> might be interesting:
>
> mpiexec: killing job...
>
> mpiexec: abort is already in progress...hit ctrl-c again to forcibly
> terminate
>
> --------------------------------------------------------------------------
> MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
> with errorcode 59.
>
> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
> You may or may not see output from other processes, depending on
> exactly when Open MPI kills them.
> --------------------------------------------------------------------------
> [0]PETSC ERROR:
> ------------------------------------------------------------------------
> [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the
> batch system) has told this process to end

Memory corruption generally results in SIGSEGV, so I suspect this is
still either a memory issue or some other resource issue.  How much
memory is available on these compute nodes?  Do turn off Valgrind for
this run; it takes a lot of memory.
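If you want to see how close each rank gets to the node's physical memory,
something along these lines can be dropped in before and after the assembly
(a sketch, not from your code; "ReportMemory" is just a made-up helper name,
and it only reports the resident set of the calling rank):

  #include <petscsys.h>

  /* Sketch: report the calling rank's current resident memory so it can be
     compared with the physical memory available on the compute node. */
  static PetscErrorCode ReportMemory(const char *stage)
  {
    PetscLogDouble rss;
    PetscMPIInt    rank;
    PetscErrorCode ierr;

    ierr = MPI_Comm_rank(PETSC_COMM_WORLD,&rank);CHKERRQ(ierr);
    ierr = PetscMemoryGetCurrentUsage(&rss);CHKERRQ(ierr);   /* bytes */
    ierr = PetscPrintf(PETSC_COMM_SELF,"[rank %d] %s: resident set %g MB\n",
                       rank,stage,rss/1.0e6);CHKERRQ(ierr);
    return 0;
  }

If the reported numbers approach the memory on a node shortly before the job
is killed, that points at an out-of-memory kill rather than a bug in PETSc.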

> Does this mean it is crashing near MatSetValues_MPIAIJ?

Possibly, but it could be killing the program for other reasons.
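One way to rule assembly in or out is to run a stripped-down test on the same
nodes.  The following is only a sketch, not your solver: it preallocates an
MPIAIJ matrix, inserts the diagonal as a stand-in for the real assembly, and
prints per-rank checkpoints so you can see how far each process gets before
the job is killed.  The matrix size N is arbitrary here.

  #include <petscmat.h>

  int main(int argc,char **argv)
  {
    Mat            A;
    PetscInt       i,rstart,rend,N = 1000000;
    PetscMPIInt    rank;
    PetscScalar    v = 2.0;
    PetscLogDouble rss;
    PetscErrorCode ierr;

    ierr = PetscInitialize(&argc,&argv,NULL,NULL);if (ierr) return ierr;
    ierr = MPI_Comm_rank(PETSC_COMM_WORLD,&rank);CHKERRQ(ierr);

    /* Preallocated MPIAIJ matrix; only the diagonal is inserted. */
    ierr = MatCreateAIJ(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,N,N,
                        1,NULL,0,NULL,&A);CHKERRQ(ierr);
    ierr = MatGetOwnershipRange(A,&rstart,&rend);CHKERRQ(ierr);
    for (i=rstart; i<rend; i++) {
      ierr = MatSetValues(A,1,&i,1,&i,&v,INSERT_VALUES);CHKERRQ(ierr);
    }
    ierr = PetscPrintf(PETSC_COMM_SELF,"[rank %d] insertion done\n",rank);CHKERRQ(ierr);

    ierr = MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    ierr = MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    ierr = PetscMemoryGetCurrentUsage(&rss);CHKERRQ(ierr);
    ierr = PetscPrintf(PETSC_COMM_SELF,"[rank %d] assembled, resident set %g MB\n",
                       rank,rss/1.0e6);CHKERRQ(ierr);

    ierr = MatDestroy(&A);CHKERRQ(ierr);
    ierr = PetscFinalize();
    return ierr;
  }

If this completes cleanly on the same 4 nodes while your real assembly does
not, the difference would point back at the memory used by your assembly
(for example, unpreallocated entries or a large amount of off-process data
being stashed) rather than at MatSetValues itself.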