[petsc-users] Error with parallel solve
Mark Adams
mfadams at lbl.gov
Mon Apr 8 12:58:55 CDT 2019
This looks like an error in MUMPS:
      IF ( IROW_GRID .NE. root%MYROW .OR.
     &     JCOL_GRID .NE. root%MYCOL ) THEN
        WRITE(*,*) MYID,':INTERNAL Error: recvd root arrowhead '
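
For readers unfamiliar with this kind of check, here is a minimal sketch of what the test above appears to do. It assumes the root matrix is spread over a 2D process grid in a ScaLAPACK-style block-cyclic layout, with the first block on process coordinate 0. The function names, the block sizes and the grid shape are hypothetical and are not taken from the MUMPS source; only IROW_GRID, JCOL_GRID, MYROW and MYCOL correspond to names in the snippet.

```python
def grid_coord(global_index: int, block_size: int, nprocs: int) -> int:
    """Process coordinate that owns a 1-based global index.

    Assumes a block-cyclic layout whose first block lives on coordinate 0
    (an assumption for illustration, not read from the MUMPS source).
    """
    if global_index < 1 or block_size < 1 or nprocs < 1:
        raise ValueError("index, block size and process count must be >= 1")
    return ((global_index - 1) // block_size) % nprocs


def owns_entry(ipos: int, jpos: int,
               mblock: int, nblock: int,
               nprow: int, npcol: int,
               myrow: int, mycol: int) -> bool:
    """Mirror of the IF test: does entry (ipos, jpos) belong to this process?"""
    irow_grid = grid_coord(ipos, mblock, nprow)   # plays the role of IROW_GRID
    jcol_grid = grid_coord(jpos, nblock, npcol)   # plays the role of JCOL_GRID
    return irow_grid == myrow and jcol_grid == mycol


if __name__ == "__main__":
    # Hypothetical 2 x 2 grid with 2 x 2 blocks, viewed from process (0, 0).
    print(owns_entry(1, 1, 2, 2, 2, 2, myrow=0, mycol=0))
    print(owns_entry(3, 1, 2, 2, 2, 2, myrow=0, mycol=0))
```

In the log quoted below, rank 0 reports IROW_GRID,JCOL_GRID = 0 4 against MYROW,MYCOL = 0 0, so the column coordinate is the one that fails the comparison and triggers the message.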
On Mon, Apr 8, 2019 at 1:37 PM Smith, Barry F. via petsc-users <
petsc-users at mcs.anl.gov> wrote:
> Difficult to tell what is going on.
>
> The message "User provided function() line 0 in unknown file" indicates
> the crash took place OUTSIDE of PETSc code, and the error message "INTERNAL
> Error: recvd root arrowhead" is definitely not coming from PETSc.
>
> Yes, debug with the debug version and also try valgrind.
>
> Barry
>
>
> > On Apr 8, 2019, at 12:12 PM, Manav Bhatia via petsc-users <
> petsc-users at mcs.anl.gov> wrote:
> >
> >
> > Hi,
> >
> > I am running a nonlinear simulation code using mesh refinement on
> libMesh. The code runs without issues on a Mac (can run for days without
> issues), but crashes on Linux (Centos 6). I am using version 3.11 on Linux
> with openmpi 3.1.3 and gcc8.2.
> >
> > I tried to use the -on_error_attach_debugger, but it only gave me
> this message. Does this message imply something to the more experienced
> eyes?
> >
> > I am going to try to build a debug version of petsc to figure out
> what is going wrong. I will get and share more detailed logs in a bit.
> >
> > Regards,
> > Manav
> >
> > [8]PETSC ERROR:
> ------------------------------------------------------------------------
> > [8]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation,
> probably memory access out of range
> > [8]PETSC ERROR: Try option -start_in_debugger or
> -on_error_attach_debugger
> > [8]PETSC ERROR: or see
> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
> > [8]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac
> OS X to find memory corruption errors
> > [8]PETSC ERROR: configure using --with-debugging=yes, recompile, link,
> and run
> > [8]PETSC ERROR: to get more information on the crash.
> > [8]PETSC ERROR: User provided function() line 0 in unknown file
> > PETSC: Attaching gdb to
> /cavs/projects/brg_codes/users/bhatia/mast/mast_topology/opt/examples/structural/example_5/structural_example_5
> of pid 2108 on display localhost:10.0 on machine Warhawk1.HPC.MsState.Edu
> > PETSC: Attaching gdb to
> /cavs/projects/brg_codes/users/bhatia/mast/mast_topology/opt/examples/structural/example_5/structural_example_5
> of pid 2112 on display localhost:10.0 on machine Warhawk1.HPC.MsState.Edu
> > 0 :INTERNAL Error: recvd root arrowhead
> > 0 :not belonging to me. IARR,JARR= 67525 67525
> > 0 :IROW_GRID,JCOL_GRID= 0 4
> > 0 :MYROW, MYCOL= 0 0
> > 0 :IPOSROOT,JPOSROOT= 92264688 92264688
> >
> --------------------------------------------------------------------------
> > MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
> > with errorcode -99.
> >
> > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
> > You may or may not see output from other processes, depending on
> > exactly when Open MPI kills them.
> >
> --------------------------------------------------------------------------
> >
>
>