[petsc-users] Error with parallel solve

Mark Adams mfadams at lbl.gov
Mon Apr 8 13:33:40 CDT 2019


On Mon, Apr 8, 2019 at 2:23 PM Manav Bhatia <bhatiamanav at gmail.com> wrote:

> Thanks for identifying this, Mark.
>
> If I compile the debug version of PETSc, will it also build a debug
> version of MUMPS?
>

The debug compiler flags will get passed down to MUMPS if you are
downloading MUMPS through PETSc. Otherwise, yes, build a debug version of
MUMPS as well.

Are you able to run the exact same job on your Mac, i.e., the same number
of processes, etc.?
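For reference, a debug build that also rebuilds MUMPS might look like the
sketch below. The configure options (--with-debugging, --download-mumps,
--download-scalapack) are standard PETSc flags, but the MPI path, process
count, and executable name are placeholders to adapt to your system:

```shell
# Reconfigure PETSc with debugging enabled; --download-mumps rebuilds
# MUMPS with the same (debug) compiler flags, and MUMPS needs ScaLAPACK.
./configure --with-debugging=yes \
            --download-mumps --download-scalapack \
            --with-mpi-dir=/path/to/openmpi   # placeholder path
make all

# Then run the same job under valgrind to look for memory errors,
# e.g. with 8 ranks and a hypothetical executable name:
mpiexec -n 8 valgrind --track-origins=yes ./structural_example_5
```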


>
> On Apr 8, 2019, at 12:58 PM, Mark Adams <mfadams at lbl.gov> wrote:
>
> This looks like an error in MUMPS:
>
>         IF ( IROW_GRID .NE. root%MYROW .OR.
>      &       JCOL_GRID .NE. root%MYCOL ) THEN
>             WRITE(*,*) MYID,':INTERNAL Error: recvd root arrowhead '
>
>
> On Mon, Apr 8, 2019 at 1:37 PM Smith, Barry F. via petsc-users <
> petsc-users at mcs.anl.gov> wrote:
>
>>   Difficult to tell what is going on.
>>
>>   The message "User provided function() line 0 in unknown file" indicates
>> the crash took place OUTSIDE of PETSc code, and the error message "INTERNAL
>> Error: recvd root arrowhead" is definitely not coming from PETSc.
>>
>>    Yes, debug with the debug version and also try valgrind.
>>
>>    Barry
>>
>>
>> > On Apr 8, 2019, at 12:12 PM, Manav Bhatia via petsc-users <
>> petsc-users at mcs.anl.gov> wrote:
>> >
>> >
>> > Hi,
>> >
>> >     I am running a nonlinear simulation with mesh refinement using
>> libMesh. The code runs without issues on a Mac (it can run for days), but
>> crashes on Linux (CentOS 6). On Linux I am using PETSc 3.11 with Open MPI
>> 3.1.3 and gcc 8.2.
>> >
>> >     I tried using -on_error_attach_debugger, but it only gave me the
>> message below. Does this message mean anything to more experienced
>> eyes?
>> >
>> >     I am going to build a debug version of PETSc to figure out what is
>> going wrong, and will share more detailed logs in a bit.
>> >
>> > Regards,
>> > Manav
>> >
>> > [8]PETSC ERROR:
>> ------------------------------------------------------------------------
>> > [8]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation,
>> probably memory access out of range
>> > [8]PETSC ERROR: Try option -start_in_debugger or
>> -on_error_attach_debugger
>> > [8]PETSC ERROR: or see
>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
>> > [8]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac
>> OS X to find memory corruption errors
>> > [8]PETSC ERROR: configure using --with-debugging=yes, recompile, link,
>> and run
>> > [8]PETSC ERROR: to get more information on the crash.
>> > [8]PETSC ERROR: User provided function() line 0 in  unknown file
>> > PETSC: Attaching gdb to
>> /cavs/projects/brg_codes/users/bhatia/mast/mast_topology/opt/examples/structural/example_5/structural_example_5
>> of pid 2108 on display localhost:10.0 on machine Warhawk1.HPC.MsState.Edu
>> <http://warhawk1.hpc.msstate.edu/>
>> > PETSC: Attaching gdb to
>> /cavs/projects/brg_codes/users/bhatia/mast/mast_topology/opt/examples/structural/example_5/structural_example_5
>> of pid 2112 on display localhost:10.0 on machine Warhawk1.HPC.MsState.Edu
>> <http://warhawk1.hpc.msstate.edu/>
>> >            0 :INTERNAL Error: recvd root arrowhead
>> >            0 :not belonging to me. IARR,JARR=       67525       67525
>> >            0 :IROW_GRID,JCOL_GRID=           0           4
>> >            0 :MYROW, MYCOL=           0           0
>> >            0 :IPOSROOT,JPOSROOT=    92264688    92264688
>> >
>> --------------------------------------------------------------------------
>> > MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
>> > with errorcode -99.
>> >
>> > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
>> > You may or may not see output from other processes, depending on
>> > exactly when Open MPI kills them.
>> >
>> --------------------------------------------------------------------------
>> >
>>
>>
>