[petsc-users] petsc hangs

Matthew Knepley knepley at gmail.com
Fri Jul 24 11:42:31 CDT 2015


On Fri, Jul 24, 2015 at 11:36 AM, Aaron Kitzmiller <
akitzmiller at g.harvard.edu> wrote:

> futex is a Linux system call used for locking shared resources.
>
> It could be indicative of an MPI problem.  I wouldn't be surprised.  If
> anyone has any idea how to get around it that would be great.  We have
> dozens of applications on our compute cluster that use MPI, this version
> being our default.  I'm wondering if there is something specific to the mix
> of MPI flavor / compiler, etc. that could be going on here.
>

Yes, this is a bug in OpenMPI that has been open for years.

Can you please switch to MPICH and try another test? I thought the newest
version of OpenMPI had fixed this, but maybe you are using an older release.

  Thanks,

    Matt


> This is the gdb stack trace:
>
> #0  0x00000039c6a0e264 in __lll_lock_wait () from /lib64/libpthread.so.0
> #1  0x00000039c6a09508 in _L_lock_854 () from /lib64/libpthread.so.0
> #2  0x00000039c6a093d7 in pthread_mutex_lock () from /lib64/libpthread.so.0
> #3  0x00002aaaaf13ddd4 in opal_mutex_lock (attr_hash=0x2aaaaf651c70, key=128, attribute=0x7fffffffc200, flag=0xffffffffffffffff)
>     at ../opal/threads/mutex_unix.h:104
> #4  ompi_attr_get_c (attr_hash=0x2aaaaf651c70, key=128, attribute=0x7fffffffc200, flag=0xffffffffffffffff)
>     at attribute/attribute.c:758
> #5  0x00002aaaaf17080e in PMPI_Attr_get (comm=0x2aaaaf651c70, keyval=128, attribute_val=0x7fffffffc200, flag=0xffffffffffffffff)
>     at pattr_get.c:61
> #6  0x00002aaaaacad0b3 in Petsc_DelComm_Outer (comm=0x2aaaaf6d4140, keyval=13, attr_val=0x7af160, extra_state=0x0)
>     at /n/home08/lchristakis/petsc/petsc-3.5.4/src/sys/objects/pinit.c:409
> #7  0x00002aaaaf13f1a4 in ompi_attr_delete_impl (type=2942639216, object=0x80, attr_hash=0x7fffffffc200, key=-1, predefined=112 'p')
>     at attribute/attribute.c:970
> #8  0x00002aaaaf13ee02 in ompi_attr_delete (type=2942639216, object=0x80, attr_hash=0x7fffffffc200, key=-1, predefined=112 'p')
>     at attribute/attribute.c:1019
> #9  0x00002aaaaf170710 in PMPI_Attr_delete (comm=0x2aaaaf651c70, keyval=128)
>     at pattr_delete.c:59
> #10 0x00002aaaaac61848 in PetscCommDestroy (comm=0x888cf0)
>     at /n/home08/lchristakis/petsc/petsc-3.5.4/src/sys/objects/tagm.c:256
> #11 0x00002aaaaac6a273 in PetscHeaderDestroy_Private (h=0x888ce0)
>     at /n/home08/lchristakis/petsc/petsc-3.5.4/src/sys/objects/inherit.c:121
> #12 0x00002aaaaaf51512 in VecDestroy (v=0x7fffffffcbd0)
>     at /n/home08/lchristakis/petsc/petsc-3.5.4/src/vec/vec/interface/vector.c:434
> #13 0x00002aaaab9c5c7f in DMSetUp_DA_2D (da=0x87b1b0)
>     at /n/home08/lchristakis/petsc/petsc-3.5.4/src/dm/impls/da/da2.c:776
> #14 0x00002aaaaba73bfd in DMSetUp_DA (da=0x87b1b0)
>     at /n/home08/lchristakis/petsc/petsc-3.5.4/src/dm/impls/da/dareg.c:25
> #15 0x00002aaaab93399a in DMSetUp (dm=0x87b1b0)
>     at /n/home08/lchristakis/petsc/petsc-3.5.4/src/dm/interface/dm.c:560
> #16 0x00002aaaab9c6941 in DMDACreate2d (comm=0x2aaaaf6d45c0, bx=DM_BOUNDARY_NONE, by=DM_BOUNDARY_NONE, stencil_type=DMDA_STENCIL_STAR, M=-4, N=-4, m=-1, n=-1, dof=1, s=1, lx=0x0, ly=0x0, da=0x7fffffffd668)
>     at /n/home08/lchristakis/petsc/petsc-3.5.4/src/dm/impls/da/da2.c:862
> #17 0x00000000004023d0 in main (argc=1, argv=0x7fffffffd8c8)
>     at /n/home08/lchristakis/petsc/petsc-3.5.4/src/snes/examples/tutorials/ex5.c:116
>
>
> Aaron Kitzmiller
> Informatics and Scientific Applications
> aaron_kitzmiller at harvard.edu
>
>
>
> On Jul 24, 2015, at 12:18 PM, Matthew Knepley <knepley at gmail.com> wrote:
>
> On Fri, Jul 24, 2015 at 11:17 AM, Matthew Knepley <knepley at gmail.com>
> wrote:
>
>> On Fri, Jul 24, 2015 at 11:09 AM, Aaron Kitzmiller <
>> akitzmiller at g.harvard.edu> wrote:
>>
>>> Doesn't run.  Hangs just like the tests do.
>>>
>>> I doubt it's helpful, but when I run it under strace, it hangs on a
>>> "futex".  The last thing vaguely informative was an attempt to read the
>>> non-existent .petscrc.
>>>
>>
>> Run in the debugger and get a stack trace.
>>
>
> Also futex does not appear in the PETSc source:
>
>    knepley/feature-snes-deflation *+$|MERGING:/PETSc3/petsc/petsc-dev$ find src -name "*.c" | xargs grep futex
>
> You have an MPI problem.
>
>    Matt
>
>
>>   Matt
>>
>>
>>> ajk
>>>
>>> Aaron Kitzmiller
>>> Informatics and Scientific Applications
>>> aaron_kitzmiller at harvard.edu
>>>
>>>
>>>
>>> On Jul 24, 2015, at 11:21 AM, Matthew Knepley <knepley at gmail.com> wrote:
>>>
>>>   ./ex5 -snes_monitor
>>>
>>>
>>>
>>
>>
>> --
>> What most experimenters take for granted before they begin their
>> experiments is infinitely more interesting than any results to which their
>> experiments lead.
>> -- Norbert Wiener
>>
>
