[petsc-dev] [mpich-discuss] Is mpich/master:a8a2b30fd21 tested with Petsc?

Matthew Knepley knepley at gmail.com
Tue Mar 27 16:04:57 CDT 2018


On Tue, Mar 27, 2018 at 4:59 PM, Min Si <msi at anl.gov> wrote:

> Hi Eric,
>
> It would be great if you could give us a simple MPI program (without
> PETSc) that reproduces this issue. If this is a problem that happens only
> when PETSc is involved, the PETSc team can give you more suggestions.
>
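For what it's worth, below is a minimal sketch of the kind of standalone reproducer being asked for. It is only an assumption based on the pattern in the stack traces further down in this thread (one rank blocked in MPI_Waitall on nonblocking point-to-point traffic while the other rank has already moved on to MPI_Barrier); the file name, payload, and tag are illustrative, and it has not been verified to trigger the hang with the mpich commit in question.

/* repro.c -- hypothetical sketch, not taken from PETSc.
 * Each rank exchanges one integer with the other rank using MPI_Isend /
 * MPI_Irecv and MPI_Waitall, then enters MPI_Barrier.  In the reported
 * hang, rank 0 is seen stuck in MPI_Waitall while rank 1 is already in
 * MPI_Barrier. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
  int rank, size, recvlen = 0, tag = 42;
  MPI_Request reqs[2];

  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);

  if (size == 2) {
    int other   = 1 - rank;
    int sendlen = 100 + rank;  /* arbitrary "message length" payload */

    /* Nonblocking exchange followed by a wait on both requests. */
    MPI_Irecv(&recvlen, 1, MPI_INT, other, tag, MPI_COMM_WORLD, &reqs[0]);
    MPI_Isend(&sendlen, 1, MPI_INT, other, tag, MPI_COMM_WORLD, &reqs[1]);
    MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);   /* rank 0 seen stuck here */
    printf("rank %d received %d\n", rank, recvlen);
  }

  MPI_Barrier(MPI_COMM_WORLD);                   /* rank 1 seen here */
  MPI_Finalize();
  return 0;
}

Build with "mpicc repro.c -o repro" and run with "mpiexec -n 2 ./repro"; with a correctly working MPI it should print one line per rank and exit.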

Hi Min,

It is really easy to run PETSc at ANL. I am sure one of us can help if you
cannot reproduce this bug on your own.

  Thanks,

     Matt


> Thanks,
> Min
>
>
> On 2018/03/27 15:38, Eric Chamberland wrote:
>
>> Hi,
>>
>> for more than two weeks now the master branch of mpich has been broken
>> for us, and it can be reproduced with a simple "make test" after a fresh
>> installation of PETSc...
>>
>> Is anyone testing it?
>>
>> Is it supposed to be working?
>>
>> Please tell me if I should "follow" another mpich branch.
>>
>> Thanks,
>>
>> Eric
>>
>>
>> On 14/03/18 03:35 AM, Eric Chamberland wrote:
>>
>>> Hi,
>>>
>>> fwiw, the current mpich/master branch doesn't pass the PETSc "make
>>> test" after a fresh installation...  It hangs just after the 1 MPI process
>>> test, meaning it is stuck in the 2 process test:
>>>
>>> make PETSC_DIR=/pmi/cmpbib/compilation_BIB_dernier_mpich/COMPILE_AUTO/mpich-3.x-debug/petsc-3.8.3-debug PETSC_ARCH=arch-linux2-c-debug test
>>> Running test examples to verify correct installation
>>> Using PETSC_DIR=/pmi/cmpbib/compilation_BIB_dernier_mpich/COMPILE_AUTO/mpich-3.x-debug/petsc-3.8.3-debug and PETSC_ARCH=arch-linux2-c-debug
>>> C/C++ example src/snes/examples/tutorials/ex19 run successfully with 1 MPI process
>>>
>>>
>>>
>>>
>>> ^Cmakefile:151: recipe for target 'test' failed
>>> make: [test] Interrupt (ignored)
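To see where the 2-process check is actually stuck, one option (a sketch, assuming PETSC_DIR and PETSC_ARCH are set as above and that ex19 with its default options hangs the same way as the test target's invocation) is to run the example directly and attach a debugger to a hung rank:

cd $PETSC_DIR/src/snes/examples/tutorials
make ex19
mpiexec -n 2 ./ex19 &
gdb -p <pid of a hung ex19 process>    # then "bt" shows where that rank is blocked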
>>>
>>> thanks,
>>>
>>> Eric
>>>
>>> On 13/03/18 08:07 AM, Eric Chamberland wrote:
>>>
>>>>
>>>> Hi,
>>>>
>>>> each night we test mpich/master with our PETSc-based code. I don't
>>>> know if the PETSc team is doing the same thing with mpich/master?
>>>> (Maybe it would be a good idea?)
>>>>
>>>> Everything was fine (except for the issue
>>>> https://github.com/pmodels/mpich/issues/2892) up to commit 7b8d64debd,
>>>> but since commit mpich:a8a2b30fd21 I get a segfault in every parallel
>>>> nightly test.
>>>>
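Since 7b8d64debd is reported as good and a8a2b30fd21 as bad, one way to narrow down the offending mpich change (a sketch, assuming the intermediate commits build cleanly against this setup) is a manual git bisect in the mpich clone:

git bisect start
git bisect bad a8a2b30fd21
git bisect good 7b8d64debd
# at each step: rebuild mpich, rebuild/relink PETSc and the test case, rerun
# the failing 2-process test, then mark "git bisect good" or "git bisect bad"
# until git reports the first bad commit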
>>>> For example, in a 2 process test the two ranks end up at different execution points:
>>>>
>>>> rank 0:
>>>>
>>>> #003: /lib64/libpthread.so.0(+0xf870) [0x7f25bf908870]
>>>> #004: /pmi/cmpbib/compilation_BIB_dernier_mpich/COMPILE_AUTO/BIB/bin/BIBMEFGD.opt() [0x64a788]
>>>> #005: /lib64/libc.so.6(+0x35140) [0x7f25bca18140]
>>>> #006: /lib64/libc.so.6(__poll+0x2d) [0x7f25bcabfbfd]
>>>> #007: /opt/mpich-3.x_debug/lib/libmpi.so.0(+0x1e4cc9) [0x7f25bd90ccc9]
>>>> #008: /opt/mpich-3.x_debug/lib/libmpi.so.0(+0x1ea55c) [0x7f25bd91255c]
>>>> #009: /opt/mpich-3.x_debug/lib/libmpi.so.0(+0xba657) [0x7f25bd7e2657]
>>>> #010: /opt/mpich-3.x_debug/lib/libmpi.so.0(PMPI_Waitall+0xe3) [0x7f25bd7e3343]
>>>> #011: /opt/petsc-3.8.3_debug_mpich-3.x_debug/lib/libpetsc.so.3.8(PetscGatherMessageLengths+0x654) [0x7f25c4bb3193]
>>>> #012: /opt/petsc-3.8.3_debug_mpich-3.x_debug/lib/libpetsc.so.3.8(VecScatterCreate_PtoS+0x859) [0x7f25c4e82d7f]
>>>> #013: /opt/petsc-3.8.3_debug_mpich-3.x_debug/lib/libpetsc.so.3.8(VecScatterCreate+0x5684) [0x7f25c4e4d055]
>>>> #014: /opt/petsc-3.8.3_debug_mpich-3.x_debug/lib/libpetsc.so.3.8(VecCreateGhostWithArray+0x688) [0x7f25c4e01a39]
>>>> #015: /opt/petsc-3.8.3_debug_mpich-3.x_debug/lib/libpetsc.so.3.8(VecCreateGhost+0x179) [0x7f25c4e020f6]
>>>>
>>>> rank 1:
>>>>
>>>> #002: /pmi/cmpbib/compilation_BIB_dernier_mpich/COMPILE_AUTO/GIREF/lib/libgiref_opt_Util.so(traitementSignal+0x2bd0) [0x7f62df8e7310]
>>>> #003: /lib64/libc.so.6(+0x35140) [0x7f62d3bc9140]
>>>> #004: /lib64/libc.so.6(__poll+0x2d) [0x7f62d3c70bfd]
>>>> #005: /opt/mpich-3.x_debug/lib/libmpi.so.0(+0x1e4cc9) [0x7f62d4abdcc9]
>>>> #006: /opt/mpich-3.x_debug/lib/libmpi.so.0(+0x1ea55c) [0x7f62d4ac355c]
>>>> #007: /opt/mpich-3.x_debug/lib/libmpi.so.0(+0x12c9c5) [0x7f62d4a059c5]
>>>> #008: /opt/mpich-3.x_debug/lib/libmpi.so.0(+0x12e102) [0x7f62d4a07102]
>>>> #009: /opt/mpich-3.x_debug/lib/libmpi.so.0(+0xf17a1) [0x7f62d49ca7a1]
>>>> #010: /opt/mpich-3.x_debug/lib/libmpi.so.0(+0x3facf) [0x7f62d4918acf]
>>>> #011: /opt/mpich-3.x_debug/lib/libmpi.so.0(+0x3fc3d) [0x7f62d4918c3d]
>>>> #012: /opt/mpich-3.x_debug/lib/libmpi.so.0(+0xf18d8) [0x7f62d49ca8d8]
>>>> #013: /opt/mpich-3.x_debug/lib/libmpi.so.0(+0x3fb88) [0x7f62d4918b88]
>>>> #014: /opt/mpich-3.x_debug/lib/libmpi.so.0(+0x3fc3d) [0x7f62d4918c3d]
>>>> #015: /opt/mpich-3.x_debug/lib/libmpi.so.0(MPI_Barrier+0x27b) [0x7f62d4918edb]
>>>> #016: /opt/petsc-3.8.3_debug_mpich-3.x_debug/lib/libpetsc.so.3.8(PetscCommGetNewTag+0x3ff) [0x7f62dbceb055]
>>>> #017: /opt/petsc-3.8.3_debug_mpich-3.x_debug/lib/libpetsc.so.3.8(PetscObjectGetNewTag+0x15d) [0x7f62dbceaadb]
>>>> #018: /opt/petsc-3.8.3_debug_mpich-3.x_debug/lib/libpetsc.so.3.8(VecScatterCreateCommon_PtoS+0x1ee) [0x7f62dc03625c]
>>>> #019: /opt/petsc-3.8.3_debug_mpich-3.x_debug/lib/libpetsc.so.3.8(VecScatterCreate_PtoS+0x29c4) [0x7f62dc035eea]
>>>> #020: /opt/petsc-3.8.3_debug_mpich-3.x_debug/lib/libpetsc.so.3.8(VecScatterCreate+0x5684) [0x7f62dbffe055]
>>>> #021: /opt/petsc-3.8.3_debug_mpich-3.x_debug/lib/libpetsc.so.3.8(VecCreateGhostWithArray+0x688) [0x7f62dbfb2a39]
>>>> #022: /opt/petsc-3.8.3_debug_mpich-3.x_debug/lib/libpetsc.so.3.8(VecCreateGhost+0x179) [0x7f62dbfb30f6]
>>>>
>>>> Have any other users (PETSc users?) reported this problem?
>>>>
>>>> Thanks,
>>>>
>>>> Eric
>>>>
>>>> ps: the usual information:
>>>>
>>>> mpich logs:
>>>> http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_config.log
>>>> http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_config.system
>>>> http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_mpich_version.txt
>>>> http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_c.txt
>>>> http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_m.txt
>>>> http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_mi.txt
>>>> http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_openmpa_config.log
>>>> http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_mpl_config.log
>>>> http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_pm_hydra_config.log
>>>> http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_pm_hydra_tools_topo_config.log
>>>> http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_mpiexec_info.txt
>>>>
>>>> Petsc logs:
>>>> http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_configure.log
>>>> http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_make.log
>>>> http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_default.log
>>>> http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_RDict.log
>>>> http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_CMakeLists.txt
>>>>
>>>>
>>>>
>
>


-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/ <http://www.caam.rice.edu/~mk51/>

