<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Tue, Mar 27, 2018 at 4:59 PM, Min Si <span dir="ltr"><<a href="mailto:msi@anl.gov" target="_blank">msi@anl.gov</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi Eric,<br>
<br>
It would be great if you could give us a simple MPI program (without PETSc) that reproduces this issue. If the problem happens only when PETSc is involved, the PETSc team can give you more suggestions.<br></blockquote><div><br></div><div>Hi Min,</div><div><br></div><div>It is really easy to run PETSc at ANL. I am sure one of us can help if you cannot reproduce this bug on your own.</div><div><br></div><div>  Thanks,</div><div><br></div><div>     Matt</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Thanks,<br>
Min<div><div class="h5"><br>
<br>
On 2018/03/27 15:38, Eric Chamberland wrote:<br>
</div></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div class="h5">
Hi,<br>
<br>
for more than 2 weeks now, the master branch of mpich has still been broken, and the problem can be reproduced with a simple "make test" after a fresh installation of PETSc...<br>
<br>
Is anyone testing it?<br>
<br>
Is it supposed to be working?<br>
<br>
Just tell me if I should "follow" another mpich branch please.<br>
<br>
Thanks,<br>
<br>
Eric<br>
<br>
<br>
On 14/03/18 03:35 AM, Eric Chamberland wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Hi,<br>
<br>
fwiw, the current mpich/master branch doesn't pass the PETSc "make test" after a fresh installation... It hangs just after the 1 MPI process test, meaning it is stuck in the 2 process test:<br>
<br>
make PETSC_DIR=/pmi/cmpbib/compilat<wbr>ion_BIB_dernier_mpich/COMPILE_<wbr>AUTO/mpich-3.x-debug/petsc-3.<wbr>8.3-debug PETSC_ARCH=arch-linux2-c-debug test<br>
Running test examples to verify correct installation<br>
Using PETSC_DIR=/pmi/cmpbib/compilat<wbr>ion_BIB_dernier_mpich/COMPILE_<wbr>AUTO/mpich-3.x-debug/petsc-3.<wbr>8.3-debug and PETSC_ARCH=arch-linux2-c-debug<br>
C/C++ example src/snes/examples/tutorials/ex<wbr>19 run successfully with 1 MPI process<br>
<br>
<br>
<br>
<br>
^Cmakefile:151: recipe for target 'test' failed<br>
make: [test] Interrupt (ignored)<br>
<br>
thanks,<br>
<br>
Eric<br>
<br>
On 13/03/18 08:07 AM, Eric Chamberland wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
Hi,<br>
<br>
each night we test mpich/master with our PETSc-based code. I don't know if the PETSc team is doing the same thing with mpich/master? (Maybe it would be a good idea?)<br>
<br>
Everything was fine (except the issue <a href="https://github.com/pmodels/mpich/issues/2892" rel="noreferrer" target="_blank">https://github.com/pmodels/mpi<wbr>ch/issues/2892</a>) up to commit 7b8d64debd, but since commit mpich:a8a2b30fd21, I have had a segfault on any parallel nightly test.<br>
<br>
For example, a 2 process test ends at slightly different execution points on each rank:<br>
<br>
rank 0:<br>
<br>
#003: /lib64/libpthread.so.0(+0xf870<wbr>) [0x7f25bf908870]<br>
#004: /pmi/cmpbib/compilation_BIB_de<wbr>rnier_mpich/COMPILE_AUTO/BIB/<wbr>bin/BIBMEFGD.opt() [0x64a788]<br>
#005: /lib64/libc.so.6(+0x35140) [0x7f25bca18140]<br>
#006: /lib64/libc.so.6(__poll+0x2d) [0x7f25bcabfbfd]<br>
#007: /opt/mpich-3.x_debug/lib/libmp<wbr>i.so.0(+0x1e4cc9) [0x7f25bd90ccc9]<br>
#008: /opt/mpich-3.x_debug/lib/libmp<wbr>i.so.0(+0x1ea55c) [0x7f25bd91255c]<br>
#009: /opt/mpich-3.x_debug/lib/libmp<wbr>i.so.0(+0xba657) [0x7f25bd7e2657]<br>
#010: /opt/mpich-3.x_debug/lib/libmp<wbr>i.so.0(PMPI_Waitall+0xe3) [0x7f25bd7e3343]<br>
#011: /opt/petsc-3.8.3_debug_mpich-3<wbr>.x_debug/lib/libpetsc.so.3.8(P<wbr>etscGatherMessageLengths+0x654<wbr>) [0x7f25c4bb3193]<br>
#012: /opt/petsc-3.8.3_debug_mpich-3<wbr>.x_debug/lib/libpetsc.so.3.8(V<wbr>ecScatterCreate_PtoS+0x859) [0x7f25c4e82d7f]<br>
#013: /opt/petsc-3.8.3_debug_mpich-3<wbr>.x_debug/lib/libpetsc.so.3.8(V<wbr>ecScatterCreate+0x5684) [0x7f25c4e4d055]<br>
#014: /opt/petsc-3.8.3_debug_mpich-3<wbr>.x_debug/lib/libpetsc.so.3.8(V<wbr>ecCreateGhostWithArray+0x688) [0x7f25c4e01a39]<br>
#015: /opt/petsc-3.8.3_debug_mpich-3<wbr>.x_debug/lib/libpetsc.so.3.8(V<wbr>ecCreateGhost+0x179) [0x7f25c4e020f6]<br>
<br>
rank 1:<br>
<br>
#002: /pmi/cmpbib/compilation_BIB_de<wbr>rnier_mpich/COMPILE_AUTO/GIREF<wbr>/lib/libgiref_opt_Util.so(<wbr>traitementSignal+0x2bd0) [0x7f62df8e7310]<br>
#003: /lib64/libc.so.6(+0x35140) [0x7f62d3bc9140]<br>
#004: /lib64/libc.so.6(__poll+0x2d) [0x7f62d3c70bfd]<br>
#005: /opt/mpich-3.x_debug/lib/libmp<wbr>i.so.0(+0x1e4cc9) [0x7f62d4abdcc9]<br>
#006: /opt/mpich-3.x_debug/lib/libmp<wbr>i.so.0(+0x1ea55c) [0x7f62d4ac355c]<br>
#007: /opt/mpich-3.x_debug/lib/libmp<wbr>i.so.0(+0x12c9c5) [0x7f62d4a059c5]<br>
#008: /opt/mpich-3.x_debug/lib/libmp<wbr>i.so.0(+0x12e102) [0x7f62d4a07102]<br>
#009: /opt/mpich-3.x_debug/lib/libmp<wbr>i.so.0(+0xf17a1) [0x7f62d49ca7a1]<br>
#010: /opt/mpich-3.x_debug/lib/libmp<wbr>i.so.0(+0x3facf) [0x7f62d4918acf]<br>
#011: /opt/mpich-3.x_debug/lib/libmp<wbr>i.so.0(+0x3fc3d) [0x7f62d4918c3d]<br>
#012: /opt/mpich-3.x_debug/lib/libmp<wbr>i.so.0(+0xf18d8) [0x7f62d49ca8d8]<br>
#013: /opt/mpich-3.x_debug/lib/libmp<wbr>i.so.0(+0x3fb88) [0x7f62d4918b88]<br>
#014: /opt/mpich-3.x_debug/lib/libmp<wbr>i.so.0(+0x3fc3d) [0x7f62d4918c3d]<br>
#015: /opt/mpich-3.x_debug/lib/libmp<wbr>i.so.0(MPI_Barrier+0x27b) [0x7f62d4918edb]<br>
#016: /opt/petsc-3.8.3_debug_mpich-3<wbr>.x_debug/lib/libpetsc.so.3.8(P<wbr>etscCommGetNewTag+0x3ff) [0x7f62dbceb055]<br>
#017: /opt/petsc-3.8.3_debug_mpich-3<wbr>.x_debug/lib/libpetsc.so.3.8(P<wbr>etscObjectGetNewTag+0x15d) [0x7f62dbceaadb]<br>
#018: /opt/petsc-3.8.3_debug_mpich-3<wbr>.x_debug/lib/libpetsc.so.3.8(V<wbr>ecScatterCreateCommon_PtoS+0x1<wbr>ee) [0x7f62dc03625c]<br>
#019: /opt/petsc-3.8.3_debug_mpich-3<wbr>.x_debug/lib/libpetsc.so.3.8(V<wbr>ecScatterCreate_PtoS+0x29c4) [0x7f62dc035eea]<br>
#020: /opt/petsc-3.8.3_debug_mpich-3<wbr>.x_debug/lib/libpetsc.so.3.8(V<wbr>ecScatterCreate+0x5684) [0x7f62dbffe055]<br>
#021: /opt/petsc-3.8.3_debug_mpich-3<wbr>.x_debug/lib/libpetsc.so.3.8(V<wbr>ecCreateGhostWithArray+0x688) [0x7f62dbfb2a39]<br>
#022: /opt/petsc-3.8.3_debug_mpich-3<wbr>.x_debug/lib/libpetsc.so.3.8(V<wbr>ecCreateGhost+0x179) [0x7f62dbfb30f6]<br>
<br>
Have any other users (PETSc users?) reported this problem?<br>
<br>
Thanks,<br>
<br>
Eric<br>
<br>
PS: the usual information:<br>
<br>
mpich logs:<br>
<a href="http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_config.log" rel="noreferrer" target="_blank">http://www.giref.ulaval.ca/~cm<wbr>pgiref/dernier_mpich/2018.03.<wbr>12.05h39m54s_config.log</a> <br>
<a href="http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_config.system" rel="noreferrer" target="_blank">http://www.giref.ulaval.ca/~cm<wbr>pgiref/dernier_mpich/2018.03.<wbr>12.05h39m54s_config.system</a> <br>
<a href="http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_mpich_version.txt" rel="noreferrer" target="_blank">http://www.giref.ulaval.ca/~cm<wbr>pgiref/dernier_mpich/2018.03.<wbr>12.05h39m54s_mpich_version.txt</a> <br>
<a href="http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_c.txt" rel="noreferrer" target="_blank">http://www.giref.ulaval.ca/~cm<wbr>pgiref/dernier_mpich/2018.03.<wbr>12.05h39m54s_c.txt</a> <br>
<a href="http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_m.txt" rel="noreferrer" target="_blank">http://www.giref.ulaval.ca/~cm<wbr>pgiref/dernier_mpich/2018.03.<wbr>12.05h39m54s_m.txt</a> <br>
<a href="http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_mi.txt" rel="noreferrer" target="_blank">http://www.giref.ulaval.ca/~cm<wbr>pgiref/dernier_mpich/2018.03.<wbr>12.05h39m54s_mi.txt</a> <br>
<a href="http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_openmpa_config.log" rel="noreferrer" target="_blank">http://www.giref.ulaval.ca/~cm<wbr>pgiref/dernier_mpich/2018.03.<wbr>12.05h39m54s_openmpa_config.<wbr>log</a> <br>
<a href="http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_mpl_config.log" rel="noreferrer" target="_blank">http://www.giref.ulaval.ca/~cm<wbr>pgiref/dernier_mpich/2018.03.<wbr>12.05h39m54s_mpl_config.log</a> <br>
<a href="http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_pm_hydra_config.log" rel="noreferrer" target="_blank">http://www.giref.ulaval.ca/~cm<wbr>pgiref/dernier_mpich/2018.03.<wbr>12.05h39m54s_pm_hydra_config.<wbr>log</a> <br>
<a href="http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_pm_hydra_tools_topo_config.log" rel="noreferrer" target="_blank">http://www.giref.ulaval.ca/~cm<wbr>pgiref/dernier_mpich/2018.03.<wbr>12.05h39m54s_pm_hydra_tools_<wbr>topo_config.log</a> <br>
<a href="http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_mpiexec_info.txt" rel="noreferrer" target="_blank">http://www.giref.ulaval.ca/~cm<wbr>pgiref/dernier_mpich/2018.03.<wbr>12.05h39m54s_mpiexec_info.txt</a> <br>
<br>
PETSc logs:<br>
<a href="http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_configure.log" rel="noreferrer" target="_blank">http://www.giref.ulaval.ca/~cm<wbr>pgiref/dernier_mpich/2018.03.<wbr>12.05h39m54s_configure.log</a> <br>
<a href="http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_make.log" rel="noreferrer" target="_blank">http://www.giref.ulaval.ca/~cm<wbr>pgiref/dernier_mpich/2018.03.<wbr>12.05h39m54s_make.log</a> <br>
<a href="http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_default.log" rel="noreferrer" target="_blank">http://www.giref.ulaval.ca/~cm<wbr>pgiref/dernier_mpich/2018.03.<wbr>12.05h39m54s_default.log</a> <br>
<a href="http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_RDict.log" rel="noreferrer" target="_blank">http://www.giref.ulaval.ca/~cm<wbr>pgiref/dernier_mpich/2018.03.<wbr>12.05h39m54s_RDict.log</a> <br>
<a href="http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2018.03.12.05h39m54s_CMakeLists.txt" rel="noreferrer" target="_blank">http://www.giref.ulaval.ca/~cm<wbr>pgiref/dernier_mpich/2018.03.<wbr>12.05h39m54s_CMakeLists.txt</a> <br>
<br>
<br>
</blockquote>
<br>
</blockquote></div></div>
______________________________<wbr>_________________<br>
discuss mailing list     <a href="mailto:discuss@mpich.org" target="_blank">discuss@mpich.org</a><br>
To manage subscription options or unsubscribe:<br>
<a href="https://lists.mpich.org/mailman/listinfo/discuss" rel="noreferrer" target="_blank">https://lists.mpich.org/mailma<wbr>n/listinfo/discuss</a><br>
</blockquote>
<br>
</blockquote></div><br><br clear="all"><div><br></div>-- <br><div class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><div><div dir="ltr"><div>What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.<br>-- Norbert Wiener</div><div><br></div><div><a href="http://www.caam.rice.edu/~mk51/" target="_blank">https://www.cse.buffalo.edu/~knepley/</a><br></div></div></div></div></div>
</div></div>