<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Mon, Jul 25, 2016 at 12:44 PM, Eric Chamberland <span dir="ltr"><<a href="mailto:Eric.Chamberland@giref.ulaval.ca" target="_blank">Eric.Chamberland@giref.ulaval.ca</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Ok,<br>
<br>
here are the two points answered:<br>
<br>
#1) got valgrind output... here is the fatal free operation:<br></blockquote><div><br></div><div>Okay, this is not the MatMult scatter; this is for the local representations of ghosted vectors. However, to me</div><div>it looks like OpenMPI mistakenly frees its built-in type for MPI_DOUBLE.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
==107156== Invalid free() / delete / delete[] / realloc()<br>
==107156==    at 0x4C2A37C: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)<br>
==107156==    by 0x1E63CD5F: opal_free (malloc.c:184)<br>
==107156==    by 0x27622627: mca_pml_ob1_recv_request_fini (pml_ob1_recvreq.h:133)<br>
==107156==    by 0x27622C4F: mca_pml_ob1_recv_request_free (pml_ob1_recvreq.c:90)<br>
==107156==    by 0x1D3EF9DC: ompi_request_free (request.h:362)<br>
==107156==    by 0x1D3EFAD5: PMPI_Request_free (prequest_free.c:59)<br>
==107156==    by 0x14AE3B9C: VecScatterDestroy_PtoP (vpscat.c:219)<br>
==107156==    by 0x14ADEB74: VecScatterDestroy (vscat.c:1860)<br>
==107156==    by 0x14A8D426: VecDestroy_MPI (pdvec.c:25)<br>
==107156==    by 0x14A33809: VecDestroy (vector.c:432)<br>
==107156==    by 0x10A2A5AB: GIREFVecDestroy(_p_Vec*&) (girefConfigurationPETSc.h:115)<br>
==107156==    by 0x10BA9F14: VecteurPETSc::detruitObjetPETSc() (VecteurPETSc.cc:2292)<br>
==107156==    by 0x10BA9D0D: VecteurPETSc::~VecteurPETSc() (VecteurPETSc.cc:287)<br>
==107156==    by 0x10BA9F48: VecteurPETSc::~VecteurPETSc() (VecteurPETSc.cc:281)<br>
==107156==    by 0x1135A57B: PPReactionsAppuiEL3D::~PPReactionsAppuiEL3D() (PPReactionsAppuiEL3D.cc:216)<br>
==107156==    by 0xCD9A1EA: ProblemeGD::~ProblemeGD() (in /home/mefpp_ericc/depots_prepush/GIREF/lib/libgiref_dev_Formulation.so)<br>
==107156==    by 0x435702: main (Test.ProblemeGD.icc:381)<br>
==107156==  Address 0x1d6acbc0 is 0 bytes inside data symbol "ompi_mpi_double"<br>
--107156-- REDIR: 0x1dda2680 (libc.so.6:__GI_stpcpy) redirected to 0x4c2f330 (__GI_stpcpy)<br>
==107156==<br>
==107156== Process terminating with default action of signal 6 (SIGABRT): dumping core<br>
==107156==    at 0x1DD520C7: raise (in /lib64/<a href="http://libc-2.19.so" rel="noreferrer" target="_blank">libc-2.19.so</a>)<br>
==107156==    by 0x1DD53534: abort (in /lib64/<a href="http://libc-2.19.so" rel="noreferrer" target="_blank">libc-2.19.so</a>)<br>
==107156==    by 0x1DD4B145: __assert_fail_base (in /lib64/<a href="http://libc-2.19.so" rel="noreferrer" target="_blank">libc-2.19.so</a>)<br>
==107156==    by 0x1DD4B1F1: __assert_fail (in /lib64/<a href="http://libc-2.19.so" rel="noreferrer" target="_blank">libc-2.19.so</a>)<br>
==107156==    by 0x27626D12: mca_pml_ob1_send_request_fini (pml_ob1_sendreq.h:221)<br>
==107156==    by 0x276274C9: mca_pml_ob1_send_request_free (pml_ob1_sendreq.c:117)<br>
==107156==    by 0x1D3EF9DC: ompi_request_free (request.h:362)<br>
==107156==    by 0x1D3EFAD5: PMPI_Request_free (prequest_free.c:59)<br>
==107156==    by 0x14AE3C3C: VecScatterDestroy_PtoP (vpscat.c:225)<br>
==107156==    by 0x14ADEB74: VecScatterDestroy (vscat.c:1860)<br>
==107156==    by 0x14A8D426: VecDestroy_MPI (pdvec.c:25)<br>
==107156==    by 0x14A33809: VecDestroy (vector.c:432)<br>
==107156==    by 0x10A2A5AB: GIREFVecDestroy(_p_Vec*&) (girefConfigurationPETSc.h:115)<br>
==107156==    by 0x10BA9F14: VecteurPETSc::detruitObjetPETSc() (VecteurPETSc.cc:2292)<br>
==107156==    by 0x10BA9D0D: VecteurPETSc::~VecteurPETSc() (VecteurPETSc.cc:287)<br>
==107156==    by 0x10BA9F48: VecteurPETSc::~VecteurPETSc() (VecteurPETSc.cc:281)<br>
==107156==    by 0x1135A57B: PPReactionsAppuiEL3D::~PPReactionsAppuiEL3D() (PPReactionsAppuiEL3D.cc:216)<br>
==107156==    by 0xCD9A1EA: ProblemeGD::~ProblemeGD() (in /home/mefpp_ericc/depots_prepush/GIREF/lib/libgiref_dev_Formulation.so)<br>
==107156==    by 0x435702: main (Test.ProblemeGD.icc:381)<br>
<br>
<br>
#2) For the run with -vecscatter_alltoall it works...!<br>
<br>
As an "end user", should I ever modify these VecScatterCreate options? How do they change the performance of the code on large problems?<br></blockquote><div><br></div><div>Yep, those options are there because the different variants are better on different architectures, and you can't know which one to pick</div><div>until runtime (and without experimentation).</div><div><br></div><div>  Thanks,</div><div><br></div><div>    Matt</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Thanks,<br>
<br>
Eric<br>
<br>
On 25/07/16 02:57 PM, Matthew Knepley wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
On Mon, Jul 25, 2016 at 11:33 AM, Eric Chamberland<br>
<<a href="mailto:Eric.Chamberland@giref.ulaval.ca" target="_blank">Eric.Chamberland@giref.ulaval.ca</a><br>
<mailto:<a href="mailto:Eric.Chamberland@giref.ulaval.ca" target="_blank">Eric.Chamberland@giref.ulaval.ca</a>>> wrote:<br>
<br>
    Hi,<br>
<br>
    has someone tried OpenMPI 2.0 with PETSc 3.7.2?<br>
<br>
    I am having some errors with PETSc; maybe someone has them too?<br>
<br>
    Here are the configure logs for PETSc:<br>
<br>
    <a href="http://www.giref.ulaval.ca/~cmpgiref/dernier_ompi/2016.07.25.01h16m02s_configure.log" rel="noreferrer" target="_blank">http://www.giref.ulaval.ca/~cmpgiref/dernier_ompi/2016.07.25.01h16m02s_configure.log</a><br>
<br>
    <a href="http://www.giref.ulaval.ca/~cmpgiref/dernier_ompi/2016.07.25.01h16m02s_RDict.log" rel="noreferrer" target="_blank">http://www.giref.ulaval.ca/~cmpgiref/dernier_ompi/2016.07.25.01h16m02s_RDict.log</a><br>
<br>
    And for OpenMPI:<br>
    <a href="http://www.giref.ulaval.ca/~cmpgiref/dernier_ompi/2016.07.25.01h16m02s_config.log" rel="noreferrer" target="_blank">http://www.giref.ulaval.ca/~cmpgiref/dernier_ompi/2016.07.25.01h16m02s_config.log</a><br>
<br>
    (in fact, I am testing the ompi-release branch, OpenMPI's<br>
    equivalent of the petsc-master branch, since I need commit 9ba6678156).<br>
<br>
    For a set of parallel tests, I have 104 passing out of 124 in total.<br>
<br>
<br>
It appears that the fault happens when freeing the VecScatter we build<br>
for MatMult, which contains Request structures<br>
for the ISends and IRecvs. These look like internal OpenMPI errors to<br>
me, since the Requests should be opaque.<br>
I would try at least two things:<br>
<br>
1) Run under valgrind.<br>
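One way to do this under MPI (a sketch; the binary name is taken from the backtraces above, and any suppression files for MPI-internal noise are left out):

```shell
# Run every rank under valgrind; %p expands to the PID, so each
# process writes its report to its own file.
mpiexec -n 2 valgrind --track-origins=yes --log-file=valgrind.%p.log \
    ./Test.ProblemeGD.dev
```
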
<br>
2) Switch the VecScatter implementation. All the options are here,<br>
<br>
  <a href="http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Vec/VecScatterCreate.html#VecScatterCreate" rel="noreferrer" target="_blank">http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Vec/VecScatterCreate.html#VecScatterCreate</a><br>
<br>
but maybe use alltoall.<br>
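These are plain runtime options, so you can compare variants without recompiling; a sketch (hypothetical executable name; -vecscatter_alltoall is the only flag exercised in this thread, see the VecScatterCreate page linked above for the full list):

```shell
# Default point-to-point scatter (individual ISend/IRecv pairs):
mpiexec -n 8 ./app

# Same run, but the scatter is implemented with MPI_Alltoallv instead:
mpiexec -n 8 ./app -vecscatter_alltoall
```
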
<br>
  Thanks,<br>
<br>
     Matt<br>
<br>
<br>
    And the typical error:<br>
    *** Error in<br>
    `/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/bin/Test.ProblemeGD.dev':<br>
    free(): invalid pointer:<br>
    ======= Backtrace: =========<br>
    /lib64/libc.so.6(+0x7277f)[0x7f80eb11677f]<br>
    /lib64/libc.so.6(+0x78026)[0x7f80eb11c026]<br>
    /lib64/libc.so.6(+0x78d53)[0x7f80eb11cd53]<br>
    /opt/openmpi-2.x_opt/lib/libopen-pal.so.20(opal_free+0x1f)[0x7f80ea8f9d60]<br>
    /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x16628)[0x7f80df0ea628]<br>
    /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x16c50)[0x7f80df0eac50]<br>
    /opt/openmpi-2.x_opt/lib/libmpi.so.20(+0x9f9dd)[0x7f80eb7029dd]<br>
    /opt/openmpi-2.x_opt/lib/libmpi.so.20(MPI_Request_free+0xf7)[0x7f80eb702ad6]<br>
    /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x4adc6d)[0x7f80f2fa6c6d]<br>
    /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(VecScatterDestroy+0x68d)[0x7f80f2fa1c45]<br>
    /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0xa9d0f5)[0x7f80f35960f5]<br>
    /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(MatDestroy+0x648)[0x7f80f35c2588]<br>
    /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x10bf0f4)[0x7f80f3bb80f4]<br>
    /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x346)[0x7f80f3a796de]<br>
    /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPReset+0x502)[0x7f80f3d19779]<br>
    /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x11707f7)[0x7f80f3c697f7]<br>
    /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x346)[0x7f80f3a796de]<br>
    /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPReset+0x502)[0x7f80f3d19779]<br>
    /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x11707f7)[0x7f80f3c697f7]<br>
    /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x346)[0x7f80f3a796de]<br>
    /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPReset+0x502)[0x7f80f3d19779]<br>
    /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x11707f7)[0x7f80f3c697f7]<br>
    /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x346)[0x7f80f3a796de]<br>
    /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCDestroy+0x5d1)[0x7f80f3a79fd9]<br>
    /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPDestroy+0x7b6)[0x7f80f3d1a334]<br>
<br>
    a similar one:<br>
    *** Error in<br>
    `/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/bin/Test.ProbFluideIncompressible.dev':<br>
    free(): invalid pointer: 0x00007f382a7c5bc0 ***<br>
    ======= Backtrace: =========<br>
    /lib64/libc.so.6(+0x7277f)[0x7f3829f1c77f]<br>
    /lib64/libc.so.6(+0x78026)[0x7f3829f22026]<br>
    /lib64/libc.so.6(+0x78d53)[0x7f3829f22d53]<br>
    /opt/openmpi-2.x_opt/lib/libopen-pal.so.20(opal_free+0x1f)[0x7f38296ffd60]<br>
    /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x16628)[0x7f381deab628]<br>
    /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x16c50)[0x7f381deabc50]<br>
    /opt/openmpi-2.x_opt/lib/libmpi.so.20(+0x9f9dd)[0x7f382a5089dd]<br>
    /opt/openmpi-2.x_opt/lib/libmpi.so.20(MPI_Request_free+0xf7)[0x7f382a508ad6]<br>
    /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x4adc6d)[0x7f3831dacc6d]<br>
    /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(VecScatterDestroy+0x68d)[0x7f3831da7c45]<br>
    /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x9f4755)[0x7f38322f3755]<br>
    /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(MatDestroy+0x648)[0x7f38323c8588]<br>
    /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x4e2)[0x7f383287f87a]<br>
    /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCDestroy+0x5d1)[0x7f383287ffd9]<br>
    /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPDestroy+0x7b6)[0x7f3832b20334]<br>
<br>
    another one:<br>
<br>
    *** Error in<br>
    `/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/bin/Test.MortierDiffusion.dev':<br>
    free(): invalid pointer: 0x00007f67b6d37bc0 ***<br>
    ======= Backtrace: =========<br>
    /lib64/libc.so.6(+0x7277f)[0x7f67b648e77f]<br>
    /lib64/libc.so.6(+0x78026)[0x7f67b6494026]<br>
    /lib64/libc.so.6(+0x78d53)[0x7f67b6494d53]<br>
    /opt/openmpi-2.x_opt/lib/libopen-pal.so.20(opal_free+0x1f)[0x7f67b5c71d60]<br>
    /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x1adae)[0x7f67aa4cddae]<br>
    /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x1b4ca)[0x7f67aa4ce4ca]<br>
    /opt/openmpi-2.x_opt/lib/libmpi.so.20(+0x9f9dd)[0x7f67b6a7a9dd]<br>
    /opt/openmpi-2.x_opt/lib/libmpi.so.20(MPI_Request_free+0xf7)[0x7f67b6a7aad6]<br>
    /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x4adb09)[0x7f67be31eb09]<br>
    /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(VecScatterDestroy+0x68d)[0x7f67be319c45]<br>
    /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x4574f7)[0x7f67be2c84f7]<br>
    /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(VecDestroy+0x648)[0x7f67be26e8da]<br>
<br>
    I feel like I should wait until someone else from PETSc has tested<br>
    it too...<br>
<br>
    Thanks,<br>
<br>
    Eric<br>
<br>
<br>
<br><span class="HOEnZb"><font color="#888888">
<br>
--<br>
What most experimenters take for granted before they begin their<br>
experiments is infinitely more interesting than any results to which<br>
their experiments lead.<br>
-- Norbert Wiener<br>
</font></span></blockquote>
</blockquote></div><br><br clear="all"><div><br></div>-- <br><div class="gmail_signature" data-smartmail="gmail_signature">What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.<br>-- Norbert Wiener</div>
</div></div>