[petsc-users] OpenMPI 2.0 and Petsc 3.7.2
Matthew Knepley
knepley at gmail.com
Mon Jul 25 14:53:32 CDT 2016
On Mon, Jul 25, 2016 at 12:44 PM, Eric Chamberland <
Eric.Chamberland at giref.ulaval.ca> wrote:
> Ok,
>
> here are the two points answered:
>
> #1) got valgrind output... here is the fatal free operation:
>
Okay, this is not the MatMult scatter; it is the scatter for local
representations of ghosted vectors. However, it looks to me like OpenMPI
mistakenly frees its built-in datatype for MPI_DOUBLE.
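For readers trying to reproduce this, here is a minimal sketch of launching a parallel PETSc run under valgrind's memcheck; the binary name, process count, and valgrind flags are assumptions, not Eric's actual command:

```shell
# Sketch: print a launch line for running a parallel PETSc binary under
# valgrind. Replace echo with eval "$cmd" to actually run it on a machine
# with MPI installed.
app=./Test.ProblemeGD.dev          # application binary (assumption)
np=2                               # process count (assumption)
cmd="mpiexec -n $np valgrind --leak-check=full --track-origins=yes $app"
echo "$cmd"
```

Each MPI rank then gets its own valgrind instance, which is how per-process traces like the one below are produced.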
> ==107156== Invalid free() / delete / delete[] / realloc()
> ==107156== at 0x4C2A37C: free (in
> /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==107156== by 0x1E63CD5F: opal_free (malloc.c:184)
> ==107156== by 0x27622627: mca_pml_ob1_recv_request_fini
> (pml_ob1_recvreq.h:133)
> ==107156== by 0x27622C4F: mca_pml_ob1_recv_request_free
> (pml_ob1_recvreq.c:90)
> ==107156== by 0x1D3EF9DC: ompi_request_free (request.h:362)
> ==107156== by 0x1D3EFAD5: PMPI_Request_free (prequest_free.c:59)
> ==107156== by 0x14AE3B9C: VecScatterDestroy_PtoP (vpscat.c:219)
> ==107156== by 0x14ADEB74: VecScatterDestroy (vscat.c:1860)
> ==107156== by 0x14A8D426: VecDestroy_MPI (pdvec.c:25)
> ==107156== by 0x14A33809: VecDestroy (vector.c:432)
> ==107156== by 0x10A2A5AB: GIREFVecDestroy(_p_Vec*&)
> (girefConfigurationPETSc.h:115)
> ==107156== by 0x10BA9F14: VecteurPETSc::detruitObjetPETSc()
> (VecteurPETSc.cc:2292)
> ==107156== by 0x10BA9D0D: VecteurPETSc::~VecteurPETSc()
> (VecteurPETSc.cc:287)
> ==107156== by 0x10BA9F48: VecteurPETSc::~VecteurPETSc()
> (VecteurPETSc.cc:281)
> ==107156== by 0x1135A57B: PPReactionsAppuiEL3D::~PPReactionsAppuiEL3D()
> (PPReactionsAppuiEL3D.cc:216)
> ==107156== by 0xCD9A1EA: ProblemeGD::~ProblemeGD() (in
> /home/mefpp_ericc/depots_prepush/GIREF/lib/libgiref_dev_Formulation.so)
> ==107156== by 0x435702: main (Test.ProblemeGD.icc:381)
> ==107156== Address 0x1d6acbc0 is 0 bytes inside data symbol
> "ompi_mpi_double"
> --107156-- REDIR: 0x1dda2680 (libc.so.6:__GI_stpcpy) redirected to
> 0x4c2f330 (__GI_stpcpy)
> ==107156==
> ==107156== Process terminating with default action of signal 6 (SIGABRT):
> dumping core
> ==107156== at 0x1DD520C7: raise (in /lib64/libc-2.19.so)
> ==107156== by 0x1DD53534: abort (in /lib64/libc-2.19.so)
> ==107156== by 0x1DD4B145: __assert_fail_base (in /lib64/libc-2.19.so)
> ==107156== by 0x1DD4B1F1: __assert_fail (in /lib64/libc-2.19.so)
> ==107156== by 0x27626D12: mca_pml_ob1_send_request_fini
> (pml_ob1_sendreq.h:221)
> ==107156== by 0x276274C9: mca_pml_ob1_send_request_free
> (pml_ob1_sendreq.c:117)
> ==107156== by 0x1D3EF9DC: ompi_request_free (request.h:362)
> ==107156== by 0x1D3EFAD5: PMPI_Request_free (prequest_free.c:59)
> ==107156== by 0x14AE3C3C: VecScatterDestroy_PtoP (vpscat.c:225)
> ==107156== by 0x14ADEB74: VecScatterDestroy (vscat.c:1860)
> ==107156== by 0x14A8D426: VecDestroy_MPI (pdvec.c:25)
> ==107156== by 0x14A33809: VecDestroy (vector.c:432)
> ==107156== by 0x10A2A5AB: GIREFVecDestroy(_p_Vec*&)
> (girefConfigurationPETSc.h:115)
> ==107156== by 0x10BA9F14: VecteurPETSc::detruitObjetPETSc()
> (VecteurPETSc.cc:2292)
> ==107156== by 0x10BA9D0D: VecteurPETSc::~VecteurPETSc()
> (VecteurPETSc.cc:287)
> ==107156== by 0x10BA9F48: VecteurPETSc::~VecteurPETSc()
> (VecteurPETSc.cc:281)
> ==107156== by 0x1135A57B: PPReactionsAppuiEL3D::~PPReactionsAppuiEL3D()
> (PPReactionsAppuiEL3D.cc:216)
> ==107156== by 0xCD9A1EA: ProblemeGD::~ProblemeGD() (in
> /home/mefpp_ericc/depots_prepush/GIREF/lib/libgiref_dev_Formulation.so)
> ==107156== by 0x435702: main (Test.ProblemeGD.icc:381)
>
>
> #2) The run with -vecscatter_alltoall works...!
>
> As an "end user", should I ever modify these VecScatterCreate options? How
> do they change the performance of the code on large problems?
>
Yep, those options exist because the different variants perform better on
different architectures, and you cannot know which one to pick until runtime
(and without experimentation).
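That experimentation can be scripted; a minimal sketch follows (the option names come from the VecScatterCreate manual page, while the binary name and process count are assumptions):

```shell
# Sketch: print one launch line per VecScatter variant so each can be timed
# separately; replace echo with eval "$cmd" to actually run the experiments.
app=./Test.ProblemeGD.dev                 # application binary (assumption)
for opt in -vecscatter_alltoall -vecscatter_ssend -vecscatter_window; do
  cmd="mpiexec -n 4 $app $opt -log_view"  # -log_view prints PETSc timings
  echo "$cmd"
done
```

Comparing the VecScatter times in each -log_view output then tells you which variant wins on your machine and problem size.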
Thanks,
Matt
> Thanks,
>
> Eric
>
> On 25/07/16 02:57 PM, Matthew Knepley wrote:
>
>> On Mon, Jul 25, 2016 at 11:33 AM, Eric Chamberland
>> <Eric.Chamberland at giref.ulaval.ca
>> <mailto:Eric.Chamberland at giref.ulaval.ca>> wrote:
>>
>> Hi,
>>
>> Has anyone tried OpenMPI 2.0 with PETSc 3.7.2?
>>
>> I am having some errors with PETSc; maybe someone has them too?
>>
>> Here are the configure logs for PETSc:
>>
>>
>> http://www.giref.ulaval.ca/~cmpgiref/dernier_ompi/2016.07.25.01h16m02s_configure.log
>>
>>
>> http://www.giref.ulaval.ca/~cmpgiref/dernier_ompi/2016.07.25.01h16m02s_RDict.log
>>
>> And for OpenMPI:
>>
>> http://www.giref.ulaval.ca/~cmpgiref/dernier_ompi/2016.07.25.01h16m02s_config.log
>>
>> (in fact, I am testing the ompi-release branch, OpenMPI's analogue of a
>> petsc-master branch, since I need commit 9ba6678156).
>>
>> For a set of parallel tests, 104 out of 124 pass.
>>
>>
>> It appears that the fault happens when freeing the VecScatter we build
>> for MatMult, which contains Request structures for the ISends and IRecvs.
>> These look like internal OpenMPI errors to me, since the Requests should
>> be opaque.
>> I would try at least two things:
>>
>> 1) Run under valgrind.
>>
>> 2) Switch the VecScatter implementation. All the options are here,
>>
>>
>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Vec/VecScatterCreate.html#VecScatterCreate
>>
>> but maybe use alltoall.
>>
>> Thanks,
>>
>> Matt
>>
>>
>> And the typical error:
>> *** Error in
>>
>> `/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/bin/Test.ProblemeGD.dev':
>> free(): invalid pointer:
>> ======= Backtrace: =========
>> /lib64/libc.so.6(+0x7277f)[0x7f80eb11677f]
>> /lib64/libc.so.6(+0x78026)[0x7f80eb11c026]
>> /lib64/libc.so.6(+0x78d53)[0x7f80eb11cd53]
>>
>> /opt/openmpi-2.x_opt/lib/libopen-pal.so.20(opal_free+0x1f)[0x7f80ea8f9d60]
>>
>> /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x16628)[0x7f80df0ea628]
>>
>> /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x16c50)[0x7f80df0eac50]
>> /opt/openmpi-2.x_opt/lib/libmpi.so.20(+0x9f9dd)[0x7f80eb7029dd]
>>
>> /opt/openmpi-2.x_opt/lib/libmpi.so.20(MPI_Request_free+0xf7)[0x7f80eb702ad6]
>>
>> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x4adc6d)[0x7f80f2fa6c6d]
>>
>> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(VecScatterDestroy+0x68d)[0x7f80f2fa1c45]
>>
>> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0xa9d0f5)[0x7f80f35960f5]
>>
>> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(MatDestroy+0x648)[0x7f80f35c2588]
>>
>> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x10bf0f4)[0x7f80f3bb80f4]
>>
>> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x346)[0x7f80f3a796de]
>>
>> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPReset+0x502)[0x7f80f3d19779]
>>
>> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x11707f7)[0x7f80f3c697f7]
>>
>> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x346)[0x7f80f3a796de]
>>
>> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPReset+0x502)[0x7f80f3d19779]
>>
>> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x11707f7)[0x7f80f3c697f7]
>>
>> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x346)[0x7f80f3a796de]
>>
>> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPReset+0x502)[0x7f80f3d19779]
>>
>> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x11707f7)[0x7f80f3c697f7]
>>
>> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x346)[0x7f80f3a796de]
>>
>> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCDestroy+0x5d1)[0x7f80f3a79fd9]
>>
>> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPDestroy+0x7b6)[0x7f80f3d1a334]
>>
>> a similar one:
>> *** Error in
>>
>> `/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/bin/Test.ProbFluideIncompressible.dev':
>> free(): invalid pointer: 0x00007f382a7c5bc0 ***
>> ======= Backtrace: =========
>> /lib64/libc.so.6(+0x7277f)[0x7f3829f1c77f]
>> /lib64/libc.so.6(+0x78026)[0x7f3829f22026]
>> /lib64/libc.so.6(+0x78d53)[0x7f3829f22d53]
>>
>> /opt/openmpi-2.x_opt/lib/libopen-pal.so.20(opal_free+0x1f)[0x7f38296ffd60]
>>
>> /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x16628)[0x7f381deab628]
>>
>> /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x16c50)[0x7f381deabc50]
>> /opt/openmpi-2.x_opt/lib/libmpi.so.20(+0x9f9dd)[0x7f382a5089dd]
>>
>> /opt/openmpi-2.x_opt/lib/libmpi.so.20(MPI_Request_free+0xf7)[0x7f382a508ad6]
>>
>> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x4adc6d)[0x7f3831dacc6d]
>>
>> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(VecScatterDestroy+0x68d)[0x7f3831da7c45]
>>
>> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x9f4755)[0x7f38322f3755]
>>
>> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(MatDestroy+0x648)[0x7f38323c8588]
>>
>> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x4e2)[0x7f383287f87a]
>>
>> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCDestroy+0x5d1)[0x7f383287ffd9]
>>
>> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPDestroy+0x7b6)[0x7f3832b20334]
>>
>> another one:
>>
>> *** Error in
>>
>> `/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/bin/Test.MortierDiffusion.dev':
>> free(): invalid pointer: 0x00007f67b6d37bc0 ***
>> ======= Backtrace: =========
>> /lib64/libc.so.6(+0x7277f)[0x7f67b648e77f]
>> /lib64/libc.so.6(+0x78026)[0x7f67b6494026]
>> /lib64/libc.so.6(+0x78d53)[0x7f67b6494d53]
>>
>> /opt/openmpi-2.x_opt/lib/libopen-pal.so.20(opal_free+0x1f)[0x7f67b5c71d60]
>>
>> /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x1adae)[0x7f67aa4cddae]
>>
>> /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x1b4ca)[0x7f67aa4ce4ca]
>> /opt/openmpi-2.x_opt/lib/libmpi.so.20(+0x9f9dd)[0x7f67b6a7a9dd]
>>
>> /opt/openmpi-2.x_opt/lib/libmpi.so.20(MPI_Request_free+0xf7)[0x7f67b6a7aad6]
>>
>> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x4adb09)[0x7f67be31eb09]
>>
>> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(VecScatterDestroy+0x68d)[0x7f67be319c45]
>>
>> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x4574f7)[0x7f67be2c84f7]
>>
>> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(VecDestroy+0x648)[0x7f67be26e8da]
>>
>> I feel like I should wait until someone else from PETSc has tested
>> it too...
>>
>> Thanks,
>>
>> Eric
>>
>>
>>
>>
>>
>
--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener