[petsc-dev] potential bug with MPI_Win_fence() in openmpi-1.8.4

Satish Balay balay at mcs.anl.gov
Thu Apr 30 14:13:15 CDT 2015


Thanks for checking and getting a more appropriate fix in.

I've just tried this out - and the PETSc test code runs fine with it.

BTW: There is one inconsistency in ompi/datatype/ompi_datatype_args.c
[that I noticed] - that you might want to check.
Perhaps the second computation (total_pack_size) should also use
"(DC) * sizeof(MPI_Datatype)" instead of "(DC) * sizeof(int)"?

>>>>>>>>>
        int length = sizeof(ompi_datatype_args_t) + (IC) * sizeof(int) + \
            (AC) * sizeof(OPAL_PTRDIFF_TYPE) + (DC) * sizeof(MPI_Datatype); \


       pArgs->total_pack_size = (4 + (IC)) * sizeof(int) +             \
            (AC) * sizeof(OPAL_PTRDIFF_TYPE) + (DC) * sizeof(int);      \
<<<<<<<<<<<
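
To illustrate why the two lines can disagree, here is a minimal
standalone sketch in C. It assumes MPI_Datatype is a pointer-sized
handle (the typedef and all function names below are hypothetical
stand-ins, not Open MPI definitions). It only shows the size
arithmetic - it does not establish whether sizeof(int) is intentional
for the packed representation.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: stand-in for a pointer-sized MPI_Datatype handle;
   this is not Open MPI's actual definition. */
typedef struct hypothetical_datatype *hypothetical_handle_t;

/* Bytes for dc entries sized as handles (as in the "length" line). */
size_t bytes_as_handles(size_t dc)
{
    return dc * sizeof(hypothetical_handle_t);
}

/* Bytes for dc entries sized as ints (as in the "total_pack_size" line). */
size_t bytes_as_ints(size_t dc)
{
    return dc * sizeof(int);
}

/* Difference between the two computations for dc datatype entries. */
size_t size_gap(size_t dc)
{
    /* Sanity check: the sketch assumes a handle is at least int-sized. */
    assert(sizeof(hypothetical_handle_t) >= sizeof(int));
    return bytes_as_handles(dc) - bytes_as_ints(dc);
}
```

Under this assumption, on a platform where pointers are wider than
int, size_gap(dc) is nonzero for any dc > 0, so the two totals would
differ by that many bytes.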

Satish


On Thu, 30 Apr 2015, Matthew Knepley wrote:

> On Fri, May 1, 2015 at 4:55 AM, Jeff Squyres (jsquyres) <jsquyres at cisco.com>
> wrote:
> 
> > Thank you!
> >
> > George reviewed your patch and adjusted it a bit.  We applied it to master
> > and it's pending to the release series (v1.8.x).
> >
> 
> Was this identified by IBM?
> 
> 
> https://github.com/open-mpi/ompi/commit/015d3f56cf749ee5ad9ea4428d2f5da72f9bbe08
> 
>      Matt
> 
> 
> > Would you mind testing a nightly master snapshot?  It should be in
> > tonight's build:
> >
> >     http://www.open-mpi.org/nightly/master/
> >
> >
> >
> > > On Apr 30, 2015, at 12:50 AM, Satish Balay <balay at mcs.anl.gov> wrote:
> > >
> > > OpenMPI developers,
> > >
> > > We've had issues (memory errors) with OpenMPI - and code in PETSc
> > > library that uses MPI_Win_fence().
> > >
> > > Valgrind shows memory corruption deep inside OpenMPI function stack.
> > >
> > > I'm attaching a potential patch that appears to fix this issue for us.
> > > [the corresponding valgrind trace is listed in the patch header]
> > >
> > > Perhaps there is a more appropriate fix for this memory corruption. Could
> > > you check on this?
> > >
> > > [Sorry I don't have a pure MPI test code to demonstrate this error -
> > > but a PETSc test example code consistently reproduces this issue]
> > >
> > > Thanks,
> > > Satish<openmpi-1.8.4.patch>
> >
> >
> > --
> > Jeff Squyres
> > jsquyres at cisco.com
> > For corporate legal information go to:
> > http://www.cisco.com/web/about/doing_business/legal/cri/
> >
> >
> 
> 
> 



