[mpich2-dev] [number.cruncher at ntlworld.com: [OMPI users] memcpy overlap in ompi_ddt_copy_content_same_ddt and glibc 2.12]

Dave Goodell goodell at mcs.anl.gov
Thu Nov 11 10:47:19 CST 2010


On Nov 11, 2010, at 10:16 AM CST, Rob Latham wrote:

> a glibc optimization is causing problems for openmpi.  is this going
> to be a problem for mpich2 ?

Thanks for the heads-up, but AFAIK this won't be a problem for MPICH2.  We don't call memcpy directly; every copy goes through a macro (MPIU_Memcpy).  In debug builds this lets us insert extra error checking to assert that the source and destination ranges don't overlap:

https://trac.mcs.anl.gov/projects/mpich2/browser/mpich2/trunk/src/include/mpimem.h#L451
https://trac.mcs.anl.gov/projects/mpich2/browser/mpich2/trunk/src/include/mpiimpl.h#L126
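
As a rough illustration of the idea (this is not the actual MPICH2 source; the names CHECKED_MEMCPY and ranges_overlap are hypothetical and only stand in for what the links above show), a debug-checked memcpy wrapper could look something like this:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not an MPICH2 function: returns 1 if the byte
 * ranges [a, a+len) and [b, b+len) share at least one byte, else 0. */
static int ranges_overlap(const void *a, const void *b, size_t len)
{
    uintptr_t pa = (uintptr_t)a;
    uintptr_t pb = (uintptr_t)b;

    if (len == 0)
        return 0;               /* empty ranges never overlap */
    /* Two equal-length ranges overlap when their start addresses are
     * closer together than the length. */
    return (pa < pb) ? (pb - pa < len) : (pa - pb < len);
}

#ifdef NDEBUG
/* Optimized build: plain memcpy, no extra cost. */
#define CHECKED_MEMCPY(dst, src, len) memcpy((dst), (src), (len))
#else
/* Debug build: abort via assert() if the ranges overlap, then copy. */
#define CHECKED_MEMCPY(dst, src, len) \
    (assert(!ranges_overlap((dst), (src), (len))), \
     memcpy((dst), (src), (len)))
#endif
```

With something like this in place, CHECKED_MEMCPY(buf, buf, n) with n > 0 trips the assertion in a debug build instead of silently relying on how a particular memcpy happens to behave, while an NDEBUG build compiles down to a bare memcpy call.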

The only place where this might be a problem is in some of the collectives, if the user calls them incorrectly.  Typically the only time we hit these memory overlap errors is when the user passes src==dest into a collective and really should have used MPI_IN_PLACE instead, coupled with missing or disabled error checking inside the collective.  In most MPI implementations, and with memcpy on most platforms, this ends up doing what the user expected even though it violates memcpy's contract with the caller.  But this is still a case of clear user error, so I don't exactly feel sorry for anyone who gets burned by it.

-Dave


> ==rob
> 
> ----- Forwarded message from Number Cruncher <number.cruncher at ntlworld.com> -----
> 
> Sender: users-bounces at open-mpi.org
> From: Number Cruncher <number.cruncher at ntlworld.com>
> Reply-To: Open MPI Users <users at open-mpi.org>
> Subject: [OMPI users] memcpy overlap in ompi_ddt_copy_content_same_ddt and
> 	glibc 2.12
> Date: Wed, 10 Nov 2010 17:11:47 +0000
> Message-ID: <4CDAD253.6010804 at ntlworld.com>
> User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US;
> 	rv:1.9.2.12) Gecko/20101103 Fedora/1.0-0.33.b2pre.fc14
> 	Thunderbird/3.1.6
> To: Open MPI Users <users at open-mpi.org>
> 
> Just some observations from a concerned user with a temperamental Open
> MPI program (1.4.3):
> 
> Fedora 14 (just released) includes glibc-2.12 which has optimized
> versions of memcpy, including a copy backward.
> http://sourceware.org/git/?p=glibc.git;a=commitdiff;h=6fb8cbcb58a29fff73eb2101b34caa19a7f88eba
> 
> I think the overlapping memcpy issue reported previously may be
> actively triggered by this optimized memcpy and produce incorrect
> data.
> 
> Related posts:
> http://www.open-mpi.org/community/lists/users/2009/04/9069.php
> http://www.open-mpi.org/community/lists/users/2008/01/4918.php
> http://www.open-mpi.org/community/lists/users/2007/08/3872.php
> 
> Original developer response:
> http://www.open-mpi.org/community/lists/users/2007/08/3873.php
> 
> Current ticket:
> https://svn.open-mpi.org/trac/ompi/ticket/1903
> 
> _______________________________________________
> users mailing list
> users at open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
> ----- End forwarded message -----
> 
> -- 
> Rob Latham
> Mathematics and Computer Science Division
> Argonne National Lab, IL USA


