[mpich-discuss] Using MPI_Put/Get correctly?

James Dinan dinan at mcs.anl.gov
Thu Dec 16 14:27:39 CST 2010


Hi Matt,

If my understanding is correct, the only time you are allowed to perform 
direct load/store accesses on local data that is exposed in a window is 
when the window is closed under active target or when you are in an 
exclusive access epoch under passive target mode.  So I think what you 
are doing may be invalid even though you are able to guarantee that 
accesses do not overlap.  The source for your put will need to be a 
private buffer.  You may be able to accomplish this easily in your code, 
or you might have to copy the data into a private buffer before you post 
the window and then put() from that buffer.
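
A minimal sketch of the private-buffer approach described above, assuming 
a simple ring exchange rather than the neighbor groups in the original 
code.  The names SENDBUF, left, right, origin_group and target_group are 
hypothetical, N is kept small for illustration, and MPI_Type_vector 
stands in for the MPI_Type_indexed target type:

```fortran
program put_from_private_buffer
  use mpi
  implicit none
  integer, parameter :: N = 4           ! hypothetical small size for illustration
  double precision :: QF(6,2,N)         ! exposed in the window
  double precision :: SENDBUF(6,N)      ! private origin buffer, outside any window
  integer :: ierr, rank, nprocs, left, right, sizeofdp
  integer :: QFWIN, QFREC, world_group, origin_group, target_group
  integer :: ranks(1)
  integer(kind=MPI_ADDRESS_KIND) :: winsize, disp

  call MPI_Init(ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
  call MPI_Comm_size(MPI_COMM_WORLD, nprocs, ierr)
  right = mod(rank + 1, nprocs)           ! rank we put to
  left  = mod(rank - 1 + nprocs, nprocs)  ! rank that puts to us

  QF(:,1,:) = dble(rank)
  QF(:,2,:) = -1.0d0

  ! Expose all of QF, with displacements counted in double precision elements.
  call MPI_Type_size(MPI_DOUBLE_PRECISION, sizeofdp, ierr)
  winsize = int(size(QF), MPI_ADDRESS_KIND) * sizeofdp
  call MPI_Win_create(QF, winsize, sizeofdp, MPI_INFO_NULL, &
                      MPI_COMM_WORLD, QFWIN, ierr)

  ! Target layout: N blocks of 6 elements, one block every 12 elements.
  call MPI_Type_vector(N, 6, 12, MPI_DOUBLE_PRECISION, QFREC, ierr)
  call MPI_Type_commit(QFREC, ierr)

  ! Post to the rank that writes into us, start on the rank we write into.
  call MPI_Comm_group(MPI_COMM_WORLD, world_group, ierr)
  ranks(1) = left
  call MPI_Group_incl(world_group, 1, ranks, origin_group, ierr)
  ranks(1) = right
  call MPI_Group_incl(world_group, 1, ranks, target_group, ierr)

  ! Copy the outgoing data into the private buffer BEFORE posting the window.
  SENDBUF = QF(:,1,:)

  call MPI_Win_post(origin_group, 0, QFWIN, ierr)
  call MPI_Win_start(target_group, 0, QFWIN, ierr)

  ! The target displacement is INTEGER(KIND=MPI_ADDRESS_KIND), not a default
  ! integer.  An offset of 6 elements is the start of QF(1,2,1) on the target.
  disp = 6
  call MPI_Put(SENDBUF, 6*N, MPI_DOUBLE_PRECISION, right, disp, &
               1, QFREC, QFWIN, ierr)

  call MPI_Win_complete(QFWIN, ierr)
  call MPI_Win_wait(QFWIN, ierr)

  ! The window is closed again, so reading QF locally is allowed here.
  if (all(QF(:,2,:) == dble(left))) then
    print *, 'rank', rank, ': received expected data from rank', left
  else
    print *, 'rank', rank, ': unexpected data in QF(:,2,:)'
  end if

  call MPI_Type_free(QFREC, ierr)
  call MPI_Group_free(origin_group, ierr)
  call MPI_Group_free(target_group, ierr)
  call MPI_Group_free(world_group, ierr)
  call MPI_Win_free(QFWIN, ierr)
  call MPI_Finalize(ierr)
end program put_from_private_buffer
```

The only change relative to the pattern in the original code is that the 
origin of the put is SENDBUF, which is filled before MPI_Win_post and is 
not part of any window.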

Even though this is outside of the standard, some (many?) MPI 
implementations may actually allow this on cache-coherent systems (I 
think MPICH2 on shared memory will allow it).

I would be surprised if this error is causing your seg fault (more 
likely it should just result in corrupted data within the bounds of your 
buffer).  I would tend to suspect that something is off in your 
datatype, possibly the target datatype since the segfault occurs in 
wait() which is when data might be getting unpacked at the target.  Can 
you run your code through a debugger or valgrind to give us more 
information on how/when the seg fault occurs?
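
One way to sanity-check the target datatype before using it, sketched 
under assumptions: the index list idx is hypothetical, the window is 
taken to cover all of QF(6,2,N), and the target displacement is 0 as in 
the original call.  MPI_Type_get_true_extent reports the byte range the 
type actually touches, which can be compared against the window size:

```fortran
program check_target_datatype
  use mpi
  implicit none
  integer, parameter :: N = 10, NIDX = 3
  integer :: idx(NIDX) = (/ 2, 5, 10 /)   ! hypothetical third-index values
  integer :: blocklens(NIDX), displs(NIDX)
  integer :: QFREC, ierr, sizeofdp, i
  integer(kind=MPI_ADDRESS_KIND) :: true_lb, true_extent, winbytes

  call MPI_Init(ierr)
  call MPI_Type_size(MPI_DOUBLE_PRECISION, sizeofdp, ierr)

  ! Size in bytes of a window covering QF(6,2,N).
  winbytes = int(6*2*N, MPI_ADDRESS_KIND) * sizeofdp

  ! MPI_Type_indexed displacements are zero-based and counted in elements
  ! of the old type.  QF(1,2,k) sits (k-1)*12 + 6 elements past QF(1,1,1).
  do i = 1, NIDX
    blocklens(i) = 6
    displs(i) = (idx(i) - 1)*12 + 6
  end do
  call MPI_Type_indexed(NIDX, blocklens, displs, MPI_DOUBLE_PRECISION, &
                        QFREC, ierr)
  call MPI_Type_commit(QFREC, ierr)

  ! true_lb and true_extent give the byte range the type really covers.
  call MPI_Type_get_true_extent(QFREC, true_lb, true_extent, ierr)
  if (true_lb < 0 .or. true_lb + true_extent > winbytes) then
    print *, 'target datatype reaches outside the window'
  else
    print *, 'target datatype fits inside the window'
  end if

  call MPI_Type_free(QFREC, ierr)
  call MPI_Finalize(ierr)
end program check_target_datatype
```

A displacement list built with one-based indices or with byte offsets 
would show up here as a range that falls outside the window.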

Cheers,
  ~Jim.

On 12/16/2010 12:33 PM, Grismer, Matthew J Civ USAF AFMC AFRL/RBAT wrote:
> I am trying to modify the communication routines in our code to use
> MPI_Put's instead of sends and receives.  This worked fine for several
> variable Put's, but now I have one that is causing seg faults. Reading
> through the MPI documentation it is not clear to me if what I am doing
> is permissible or not.  Basically, the question is this - if I have
> defined all of an array as a window on each processor, can I PUT data
> from that array to remote processes at the same time as the remote
> processes are PUTing into the local copy, assuming no overlaps of any of
> the PUTs?
>
> Here are the details if that doesn't make sense.  I have a (Fortran)
> array QF(6,2,N) on each processor, where N could be a very large number
> (100,000). I create a window QFWIN on the entire array on all the
> processors.  I define MPI_Type_indexed "sending" datatypes (QFSND) with
> block lengths of 6 that send from QF(1,1,*), and MPI_Type_indexed
> "receiving" datatypes (QFREC) with block lengths of 6 that receive into
> QF(1,2,*).  Here * is a non-repeating set of integers up to N.  I create
> groups of processors that communicate, where these groups will all
> exchange QF data, PUTing local QF(1,1,*) to remote QF(1,2,*).  So,
> processor 1 is PUTing QF data to processors 2,3,4 at the same time 2,3,4
> are putting their QF data to 1, and so on.  Processors 2,3,4 are PUTing
> into non-overlapping regions of QF(1,2,*) on 1, and 1 is PUTing from
> QF(1,1,*) to 2,3,4, and so on.  So, my calls look like this on each
> processor:
>
> assertion = 0
> call MPI_Win_post(group, assertion, QFWIN, ierr)
> call MPI_Win_start(group, assertion, QFWIN, ierr)
>
> do I=1,neighbors
>    call MPI_Put(QF, 1, QFSND(I), NEIGHBOR(I), 0, 1, QFREC(I), QFWIN, ierr)
> end do
>
> call MPI_Win_complete(QFWIN,ierr)
> call MPI_Win_wait(QFWIN,ierr)
>
> Note I did define QFREC locally on each processor to properly represent
> where the data was going on the remote processors.  The error value
> ierr=0 after MPI_Win_post, MPI_Win_start, MPI_Put, and MPI_Win_complete,
> and the code seg faults in MPI_Win_wait.
>
> I'm using MPICH2 1.3.1 on Mac OS X 10.6.5, built with Intel XE (12.0)
> compilers, and running on just 2 (internal) processors of my Mac Pro.
> The code ran normally with this configuration up until the point I put
> the above in.  Several other communications with MPI_Put similar to the
> above work fine, though the windows are only on a subset of the
> communicated array, and the origin data is being PUT from part of the
> array that is not within the window.
>
> _____________________________________________________
> Matt


