[mpich2-dev] mpich 1.1 beta: details of MPI_Win_fence semantics

Rajeev Thakur thakur at mcs.anl.gov
Tue Apr 14 12:53:22 CDT 2009


Doug,
     The text hasn't changed from 2.0, but 2.1 is the spec to refer to now.
It says

"The call completes an RMA access epoch if it was preceded by another fence
call and the local process issued RMA communication calls on win between
these two calls."

"The call starts an RMA access epoch if it is followed by another fence call
and by RMA communication calls issued between these two fence calls."

So only if the fence is followed by a put/get and another fence does it
start an epoch. And only if the fence was preceded by a fence and put/get
does it complete an epoch.
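
As an illustration only, the two quoted sentences can be sketched as a small
classifier in plain C. This is not MPICH2 code and makes no MPI calls: the
names event_kind, EV_FENCE, EV_RMA, FENCE_COMPLETES, FENCE_STARTS and
classify_fence are hypothetical, and the model assumes a single process, a
single window, and only fence and put/get events (no lock/unlock or
post/start).

```c
#include <stddef.h>

/* Hypothetical event kinds, as seen by one process on one window. */
typedef enum { EV_FENCE, EV_RMA } event_kind;

/* Bit flags for what a fence call does (illustrative names only). */
enum { FENCE_COMPLETES = 1, FENCE_STARTS = 2 };

/*
 * Classify the fence at index i of events[0..n-1] using the two rules
 * quoted above. Returns a bitwise OR of FENCE_COMPLETES and FENCE_STARTS
 * (possibly 0), or -1 if events[i] is not a fence.
 */
int classify_fence(const event_kind *events, size_t n, size_t i)
{
    size_t j;
    int saw_rma;
    int result = 0;

    if (i >= n || events[i] != EV_FENCE)
        return -1;

    /* Completes an epoch: preceded by a fence with RMA calls in between. */
    saw_rma = 0;
    for (j = i; j-- > 0; ) {
        if (events[j] == EV_FENCE) {
            if (saw_rma)
                result |= FENCE_COMPLETES;
            break;
        }
        saw_rma = 1; /* the only other event kind in this model is EV_RMA */
    }

    /* Starts an epoch: followed by RMA calls and then another fence. */
    saw_rma = 0;
    for (j = i + 1; j < n; j++) {
        if (events[j] == EV_FENCE) {
            if (saw_rma)
                result |= FENCE_STARTS;
            break;
        }
        saw_rma = 1;
    }

    return result;
}
```

For the sequence fence, put, fence, put, fence, this returns FENCE_STARTS for
the first fence, FENCE_COMPLETES | FENCE_STARTS for the middle one, and
FENCE_COMPLETES for the last. Note that the classification of a fence depends
on what follows it, so an implementation cannot know at the time of the call
whether that fence starts an epoch.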

The asserts are not required, but may improve performance.

Changing synchronization mode is covered on pg 352 ln 12-19.

Rajeev


> -----Original Message-----
> From: mpich2-dev-bounces at mcs.anl.gov 
> [mailto:mpich2-dev-bounces at mcs.anl.gov] On Behalf Of Douglas Miller
> Sent: Tuesday, April 14, 2009 12:33 PM
> To: mpich2-dev at mcs.anl.gov
> Subject: Re: [mpich2-dev] mpich 1.1 beta: details of MPI_Win_fence semantics
> 
> The text on pg 338 of the 2.1 specification is essentially unchanged from
> earlier docs, and still seems too vague. It should state explicitly that a
> single instance of a call to fence both ends the previous epoch and starts
> a new epoch. But supporting this seems problematic.
> 
> So, you are saying that it is valid to do the following:
> 
> [fence - 0]
> [RMA operations]
> [fence - 0]
> [RMA operations]
> [fence - 0]
> [RMA operations]
> [fence - 0]
> 
> Does this mean that one is required to use NOPRECEDE and NOSUCCEED in order
> to avoid RMA_SYNC errors when switching to/from another synchronization
> method after/before fence? Or else implementations must not do error
> checking for the synchronization primitives? This seems like it's forcing
> low-quality implementations.
> 
> In the new test "mixedsync" it does:
> 
> [lock]
> [RMA operations]
> [unlock]
> [fence - 0]
> [RMA operations]
> [fence - 0]
> <repeat>
> 
> In this case, the second fence would be followed by a lock (looping back to
> the beginning).  Does this mean we must also allow lock-unlock while a
> fence epoch is active? It seems that there can be little or no error
> checking on synchronization primitives at all, or else a very (overly)
> complex internal state model is needed. In the above case, should the
> second fence start a "tentative" epoch which is then released/converted
> when the lock happens? Or should it allow both lock-rma-unlock and plain
> rma within the fence?
> 
> Is there better documentation of this interaction, or better examples
> showing how this should work?
> 
> thanks,
> doug miller
> 
> 
> 
> From: "Rajeev Thakur" <thakur at mcs.anl.gov>
> Sent by: mpich2-dev-bounces at mcs.anl.gov
> Date: 04/14/2009 10:40 AM
> Please respond to: mpich2-dev at mcs.anl.gov
> To: <mpich2-dev at mcs.anl.gov>
> Subject: Re: [mpich2-dev] mpich 1.1 beta: details of MPI_Win_fence semantics
> 
> 
> 
> 
> Doug,
>      A call to fence both completes the previous epoch (if there was one)
> and starts the next epoch, as described on pg 338 of MPI 2.1. In other
> words, the sequence fence-put-fence-put-fence is allowed. MPICH2 handles
> this case. That is why there are the asserts MPI_MODE_NOPRECEDE and
> MPI_MODE_NOSUCCEED for the user to indicate otherwise. Unless the user
> passes these asserts or it is the very first fence, the implementation
> should assume that a given fence can be preceded by puts/gets and followed
> by puts/gets.
> 
> Rajeev
> 
> > -----Original Message-----
> > From: mpich2-dev-bounces at mcs.anl.gov
> > [mailto:mpich2-dev-bounces at mcs.anl.gov] On Behalf Of Douglas Miller
> > Sent: Tuesday, April 14, 2009 9:53 AM
> > To: mpich2-dev at mcs.anl.gov
> > Subject: [mpich2-dev] mpich 1.1 beta: details of MPI_Win_fence semantics
> >
> >
> > Some new tests in mpich 1.1 beta use MPI_Win_fence in an unexpected (to me)
> > fashion. They do the following:
> >
> > [fence - NOPRECEDE]
> > [RMA operations]
> > [fence - 0]
> > [RMA operations]
> > [fence - 0]
> > [RMA operations]
> > [fence - NOSUCCEED]
> >
> > I was assuming that MPI_Win_fence was *either* starting or completing an
> > epoch (i.e. there were always matched-pairs of fence calls), not both. But
> > this usage implies that there is an expectation that a single call to fence
> > can *both* end an epoch and start a new epoch. The specification is vague
> > at best.
> >
> > The problem with the above usage is that the middle calls to fence create a
> > situation where the implementation cannot be certain whether it is
> > operating within a fence epoch or not. I'm not sure how to implement any
> > sort of error checking to cover this case, as the user could follow a fence
> > with either RMA calls or some other synchronization primitives (POST-START
> > or LOCK) or even protected local access. It was my understanding that the
> > ASSERT flags were meant to be hints to the implementation and not required
> > by the caller for proper operation.
> >
> > Can you help clear this up?  Is the test wrong or are we actually required
> > to handle this situation?
> >
> > thanks,
> >
> > doug miller
> >
> >
> 
> 
> 
> 


