[mpich-discuss] Assertion failure from too many MPI_Gets between fences

Dave Goodell goodell at mcs.anl.gov
Fri Jan 7 17:55:28 CST 2011


On Jan 7, 2011, at 5:26 PM CST, Jeremiah Willcock wrote:

> On Fri, 7 Jan 2011, Dave Goodell wrote:
> 
>> MPI-2.2, page 339, lines 13-14: "These operations are nonblocking: the call initiates the transfer, but the transfer may continue after the call returns."
>> 
>> This language is weaker than I would like, because the clarifying statements after the colon don't say that the call cannot block, implicitly watering down the natural MPI meaning of "nonblocking".  But I think the intent is clear: the call should not block the user waiting on the action of another process.  After further thought I can't come up with any realistic example where a blocking-for-flow-control MPI_Get causes a deadlock, but I think the behavior is still intended to be disallowed by the standard.
> 
> I think that the progress clarification at the top of page 371 of MPI 2.2 (end of section 11.7.2) would cover the case in which some one-sided operations blocked for flow control.  Or could there be deadlocks even with MPI progress semantics?

As I said, I couldn't come up with a _realistic_ program where this would result in a deadlock.  But an unrealistic program is exactly the sort of thing that is discussed in the second paragraph of that Rationale passage.  Something ridiculous like:

-----8<-----
if (rank == 0) {
    /* issue enough gets to exhaust any plausible flow-control window */
    for (int i = 0; i < 1000000; i++) {
        MPI_Get(..., /*target_rank=*/1, ...);
    }
    send_on_socket_to_rank_1(...);    /* non-MPI synchronization */
    MPI_Win_fence(...);
}
else {
    /* do some compute or even nothing here */
    blocking_socket_recv_from_rank_0(...);    /* non-MPI synchronization */
    MPI_Win_fence(...);
}
-----8<-----

Under my "blocking for flow control is not allowed" interpretation, the user could assume this program won't deadlock.  Under the opposing interpretation it easily could deadlock if the implementation does not provide asynchronous progress (which is a valid and common implementation choice): rank 0 stalls inside MPI_Get waiting for rank 1 to enter the MPI library, while rank 1 stalls in the socket recv waiting for rank 0, so neither process ever reaches the fence.

Instead of a socket send/recv pair, you could stick any sort of non-MPI synchronizing operation in there: shared-memory barriers or mutexes, UNIX FIFOs, some sort of non-MPI file I/O, etc.  I don't consider any of these cases to be practical or realistic MPI programs, but they do illustrate the point.
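
For concreteness, here is one way to flesh that sketch out into a complete, compilable program, using a UNIX FIFO as the non-MPI synchronization point (so both ranks need to land on the same node; the FIFO path and the get count are just illustrative values I picked, not anything prescribed).  Under an implementation that blocks MPI_Get for flow control and provides no asynchronous progress, rank 0 can stall inside the get loop while rank 1 is stuck in open()/read() on the FIFO, and neither process ever reaches the fence:

-----8<-----
/* Compile: mpicc -o fence_deadlock fence_deadlock.c
 * Run (both ranks on one node, since the FIFO is a local file):
 *   mpiexec -n 2 ./fence_deadlock
 */
#include <mpi.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/stat.h>

#define NGETS 1000000                         /* illustrative count */
#define FIFO_PATH "/tmp/fence_deadlock_fifo"  /* illustrative path */

int main(int argc, char **argv)
{
    int rank;
    double src = 0.0, dst;
    MPI_Win win;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0)
        mkfifo(FIFO_PATH, 0600);
    MPI_Barrier(MPI_COMM_WORLD); /* FIFO must exist before rank 1 opens it */

    /* each rank exposes one double through the window */
    MPI_Win_create(&src, sizeof(double), sizeof(double),
                   MPI_INFO_NULL, MPI_COMM_WORLD, &win);
    MPI_Win_fence(0, win); /* open the access epoch */

    if (rank == 0) {
        /* enough gets to exhaust any plausible flow-control window */
        for (int i = 0; i < NGETS; i++)
            MPI_Get(&dst, 1, MPI_DOUBLE, /*target_rank=*/1,
                    /*target_disp=*/0, 1, MPI_DOUBLE, win);

        /* non-MPI synchronization: unblock rank 1 via the FIFO;
         * open(O_WRONLY) blocks until rank 1 opens the read side */
        int fd = open(FIFO_PATH, O_WRONLY);
        char c = 'x';
        write(fd, &c, 1);
        close(fd);

        MPI_Win_fence(0, win);
    } else {
        /* block on the FIFO before ever re-entering the MPI library */
        int fd = open(FIFO_PATH, O_RDONLY);
        char c;
        read(fd, &c, 1);
        close(fd);

        MPI_Win_fence(0, win);
    }

    MPI_Win_free(&win);
    if (rank == 0)
        unlink(FIFO_PATH);
    MPI_Finalize();
    return 0;
}
-----8<-----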

Now, all that said, we could probably offer "block user RMA calls for flow control" as some sort of non-standard-compliant option that you could turn on via an environment variable in MPICH2.

-Dave


