[mpich-discuss] Relative ordering of MPI_Iprobe()s and MPI_Barrier()s

Vo, Anh vtqanh at gmail.com
Tue Aug 17 18:35:11 CDT 2010


I think as far as the standard is concerned, Dave is correct here. If you
assume that the MPI_Send message goes out in "eager" mode (i.e., buffered),
there's no guarantee that the data has even moved out of the sender's buffer
after the MPI_Barrier. Thus, the MPI_Iprobe calls are not guaranteed to
return true. If you busy-wait on the MPI_Iprobe, the standard requires that
it eventually return true (since the sends have been initiated).

In the unlikely case where the eager limit is 0 (i.e., no local buffer for
MPI_Send), your program will deadlock.
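
For concreteness, here is the kind of busy-wait loop that the standard does
guarantee will complete (just a minimal two-rank sketch I made up for
illustration, with an invented payload and tag; it is not Edric's attached test):

/* Rank 0 sends one int to rank 1; rank 1 busy-waits on MPI_Iprobe.
 * A single MPI_Iprobe may legally return flag == 0, but the loop must
 * eventually see the message once the send has been started. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (size >= 2) {
        int payload = 42;
        if (rank == 0) {
            MPI_Send(&payload, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            int flag = 0;
            while (!flag)   /* busy-wait: guaranteed to terminate */
                MPI_Iprobe(0, 0, MPI_COMM_WORLD, &flag, MPI_STATUS_IGNORE);
            MPI_Recv(&payload, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            printf("rank 1 sees the message: %d\n", payload);
        }
    }

    MPI_Finalize();
    return 0;
}

Even with an eager limit of 0, this particular loop is fine: the receiver
eventually matches the send, so the sender's MPI_Send can complete.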

-Anh
On 8/17/10 9:06 AM, "Dave Goodell" <goodell at mcs.anl.gov> wrote:

> Hi Edric,
> 
> My understanding of the MPI Standard is that case (1) is true.  That is,
> point-to-point and collective communication occur in separate contexts and
> don't interfere with each other, except insofar as many pt2pt and all
> collective calls may block a process.  See MPI-2.2 p.133:18 and p.188:43.  So
> while the MPI_Barrier in your example does ensure that the MPI_Send calls have
> been posted, it guarantees nothing about when the probing processes will
> begin to "see" those sends.
> 
> The MPI standard does require that a busy-waiting MPI_Iprobe loop will
> eventually see the incoming message (MPI-2.2 p.66:24).  But this seems to be
> the behavior that you are seeing, so I think we are conforming to the standard
> here.
> 
> I suspect that the implementation reason for the behavior you are seeing in
> nemesis is that we don't poll the network (TCP in this case) as often as
> shared memory when we are making progress.  But I haven't played with your
> example code at all yet.
> 
> BTW, your test program could theoretically deadlock.  If your MPI_Send calls
> blocked until the receiver arrived at some sort of reception call (Recv,
> Probe, etc.), then all of your processes would be stuck in MPI_Send on line 15.
> MPI_Isend is a safer way to code this.  "Eager" sending is not required by the
> MPI standard, even though it is extremely common practice.
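> 
> Something along these lines would avoid the problem (an untested sketch of
> mine, not your attached code; the buffer names and the tag are invented):
> 
> /* Post all sends with MPI_Isend so that no rank can block before the
>  * barrier, then busy-wait on MPI_Iprobe, which the standard does
>  * guarantee will eventually succeed. */
> #include <mpi.h>
> #include <stdlib.h>
> 
> int main(int argc, char **argv)
> {
>     int rank, size, i;
>     MPI_Init(&argc, &argv);
>     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>     MPI_Comm_size(MPI_COMM_WORLD, &size);
> 
>     int *sendbuf = (int *) malloc(size * sizeof(int));
>     MPI_Request *reqs = (MPI_Request *) malloc(size * sizeof(MPI_Request));
>     int nreqs = 0;
> 
>     for (i = 0; i < size; ++i) {
>         if (i == rank)
>             continue;
>         sendbuf[i] = rank;   /* one buffer per outstanding send */
>         MPI_Isend(&sendbuf[i], 1, MPI_INT, i, 0, MPI_COMM_WORLD,
>                   &reqs[nreqs++]);
>     }
> 
>     MPI_Barrier(MPI_COMM_WORLD);
> 
>     for (i = 0; i < size; ++i) {
>         if (i == rank)
>             continue;
>         int flag = 0, recvd;
>         while (!flag)   /* busy-wait until the message is visible */
>             MPI_Iprobe(i, 0, MPI_COMM_WORLD, &flag, MPI_STATUS_IGNORE);
>         MPI_Recv(&recvd, 1, MPI_INT, i, 0, MPI_COMM_WORLD,
>                  MPI_STATUS_IGNORE);
>     }
> 
>     MPI_Waitall(nreqs, reqs, MPI_STATUSES_IGNORE);
>     free(reqs);
>     free(sendbuf);
>     MPI_Finalize();
>     return 0;
> }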
> 
> -Dave
> 
> On Aug 17, 2010, at 10:59 AM CDT, Edric Ellis wrote:
> 
>> Hi mpich2-discuss,
>>  
>> We're in the process of moving to MPICH2-1.2.1p1 using the SMPD/nemesis
>> variant (on Linux only - on Windows, we're waiting for a fix to ticket #895),
>> and we've found a discrepancy in the behaviour compared to the sock variant.
>> I'm not sure if this is a real bug, or if I've missed something in the MPI
>> standard. Our test for our wrapper around MPI_Barrier() essentially proceeds
>> as follows (see attached for a C test case which usually shows this problem
>> when running with 10 processes; a rough sketch follows the steps below).
>> Each process does this:
>>  
>> 1. Call MPI_Send() to each other process in turn with a tiny payload
>>    (assuming that this will be sent in the "eager" mode).
>> 2. MPI_Barrier()
>> 3. Check that MPI_Iprobe() indicates a message ready to receive from
>>    each other process
>>  
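>> In outline, each process does roughly the following (this is just a sketch;
>> the attached testprobe.cpp is the actual test case and may differ in detail):
>>
>> #include <mpi.h>
>> #include <stdio.h>
>>
>> int main(int argc, char **argv)
>> {
>>     int rank, size, i, payload = 1;
>>     MPI_Init(&argc, &argv);
>>     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>>     MPI_Comm_size(MPI_COMM_WORLD, &size);
>>
>>     /* 1. tiny send to every other rank (hoping for eager delivery) */
>>     for (i = 0; i < size; ++i)
>>         if (i != rank)
>>             MPI_Send(&payload, 1, MPI_INT, i, 0, MPI_COMM_WORLD);
>>
>>     /* 2. barrier */
>>     MPI_Barrier(MPI_COMM_WORLD);
>>
>>     /* 3. expect one MPI_Iprobe per peer to report a waiting message */
>>     for (i = 0; i < size; ++i) {
>>         if (i == rank)
>>             continue;
>>         int flag = 0, recvd;
>>         MPI_Iprobe(i, 0, MPI_COMM_WORLD, &flag, MPI_STATUS_IGNORE);
>>         if (!flag)
>>             printf("rank %d: no message visible from %d yet\n", rank, i);
>>         /* drain the message so MPI_Finalize is clean */
>>         MPI_Recv(&recvd, 1, MPI_INT, i, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
>>     }
>>
>>     MPI_Finalize();
>>     return 0;
>> }
>>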
>> With the sock variant, this works as I expect - each process gets a return
>> from MPI_Iprobe indicating that there is indeed a message waiting from each
>> other process. With nemesis, this isn't always the case - sometimes multiple
>> calls to MPI_Iprobe are required. (Could this be related to ticket #1062?).
>>  
>> I couldn't see in the MPI standard where the "expected" behaviour of the
>> above might be specified, but it's possible that I've missed something.
>>  
>> I can see several options for where a problem might exist:
>>  
>> 1. MPI doesn't actually specify that these MPI_Iprobe()s should
>>    definitely return "true"
>> 2. The nemesis channel isn't preserving the ordering between
>>    MPI_Barrier() and pt2pt communications in the way I expect
>>  
>> As it happens, our usage of MPI_Iprobe() is basically restricted to our test
>> code, so we could modify our tests not to rely on the old behaviour, but we'd
>> like to understand better where the problem is.
>>  
>> Cheers,
>>  
>> Edric.
>>  
>> <testprobe.cpp>
> 



