[mpich-discuss] Relative ordering of MPI_Iprobe()s and MPI_Barrier()s

Edric Ellis Edric.Ellis at mathworks.co.uk
Tue Aug 17 10:59:20 CDT 2010


Hi mpich2-discuss,

We're in the process of moving to MPICH2-1.2.1p1 using the SMPD/nemesis variant (on Linux only; on Windows, we're waiting for a fix to ticket #895), and we've found that its behaviour differs from that of the sock variant. I'm not sure if this is a real bug, or if I've missed something in the MPI standard. The test for our wrapper around MPI_Barrier() essentially proceeds as follows (the attached C test case usually shows the problem when run with 10 processes). Each process does this:


1. Call MPI_Send() to each other process in turn with a tiny payload (assuming that this will be sent in the "eager" mode).

2. Call MPI_Barrier().

3. Check that MPI_Iprobe() reports a message ready to be received from each other process.

With the sock variant, this works as I expect: on each process, MPI_Iprobe() indicates that there is indeed a message waiting from each other process. With nemesis, this isn't always the case; sometimes multiple calls to MPI_Iprobe() are required. (Could this be related to ticket #1062?)

I couldn't see in the MPI standard where the "expected" behaviour of the above might be specified, but it's possible that I've missed something.

I can see several options for where a problem might exist:


1. MPI doesn't actually specify that these MPI_Iprobe()s should definitely return "true".

2. The nemesis channel isn't preserving the ordering between MPI_Barrier() and pt2pt communications in the way I expect.

As it happens, our usage of MPI_Iprobe() is basically restricted to our test code, so we could modify our tests not to rely on the old behaviour, but we'd like to understand better where the problem is.

Cheers,

Edric.

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: testprobe.cpp
URL: <http://lists.mcs.anl.gov/pipermail/mpich-discuss/attachments/20100817/a1cb40b8/attachment.diff>

