[mpich-discuss] Fwd: [mpich2-maint] Relative ordering of MPI_Iprobe()s and MPI_Barrier()s
Dave Goodell
goodell at mcs.anl.gov
Tue Aug 17 10:47:57 CDT 2010
[moving Edric's message over to mpich-discuss@ before responding]
Begin forwarded message:
> From: Edric Ellis <Edric.Ellis at mathworks.co.uk>
> Date: August 17, 2010 10:41:08 AM CDT
> To: "mpich2-maint at mcs.anl.gov" <mpich2-maint at mcs.anl.gov>
> Subject: [mpich2-maint] Relative ordering of MPI_Iprobe()s and MPI_Barrier()s
>
> Hi mpich2-maint,
>
> We’re in the process of moving to MPICH2-1.2.1p1 using the SMPD/nemesis variant (on Linux only - on Windows, we’re waiting for a fix to ticket #895), and we’ve found a discrepancy in the behaviour compared to the sock variant. I’m not sure if this is a real bug, or if I’ve missed something in the MPI standard. Our test for our wrapper around MPI_Barrier() essentially proceeds as follows (see attached for a C test case which usually shows this problem when running with 10 processes). Each process does this:
>
> 1. Call MPI_Send() to each other process in turn with a tiny payload (assuming that this will be sent in the “eager” mode).
> 2. MPI_Barrier()
> 3. Check that MPI_Iprobe() indicates a message ready to receive from each other process
>
> With the sock variant, this works as I expect - each process gets a return from MPI_Iprobe() indicating that there is indeed a message waiting from each other process. With nemesis, this isn’t always the case - sometimes multiple calls to MPI_Iprobe() are required. (Could this be related to ticket #1062?)
>
> I couldn’t see in the MPI standard where the “expected” behaviour of the above might be specified, but it’s possible that I’ve missed something.
>
> I can see several options for where a problem might exist:
>
> 1. MPI doesn’t actually specify that these MPI_Iprobe()s should definitely return “true”
> 2. The nemesis channel isn’t preserving the ordering between MPI_Barrier() and pt2pt communications in the way I expect
>
> As it happens, our usage of MPI_Iprobe() is basically restricted to our test code, so we could modify our tests not to rely on the old behaviour, but we’d like to understand better where the problem is.
>
> Cheers,
>
> Edric.
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: testprobe.cpp
URL: <http://lists.mcs.anl.gov/pipermail/mpich-discuss/attachments/20100817/6ed3e353/attachment.diff>