[mpich-discuss] disable-auto-cleanup send/receive example
Darius Buntinas
buntinas at mcs.anl.gov
Mon Nov 7 10:22:26 CST 2011
OK, then there's likely a bug in MPICH. I'm working on implementing the features described in the MPI Forum proposal, and I'll move the features related to any-source receives to the top of the list. I can send you a tarball once I have something for you to try.
-d
On Nov 5, 2011, at 8:13 AM, Rob Stewart wrote:
> On 11/03/2011 07:47 PM, Darius Buntinas wrote:
>>
>> In MPICH2, only already posted wildcard receives will be completed with an error when a failure is detected. You should be able to post wildcard receives after the failure is detected.
>
> OK, yes I agree with the approach taken in mpich2 for this. I do not understand fully why MPI would not accept future MPI_Recv(_,_,_,MPI_ANY_SOURCE,comm,_) unless the process manager knew for certain that the list of ranks in MPI_ANY_SOURCE was empty.
>
> I've tried the behaviour you describe for such cases in mpich2, but I am unable to reconcile your description with what I observe.
>
> Every node in my example below forks the mpi_recv(buf,size) function, which as you can see, loops forever. This means that every node is listening for messages, and when they arrive - do something useful with the buffer.
>
> Here's the code:
>
> http://pastebin.com/n1bPWztx
>
> However, the combination of sleeps and MPI_Iprobe calls doesn't seem to be enough. When I kill a process at some point during the execution, every node reports to stdout roughly every 6 seconds:
>
> 6: MPI_Recv failed
> 4: MPI_Recv failed
> 9: MPI_Recv failed
> ..
>
> Is this not what you would have expected, or is there a reason for the MPI_Recv with MPI_ANY_SOURCE failing every time, following a killed process?
>
> thanks,
>
> --
> Rob Stewart