[mpich-discuss] disable-auto-cleanup send/receive example

Darius Buntinas buntinas at mcs.anl.gov
Mon Nov 7 10:22:26 CST 2011


OK, then there's likely a bug in MPICH.  I'm working on implementing the features described in the MPI Forum proposal.  I'll move the features related to any-source receives to the top of the list.  I can send you a tarball once I have something for you to try.

-d


On Nov 5, 2011, at 8:13 AM, Rob Stewart wrote:

> On 11/03/2011 07:47 PM, Darius Buntinas wrote:
>> 
>> In MPICH2, only already posted wildcard receives will be completed with an error when a failure is detected.  You should be able to post wildcard receives after the failure is detected.
> 
> OK, yes, I agree with the approach taken in mpich2 for this. I do not fully understand why MPI would refuse future MPI_Recv(_,_,_,MPI_ANY_SOURCE,_,comm,_) calls unless the process manager knew for certain that the set of ranks matched by MPI_ANY_SOURCE was empty.
> 
> I have now tried the behaviour you describe for such cases in mpich2, but I am unable to reconcile your description with what I observe.
> 
> Every node in my example below forks the mpi_recv(buf,size) function, which, as you can see, loops forever. This means that every node is listening for messages and, when they arrive, does something useful with the buffer.
> 
> Here's the code:
> 
> http://pastebin.com/n1bPWztx
> 
> However, the combination of sleeps and MPI_Iprobe calls doesn't seem to be enough. When I kill a process at some point during the execution, every node then reports to stdout roughly every 6 seconds:
> 
> 6: MPI_Recv failed
> 4: MPI_Recv failed
> 9: MPI_Recv failed
> ..
> 
> Is this not what you would have expected, or is there a reason for the MPI_Recv with MPI_ANY_SOURCE failing every time, following a killed process?
> 
> thanks,
> 
> -- 
> Rob Stewart
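
The behaviour discussed in this thread (a wildcard receive that was already posted completes with an error when a failure is detected, after which new wildcard receives can be posted) can be sketched as a repost loop. This is only an illustration under stated assumptions, not the code from the pastebin link and not MPICH's implementation. The names recv_with_repost, recv_stats, scripted_recv, any_source_recv and the USE_MPI macro are all hypothetical. The loop gives up after a chosen number of consecutive failures, so a receive that keeps failing does not spin forever.

```c
#include <stddef.h>

/* Hypothetical callback: posts one blocking receive and returns 0 on
 * success or a nonzero error code on failure. */
typedef int (*recv_once_fn)(void *ctx);

typedef struct {
    int received;   /* receives that completed successfully */
    int failed;     /* receives that completed with an error */
} recv_stats;

/* Keep posting receives until `want` messages have arrived or
 * `max_consecutive_failures` receives in a row have failed.
 * A failed receive is simply reposted on the next iteration. */
recv_stats recv_with_repost(recv_once_fn recv_once, void *ctx,
                            int want, int max_consecutive_failures)
{
    recv_stats s = {0, 0};
    int streak = 0;   /* current run of consecutive failures */

    while (s.received < want && streak < max_consecutive_failures) {
        if (recv_once(ctx) == 0) {
            s.received++;
            streak = 0;
        } else {
            s.failed++;
            streak++;
        }
    }
    return s;
}

/* Stand-in for the MPI call so the loop can be exercised without MPI:
 * replays a scripted list of return codes, then reports failure. */
typedef struct {
    const int *codes;
    size_t len;
    size_t next;
} script;

int scripted_recv(void *ctx)
{
    script *s = (script *)ctx;
    if (s->next >= s->len)
        return 1;   /* script exhausted: treat as a failed receive */
    return s->codes[s->next++];
}

#ifdef USE_MPI   /* hypothetical macro: define it when building with mpicc */
#include <mpi.h>

typedef struct {
    void *buf;
    int count;      /* buffer size in bytes */
    MPI_Comm comm;
} any_source_ctx;

/* Assumes MPI_Comm_set_errhandler(comm, MPI_ERRORS_RETURN) was called
 * beforehand, so that MPI_Recv returns an error code instead of aborting. */
int any_source_recv(void *ctx)
{
    any_source_ctx *c = (any_source_ctx *)ctx;
    MPI_Status status;
    int rc = MPI_Recv(c->buf, c->count, MPI_BYTE, MPI_ANY_SOURCE,
                      MPI_ANY_TAG, c->comm, &status);
    return rc == MPI_SUCCESS ? 0 : rc;
}
#endif
```

In a real program the callback passed to recv_with_repost would be any_source_recv, and the job would be started with the -disable-auto-cleanup option named in the subject line so that the surviving processes keep running after one is killed. If every reposted receive fails, as in the output quoted above, the loop stops after max_consecutive_failures attempts and the caller can decide how to proceed.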


