[mpich-discuss] disable-auto-cleanup send/receive example

Rob Stewart robstewart57 at googlemail.com
Sat Nov 5 08:13:16 CDT 2011


On 11/03/2011 07:47 PM, Darius Buntinas wrote:
>
> In MPICH2, only already posted wildcard receives will be completed with an error when a failure is detected.  You should be able to post wildcard receives after the failure is detected.

OK, yes, I agree with the approach taken in mpich2 for this. I do not
fully understand why an MPI implementation would refuse future
MPI_Recv(_,_,_,MPI_ANY_SOURCE,_,comm,_) calls unless the process manager
knew for certain that the set of ranks matching MPI_ANY_SOURCE was empty.

I have now tried to reproduce the behaviour you describe for such cases
in mpich2, but what I observe does not match your description.

Every node in my example below forks the mpi_recv(buf,size) function,
which, as you can see, loops forever. This means that every node is
listening for messages and, when one arrives, does something useful with
the buffer.

Here's the code:

http://pastebin.com/n1bPWztx
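
The pastebin code is not reproduced here. As a rough, MPI-free sketch of
the loop shape described above (poll, receive, report failures), the
following C fragment takes the probe and receive operations as callbacks.
Every name in it (recv_loop, probe_fn, recv_fn, fake_t, max_polls,
max_failures) is hypothetical and not taken from the original program,
which loops forever rather than stopping. In real code the callbacks would
presumably wrap MPI_Iprobe and MPI_Recv with MPI_ANY_SOURCE, on a
communicator whose error handler is MPI_ERRORS_RETURN so that failures
come back as return codes. The fake transport only exercises the control
flow and says nothing about how mpich2 itself behaves.

```c
#include <stdio.h>

/* Hypothetical callback types. Both return 0 on success, non-zero on error.
 * In an MPI program they would wrap MPI_Iprobe and MPI_Recv. */
typedef int (*probe_fn)(void *ctx, int *has_msg);
typedef int (*recv_fn)(void *ctx, char *buf, int size);

/* Poll for messages and receive them into buf.
 * Returns the number of messages received. Unlike the loop described in
 * the post, this one stops after max_polls iterations or after
 * max_failures consecutive errors, so repeated failures end the loop. */
int recv_loop(int rank, probe_fn probe, recv_fn recv, void *ctx,
              char *buf, int size, int max_polls, int max_failures)
{
    int received = 0;
    int failures = 0;

    for (int i = 0; i < max_polls && failures < max_failures; i++) {
        int has_msg = 0;

        if (probe(ctx, &has_msg) != 0) {
            failures++;            /* the probe itself reported an error */
            continue;
        }
        if (!has_msg) {
            continue;              /* nothing pending; a real loop would sleep here */
        }
        if (recv(ctx, buf, size) != 0) {
            printf("%d: MPI_Recv failed\n", rank);
            failures++;
            continue;
        }
        failures = 0;              /* a successful receive resets the counter */
        received++;                /* "do something useful with the buffer" goes here */
    }
    return received;
}

/* Fake transport, for demonstration only: delivers `pending` messages and,
 * if `peer_dead` is set, makes every later receive fail. */
typedef struct {
    int pending;
    int peer_dead;
} fake_t;

int fake_probe(void *ctx, int *has_msg)
{
    fake_t *t = ctx;
    *has_msg = (t->pending > 0 || t->peer_dead);
    return 0;
}

int fake_recv(void *ctx, char *buf, int size)
{
    fake_t *t = ctx;
    if (t->pending > 0) {
        t->pending--;
        snprintf(buf, (size_t)size, "hello");
        return 0;
    }
    return 1;                      /* simulated receive failure */
}
```

With fake_t t = {2, 1} and a 16-byte buffer, the call
recv_loop(6, fake_probe, fake_recv, &t, buf, 16, 100, 3) returns 2 and
prints "6: MPI_Recv failed" three times before giving up. Bounding the
consecutive failures is a design choice of this sketch; it makes a
receive that fails on every attempt easy to tell apart from a single
failure.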

However, the combination of sleeps and MPI_Iprobe calls doesn't seem to
be enough. When I kill a process at some point during the execution,
every node reports to stdout roughly every 6 seconds:

6: MPI_Recv failed
4: MPI_Recv failed
9: MPI_Recv failed
..

Is this not what you would have expected? Or is there a reason why
MPI_Recv with MPI_ANY_SOURCE fails every time after a process has been killed?

thanks,

-- 
Rob Stewart

