[mpich-discuss] disable-auto-cleanup send/receive example
Rob Stewart
robstewart57 at googlemail.com
Sat Nov 5 08:13:16 CDT 2011
On 11/03/2011 07:47 PM, Darius Buntinas wrote:
>
> In MPICH2, only already posted wildcard receives will be completed with an error when a failure is detected. You should be able to post wildcard receives after the failure is detected.
OK, yes, I agree with the approach taken in MPICH2 here. I do not fully
understand why an MPI implementation would refuse future
MPI_Recv(_,_,_,MPI_ANY_SOURCE,_,comm,_) calls unless the process manager
knew for certain that no ranks remained for MPI_ANY_SOURCE to match.
I've tried to reproduce the behaviour you describe in MPICH2, but I am
unable to reconcile your description with what I observe.
Every node in my example below forks the mpi_recv(buf,size) function,
which, as you can see, loops forever. This means that every node is
listening for messages and, when one arrives, does something useful with
the buffer.
Here's the code:
http://pastebin.com/n1bPWztx
However, the combination of sleeps and MPI_Iprobe calls doesn't seem to
be enough. When I kill a process at some point during the execution,
every node reports to stdout roughly every 6 seconds:
6: MPI_Recv failed
4: MPI_Recv failed
9: MPI_Recv failed
..
Is this not what you would have expected, or is there a reason why
MPI_Recv with MPI_ANY_SOURCE fails every time once a process has been killed?
thanks,
--
Rob Stewart