[mpich-discuss] survived nodes

Bill Rankin Bill.Rankin at sas.com
Wed Oct 27 09:54:27 CDT 2010


Post an MPI_Irecv() and then check back later with MPI_Test() to see whether the receive has completed.  If it has not completed by your deadline, assume that something happened on the far end and that you won't be getting a reply.

But also realize that in many cases, crashing a single MPI process may very well cause *all* the processes to abort.

Good luck,

-b


From: mpich-discuss-bounces at mcs.anl.gov [mailto:mpich-discuss-bounces at mcs.anl.gov] On Behalf Of Harun Rasit ER
Sent: Wednesday, October 27, 2010 9:17 AM
To: mpich-discuss at mcs.anl.gov
Subject: [mpich-discuss] survived nodes

I have 2 nodes. One of them sends a message to the other and waits for a reply. But the other node may not be alive (maybe the network has crashed). So I want to wait for the reply for just 3 seconds. After that, the sender should conclude that the other node has crashed and go on with its task. But I cannot achieve this simple task :)

please help!