[mpich-discuss] survived nodes

Harun Raşit ER harunrasiter at gmail.com
Thu Oct 28 10:31:18 CDT 2010


Thank you all for your help. It does work :)

On Wed, Oct 27, 2010 at 10:19 PM, Darius Buntinas <buntinas at mcs.anl.gov> wrote:

> As Bill mentioned, the only way to implement a timeout is to post an Irecv,
> then keep polling the request until it completes or you time out, in which
> case you would cancel the request.
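
The poll-until-deadline pattern described above might look roughly like the
sketch below. So that it compiles without an MPI installation, the MPI calls
are hidden behind hypothetical callbacks: test_fn stands in for a wrapper
around MPI_Test on the request returned by MPI_Irecv, and cancel_fn for a
wrapper around MPI_Cancel. The names wait_with_timeout, fake_request,
fake_test and fake_cancel are made up for illustration and are not part of
MPICH.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical callback types. In an MPI program, `test` would wrap
 * MPI_Test and `cancel` would wrap MPI_Cancel on the pending request. */
typedef bool (*test_fn)(void *request);
typedef void (*cancel_fn)(void *request);

/* Monotonic time in seconds (MPI_Wtime could play this role in MPI code). */
static double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Poll `request` until it completes or `timeout_s` seconds elapse.
 * Returns true if it completed, false if it timed out and was cancelled. */
bool wait_with_timeout(void *request, test_fn test, cancel_fn cancel,
                       double timeout_s)
{
    assert(test != NULL && cancel != NULL);

    const double deadline = now_seconds() + timeout_s;
    /* Sleep 1 ms between polls so the loop does not spin a full core. */
    const struct timespec pause = { 0, 1000000L };

    for (;;) {
        if (test(request))
            return true;            /* reply arrived in time */
        if (now_seconds() >= deadline)
            break;                  /* give up on the peer */
        nanosleep(&pause, NULL);
    }
    cancel(request);
    return false;
}

/* Stand-in for a real request, used only to exercise the loop:
 * completes after `polls_left` unsuccessful polls, never if negative. */
struct fake_request {
    int  polls_left;
    bool cancelled;
};

bool fake_test(void *request)
{
    struct fake_request *r = request;
    if (r->polls_left < 0)
        return false;               /* simulates a dead peer */
    if (r->polls_left == 0)
        return true;
    r->polls_left--;
    return false;
}

void fake_cancel(void *request)
{
    struct fake_request *r = request;
    r->cancelled = true;
}
```

With the 3-second limit from the original question, the call would be
wait_with_timeout(&req, test, cancel, 3.0). In real MPI code, a request that
has been passed to MPI_Cancel still has to be completed with MPI_Wait (or
freed), and MPI_Test_cancelled reports whether the cancel took effect.
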
>
> MPICH2 1.3 has features to allow the application to survive process and
> communication failures.  Use the -disable-auto-cleanup flag for mpiexec to
> prevent it from killing your entire job when a process fails.  Then set the
> error handler to MPI_ERRORS_RETURN, so that the MPI functions will return an
> error code rather than aborting when a fault happens.  If you do this,
> you'll be able to continue communicating with non-failed processes.  The
> only catch is that you can't use collective operations on communicators that
> contain failed processes.
>
> FWIW, the MPI Forum is working on defining the behavior of the MPI library
> when faults occur.
>
> I hope this helps.
>
> -d
>
> On Oct 27, 2010, at 8:17 AM, Harun Raşit ER wrote:
>
> > I have 2 nodes. One of them sends a message to the other and waits for a
> > reply. But the other node is not alive (maybe the network has crashed). So I
> > want to wait for the reply for just 3 seconds. After that, it should report
> > that the other node has crashed and go on with its task. But I cannot
> > achieve this simple task :)
> >
> > Please help!
> > _______________________________________________
> > mpich-discuss mailing list
> > mpich-discuss at mcs.anl.gov
> > https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss