[mpich-discuss] failed processes

Harun Raşit ER harunrasiter at gmail.com
Thu Nov 4 01:34:33 CDT 2010


Darius, thanks for your help. I am on Windows and new to MPI, so I don't
know how to pass "-disable-auto-cleanup" to mpiexec. How can I do that?
Could you explain it and send a simple code sample that sets
MPI_ERRORS_RETURN?

On Wed, Nov 3, 2010 at 6:29 PM, Darius Buntinas <buntinas at mcs.anl.gov> wrote:

>
> Hi Harun,
>
> If you use MPICH2 1.3, and pass the -disable-auto-cleanup parameter to
> mpiexec, then your app will not automatically be killed when a process dies
> before calling MPI_Finalize.  You'll then need to set the default error
> handler in MPI to MPI_ERRORS_RETURN, so that the application won't abort
> when an error is detected.
>
> The MPICH2 library should allow you to continue communicating with other
> processes if a process dies.  However, collective operations on a
> communicator that includes a dead process will most likely hang some
> processes.
>
> I hope this helps.
>
> -d
>
> On Nov 3, 2010, at 4:17 AM, Harun Raşit ER wrote:
>
> > When one of the processes fails, my whole job is aborted. There must be
> > a solution, but I cannot find it. I would like to continue without the
> > failed process and finish the job with the remaining processes. Is there
> > any idea or solution?
> >
> > Thanks for your help.
> > _______________________________________________
> > mpich-discuss mailing list
> > mpich-discuss at mcs.anl.gov
> > https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss