[mpich-discuss] failed processes

Darius Buntinas buntinas at mcs.anl.gov
Thu Nov 4 10:34:18 CDT 2010


To set the error handler, do this after MPI_Init:
    MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRORS_RETURN);
This will set the error handler for MPI_COMM_WORLD and any communicators created from it.  You can read more about error handling in Section 8.3 of the MPI-2.2 standard: http://www.mpi-forum.org/docs/docs.html

Jayesh:  Do you know how to disable the "auto-cleanup" feature in smpd?

-d

On Nov 4, 2010, at 1:34 AM, Harun Raşit ER wrote:

> Darius, thanks for your help. I am using the Windows platform and am new to MPI, so I don't know how to pass "-disable-auto-cleanup" to mpiexec. How can I do that? Can you explain it and send a simple code sample that sets MPI_ERRORS_RETURN?
> 
> On Wed, Nov 3, 2010 at 6:29 PM, Darius Buntinas <buntinas at mcs.anl.gov> wrote:
> 
> Hi Harun,
> 
> If you use MPICH2 1.3, and pass the -disable-auto-cleanup parameter to mpiexec, then your app will not automatically be killed when a process dies before calling MPI_Finalize.  You'll then need to set the default error handler in MPI to MPI_ERRORS_RETURN, so that the application won't abort when an error is detected.
> 
> The MPICH2 library should allow you to continue communicating with other processes if a process dies.  However, collective operations on a communicator that includes a dead process will most likely hang some processes.
> 
> I hope this helps.
> 
> -d
> 
> On Nov 3, 2010, at 4:17 AM, Harun Raşit ER wrote:
> 
> > When one of the processes fails, my whole job is aborted. There must be a solution that I cannot find! I would like to continue without the failed process and finish the job with the remaining processes. Is there any idea or solution?
> >
> > Thanks for your help.
> > _______________________________________________
> > mpich-discuss mailing list
> > mpich-discuss at mcs.anl.gov
> > https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss
> 


