[mpich-discuss] Don't crash on node failures

Pavan Balaji balaji at mcs.anl.gov
Wed Apr 14 19:34:01 CDT 2010


On 04/14/2010 03:13 AM, Jürgen Kaiser wrote:
> Can I force MPI to not abort the whole job when a node crashes? I would
> like to let the remaining MPI-processes perform some action in that case
> and then proceed.

This support is not currently available in MPICH2, but we are actively 
working on it. We hope to have it in the 1.3 release of MPICH2, though 
it might be delayed to the next major version.

  -- Pavan

-- 
Pavan Balaji
http://www.mcs.anl.gov/~balaji

