[MPICH2-dev] handling fatal errors
David Gingold
david.gingold at sicortex.com
Fri Aug 5 15:51:57 CDT 2005
In an MPICH2 device implementation, what is the right way to handle
fatal errors that cannot easily be attributed to a calling function?
Possible examples of this:
  - An asynchronous progress thread attempts to allocate memory
    but fails.
  - Resource allocation fails in code that was triggered by a user
    MPI call, but that is not particularly related to that call.
  - A similar failure happens in a place where it would be too
    awkward or costly to include code to pass the error back to the
    user.
I spotted a few examples of this sort of thing in the MPICH2 code:

    MPID_Abort(MPIR_Process.comm_world, MPIR_Err_create_code(...), ...);

but I'm not sure whether doing this is considered acceptable practice
or something a device implementation should avoid.
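
To make the question concrete, here is a minimal sketch of the pattern in
plain C: every failure that has no MPI caller to return to goes through one
helper, which hands off to a replaceable hook. This is not MPICH2 code. The
names dev_fatal, dev_malloc_or_fatal, fatal_hook, fatal_hook_abort and
fatal_hook_record are all hypothetical, and the default hook calls abort()
only as a stand-in for whatever job-wide abort the device would really use.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical hook type: stands in for whatever the device uses to
 * terminate the job (the example quoted above uses MPID_Abort). */
typedef void (*fatal_hook_fn)(const char *where, const char *msg);

/* Default hook (assumption): report the error and kill this process. */
static void fatal_hook_abort(const char *where, const char *msg)
{
    fprintf(stderr, "fatal error in %s: %s\n", where, msg);
    abort();
}

/* Recording hook, intended for tests only: note the error and return. */
static int fatal_seen = 0;
static char fatal_last[128];

static void fatal_hook_record(const char *where, const char *msg)
{
    fatal_seen++;
    snprintf(fatal_last, sizeof fatal_last, "%s: %s", where, msg);
}

/* The hook currently in effect; replaceable at start-up or in tests. */
static fatal_hook_fn fatal_hook = fatal_hook_abort;

/* Single funnel for errors that cannot be returned to an MPI caller. */
static void dev_fatal(const char *where, const char *msg)
{
    assert(where != NULL && msg != NULL);
    fatal_hook(where, msg);
}

/* Allocation wrapper for contexts with no caller to report to, such as
 * an asynchronous progress thread. Returns NULL only if the hook
 * returns, which the default hook never does. */
static void *dev_malloc_or_fatal(size_t nbytes, const char *where)
{
    void *p = malloc(nbytes);
    if (p == NULL && nbytes != 0)
        dev_fatal(where, "out of memory");
    return p;
}
```

A progress thread would then call dev_malloc_or_fatal(n, "progress_thread")
instead of malloc(n). Keeping the abort behind one hook means the policy
(abort the whole job, log first, or something else) is decided in a single
place rather than at each failure site.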
-dg
--
David Gingold
Principal Software Engineer
SiCortex
One Clock Tower Place, Suite 100
Maynard MA 01754
(978) 897-0214 x224