[petsc-users] MPI Error Handling Inside PETSc

Bland, Wesley wesley.bland at intel.com
Fri Jun 8 10:58:23 CDT 2018


Hi PETSc folks,

In the MPI Forum, we're getting close to adopting a proposal to change the default communicator on which errors are raised when they have nowhere else to go (think of something like MPI_ALLOC_MEM, which doesn't take a communicator). Instead of raising those errors on the error handler for MPI_COMM_WORLD, MPI would raise them on MPI_COMM_SELF to allow more local error handling. That way, an error such as an invalid argument would be seen only by the local process, rather than potentially causing all processes to trigger the error handler.

This doesn't impact normal error handling, such as when an MPI_RECV fails for some reason. That would still trigger the error handler attached to the communicator used in the receive. However, this is potentially a backward-incompatible change, because people might be changing the error handler of MPI_COMM_WORLD but not that of MPI_COMM_SELF.

The details can be found in this GitHub issue <https://github.com/mpi-forum/mpi-issues/issues/3> and in this PDF <https://github.com/mpi-forum/mpi-issues/files/1511327/issues-1-3-markedup.pdf> (search for ticket3).

I see that in PETSc, you guys do some basic error handling by changing the default error handler to MPI_ERRORS_RETURN on PETSC_COMM_WORLD, which may or may not be equal to MPI_COMM_WORLD. So in this case, I believe that in order to get the same error handling in all possible (though unlikely) cases, you would also need to set the same error handler on MPI_COMM_SELF. Probably something along the lines of:

MPI_Comm_set_errhandler(MPI_COMM_SELF, MPI_ERRORS_RETURN);

Or, if you want to preserve the user error handler:

MPI_Errhandler orig_errhandler;

MPI_Comm_get_errhandler(MPI_COMM_SELF, &orig_errhandler);
if (orig_errhandler != MPI_ERRORS_ARE_FATAL) {
    /* The user replaced the default handler: create a custom error
     * handler to deal with internal PETSc errors and then call the
     * user's error handler */
}

Before we vote on this next week, we wanted to reach out to some users to see if you have strong opinions about it. Although this will have some impact on users, we think it is the right way to go to improve error management in the MPI Standard (there are other efforts going on, if you're interested).

Thanks,
Wesley Bland