[MPICH] Possible Race condition between Test() and Cancel

David Minor david-m at orbotech.com
Tue Jan 31 01:22:11 CST 2006


There appears to be a problem with MPI_Cancel, at least under Red Hat 9
with the g++ 3.4.3 compiler.

If you Cancel() a receive request that has already completed, you get an
MPI abort or a seg fault.
Calling Test() on the request before calling Cancel() does not help,
because the request can still complete between the Test() and the
Cancel(), which again causes an abort.  What is the solution?
Shouldn't Cancel() simply return an error if the request is already
completed?

My specific problem is:

I'm waiting with WaitAll on a set of receive requests. I want to wait
until either 1) they all complete or 2) another thread decides to cancel
the requests.

The problem is that the thread that cancels the requests has no way of
ensuring that it doesn't call Cancel() on an already completed request.
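
To make the race and one possible way around it concrete, here is a
minimal sketch that models the situation without calling MPI at all.
Everything in it (ModelRequest, try_cancel, wait_all_or_cancel,
cancel_requested) is a hypothetical name made up for illustration, not
part of MPICH or the MPI standard. It assumes two things: only the
waiting thread ever touches the requests, while the other thread merely
sets a flag; and cancelling is a single atomic step that reports failure
when the request has already completed, which is the behaviour asked
about above. With real MPI the polling would presumably be done with
MPI_Testall, and the cancel with MPI_Cancel followed by MPI_Wait and
MPI_Test_cancelled, but that mapping is untested here.

```cpp
#include <atomic>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical model of a receive request; none of these names are MPI API.
enum State : int { Pending = 0, Completed = 1, Cancelled = 2 };

struct ModelRequest {
    std::atomic<int> state{Pending};

    // Called by whatever delivers the message. Fails if already cancelled.
    bool complete() {
        int expected = Pending;
        return state.compare_exchange_strong(expected, Completed);
    }

    // Returns false instead of aborting when the request is no longer pending.
    bool try_cancel() {
        int expected = Pending;
        return state.compare_exchange_strong(expected, Cancelled);
    }
};

// Runs in the waiting thread, which is the only thread that cancels.
// Other threads ask for cancellation by setting cancel_requested.
// Returns the number of requests that were actually cancelled.
std::size_t wait_all_or_cancel(std::vector<ModelRequest>& reqs,
                               const std::atomic<bool>& cancel_requested) {
    for (;;) {
        bool any_pending = false;
        for (auto& r : reqs) {
            if (r.state.load() == Pending) { any_pending = true; break; }
        }
        if (!any_pending) return 0;  // case 1: nothing left to wait for

        if (cancel_requested.load()) {  // case 2: another thread asked to stop
            std::size_t cancelled = 0;
            for (auto& r : reqs) {
                // Check and cancel happen in one atomic step, so a request
                // that completes in the meantime is simply left alone.
                if (r.try_cancel()) ++cancelled;
            }
            return cancelled;
        }
        std::this_thread::yield();  // poll again
    }
}
```

The point of the sketch is that there is no separate "test, then
cancel" pair left for a completion to slip between: the cancelling
thread never touches a request, and the waiting thread learns from the
return value of try_cancel() whether each request was cancelled or had
already completed.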

Please advise,

Regards,
David Minor Orbotech



