[mpich-discuss] apparent hydra problem

Dave Goodell goodell at mcs.anl.gov
Fri Mar 2 14:40:20 CST 2012


On Mar 2, 2012, at 2:26 PM CST, Martin Pokorny wrote:

> Dave Goodell wrote:
>> This causes the temp context ID to collide with a context ID used by
>> an internal subcommunicator on half of the intercomm, and potentially
>> to collide with a random communicator on the other half.  So it's
>> possible to get some "cross talk" between two otherwise unrelated
>> communicators.
> 
> That's conceivably applicable in my case because the involved processes can be long-running, and threads with distinct communicators are employed to allow writing multiple files (using MPI-IO) concurrently. Is there some way I might be able to modify the MPIR_Intercomm_merge_impl code to test for a context ID collision (and then report this condition)?

Yes and no.  It wouldn't be hard to detect that you've intruded into some random comm's context ID space, but it would be very hard to detect that the intrusion was actually causing a problem.  Basically, the only crosstalk that can happen is between MPI_Allreduce operations and this communicator merge operation.  Are you making any MPI_Allreduce calls?
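
To illustrate the "easy half" of that answer, here is a toy sketch of what detecting an intrusion into an already-used context ID could look like.  This is not MPICH's internal code: the table layout, the 2048-ID limit, and every toy_* name are hypothetical, chosen only for illustration.  It also shows the limitation described above: the check reports that an ID is already claimed, but it says nothing about whether the overlap ever leads to misdelivered messages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy model only: these names and sizes are assumptions, not MPICH internals. */
#define TOY_MAX_CONTEXT_IDS 2048u
#define TOY_WORD_BITS       32u

/* One bit per context ID; a set bit means "some communicator owns this ID". */
typedef struct {
    uint32_t in_use[TOY_MAX_CONTEXT_IDS / TOY_WORD_BITS];
} toy_ctx_table;

/* Mark every context ID as free. */
static void toy_ctx_init(toy_ctx_table *t)
{
    assert(t != NULL);
    memset(t->in_use, 0, sizeof t->in_use);
}

/* Return true if `id` is in range and already owned by some communicator. */
static bool toy_ctx_is_used(const toy_ctx_table *t, unsigned id)
{
    assert(t != NULL);
    if (id >= TOY_MAX_CONTEXT_IDS)
        return false;
    return ((t->in_use[id / TOY_WORD_BITS] >> (id % TOY_WORD_BITS)) & 1u) != 0;
}

/* Try to claim `id`.  Returns true on success, or false if the ID is out of
 * range or already in use.  The latter case is the "collision" a diagnostic
 * could report; it does not tell you whether any traffic was affected. */
static bool toy_ctx_claim(toy_ctx_table *t, unsigned id)
{
    assert(t != NULL);
    if (id >= TOY_MAX_CONTEXT_IDS || toy_ctx_is_used(t, id))
        return false;
    t->in_use[id / TOY_WORD_BITS] |= (uint32_t)1 << (id % TOY_WORD_BITS);
    return true;
}
```

In this model, a merge routine that wanted to use a temporary context ID would call toy_ctx_claim() first and log a warning when it returns false.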

-Dave



