[mpich-discuss] Progress for user-defined non-blocking collectives

Dave Goodell goodell at mcs.anl.gov
Tue Feb 21 18:04:54 CST 2012


On Feb 21, 2012, at 4:56 PM CST, Jed Brown wrote:

> Dave, it looks like you implemented the non-blocking collectives for MPI-3. How reasonable do you think it would be to expose enough hooks to be able to write a user-defined non-blocking collective that could make useful progress?

I'm leery of doing this.  The NBC implementation code makes a fair number of assumptions that it is only being used for internal nonblocking collectives, which would make this sort of thing a bit of a pain.  I don't think the technical issues are exceptionally difficult, just not really pleasant either.

> This comes up pretty frequently where there is a high-level operation with collective semantics that internally needs to perform multiple dependent communications. We might still expose a non-blocking interface to the user, but the performance benefit is limited because the multiple rounds either take place up-front or at the end.

OK, I believe I understand your use case and why you want this.  It seems that part of the problem is that you have chosen to expose a nonblocking interface but you want neither (A) a background thread nor (B) a requirement that callers of your interface make intermediate progress calls into your library.

> Generalized requests aren't (currently) a good solution because (from what I can tell), they only make progress when _that request_ is polled. In practice, you want to poke those requests from other library calls (or, eventually/on some systems, by a progress thread), just like MPI native operations can make progress without explicitly being polled.


I just skimmed the generalized request code, and the lack of implicit progress does seem to be the way it's been implemented.  I'm assuming you're thinking about the proposed generalized request extensions rather than the MPI-2 flavor, since the MPI-2 flavor doesn't have a way for MPI to drive progress.

Generalized requests seem more appropriate to this scenario if we can ask MPI to implicitly make progress on them.  Something like an "MPIX_Register_grequest" routine.  We would also need to ensure that you are allowed to make MPI calls from within the poll_fn or wait_fn.  I'm not certain this is actually safe right now.
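
To make the idea concrete, here is a rough toy model in plain C of what "registering" a request for implicit progress could look like.  This is not MPICH code and does not call MPI at all: every name in it (toy_request, toy_register_grequest, toy_progress, toy_unrelated_call, toy_test, multi_round_poll) is hypothetical, and the "collective" is just a counter that advances one dependent round per poll.  It only illustrates the control flow, namely that any call into the library pokes every registered poll function, not just the one being tested.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: none of these names exist in MPI or MPICH. */
#define MAX_REGISTERED 16

/* Hypothetical poll callback: returns 1 once the operation is complete. */
typedef int (*poll_fn_t)(void *extra_state);

typedef struct {
    poll_fn_t poll_fn;
    void *extra_state;
    int complete;
} toy_request;

static toy_request *registered[MAX_REGISTERED];
static int num_registered = 0;

/* Stand-in for the "MPIX_Register_grequest" idea: hand the library a
 * poll function that it may call whenever it makes progress. */
int toy_register_grequest(toy_request *req, poll_fn_t fn, void *state)
{
    assert(req != NULL && fn != NULL);
    if (num_registered == MAX_REGISTERED)
        return -1;
    req->poll_fn = fn;
    req->extra_state = state;
    req->complete = 0;
    registered[num_registered++] = req;
    return 0;
}

/* Stand-in for the library's progress engine: poll every registered
 * request and drop the ones that have finished. */
void toy_progress(void)
{
    int i = 0;
    while (i < num_registered) {
        toy_request *req = registered[i];
        if (req->poll_fn(req->extra_state)) {
            req->complete = 1;
            registered[i] = registered[--num_registered]; /* swap-remove */
        } else {
            i++;
        }
    }
}

/* Any unrelated library call also drives progress on registered requests. */
void toy_unrelated_call(void)
{
    toy_progress();
}

/* Testing one specific request drives the same progress engine. */
int toy_test(toy_request *req)
{
    toy_progress();
    return req->complete;
}

/* Example user operation: several dependent rounds, one round per poll. */
typedef struct {
    int round;
    int total_rounds;
} multi_round_state;

int multi_round_poll(void *extra_state)
{
    multi_round_state *s = extra_state;
    if (s->round < s->total_rounds)
        s->round++;
    return s->round == s->total_rounds;
}
```

With a three-round operation registered this way, two calls to toy_unrelated_call() advance it through two rounds before the caller ever tests the request, which is the behavior Jed is asking for.  A real version would have to deal with the hard parts this sketch skips: the poll function making MPI calls from inside the progress engine (reentrancy), thread safety, and error handling.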

-Dave


