[mpich-discuss] intercommunicator support in MPICH

Rob Latham robl at mcs.anl.gov
Tue Jul 20 12:41:44 CDT 2010


(please keep Jim cc'ed on followups, thanks)

On Tue, Jul 20, 2010 at 11:32:16AM -0500, Dave Goodell wrote:
> Intercommunicators are definitely supported in MPICH2.  You probably
> have MPICH (the older MPICH-1) installed instead, which does not
> support intercommunicators (nor is MPICH-1 itself supported anymore).

Jim does explicitly mention the Cray.  Any chance that Jaguar is
running some old version of MPICH2 with shoddy intercommunicator
support?

Jim is also coming from AIX: do you know anything about the IBM
intercommunicator support that might make the transition to MPICH2
odd?  (Due to, say, defects in either the IBM or MPICH2
implementation: as we know, the standard is one thing, but
implementations have varying degrees of "quality".)

> Point-to-point performance in intercommunicators should generally be
> identical to performance in intracommunicators.  Collective
> communication routines for intercommunicators have not been
> extensively tuned, so they may not quite perform as well as they
> could, depending on the particular collective and way it is invoked.

Well, there you have it, Jim: it's supposed to "just work".  Perhaps
you can tell us a bit more about how you are creating the
intercommunicators and how you are using them?
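
In case it's a useful point of comparison, here is a minimal sketch in
C of the kind of setup I have in mind.  The comp_comm/io_comm roles
come from Jim's description, but the size of the I/O group, the leader
ranks, and the tag are my own guesses, not his actual code:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Guess: put the last size/4 ranks (at least 1) in the I/O group.
       Run with at least 2 processes so both groups are non-empty. */
    int nio = (size / 4 > 0) ? size / 4 : 1;
    int is_io = (rank >= size - nio);

    /* Disjoint intracommunicators: comp_comm or io_comm. */
    MPI_Comm local_comm;
    MPI_Comm_split(MPI_COMM_WORLD, is_io, rank, &local_comm);

    /* Local leader is rank 0 of each group; the remote leader is
       rank 0 of the other group, named by its rank in the peer
       communicator (MPI_COMM_WORLD). */
    int remote_leader = is_io ? 0 : size - nio;

    MPI_Comm intercomm;  /* spans the two task sets */
    MPI_Intercomm_create(local_comm, 0, MPI_COMM_WORLD, remote_leader,
                         42 /* arbitrary tag */, &intercomm);

    /* Sanity check: each side should see the other group's size. */
    int remote_size;
    MPI_Comm_remote_size(intercomm, &remote_size);
    printf("rank %d (%s): remote group has %d tasks\n",
           rank, is_io ? "io" : "comp", remote_size);

    MPI_Comm_free(&intercomm);
    MPI_Comm_free(&local_comm);
    MPI_Finalize();
    return 0;
}

If your setup looks substantially different from that (say, the groups
come from MPI_Comm_spawn rather than splitting COMM_WORLD), that would
be worth knowing.  One porting gotcha worth checking: collectives on
an intercommunicator use the MPI_ROOT / MPI_PROC_NULL root convention,
which differs from the intracommunicator case and trips people up.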

==rob

> 
> On Jul 20, 2010, at 8:05 AM CDT, Rob Latham wrote:
> 
> > Hi Jim.  I'm interested in hearing more about how this async I/O
> > strategy plays out on other platforms.
> > 
> > I'm moving this to the mpich-discuss list because, as far as I know,
> > intercommunicators are supported in MPICH2, but the folks on the
> > mpich-discuss list will be able to speak with more authority on that
> > matter.
> > 
> > What is it about intercommunicators that does not work for you?  Are
> > you splitting up COMM_WORLD to form comp_comm and io_comm?
> > 
> > There might be performance implications with intercommunicators.
> > Could the link between the two task sets be the bottleneck here?  I
> > presume you are transferring a lot of data to io_comm.
> > 
> > MPICH guys, Jim's original email is below. 
> > ==rob
> > 
> > On Mon, Jul 19, 2010 at 04:44:50PM -0600, Jim Edwards wrote:
> >> Hi All,
> >> 
> >> I have created a new repository branch and checked in the beginnings
> >> of a version of PIO which allows the I/O tasks to be a disjoint set
> >> of tasks from those used for computation.
> >> 
> >> The io_comm and the comp_comm are disjoint, and pio_init is called
> >> with an intercommunicator which spans the two task sets.  The
> >> compute task set returns while the io task set waits in a callback
> >> loop for further instructions.
> >> 
> >> I have added three new tests to the PIO test suite and all of them
> >> pass on bluefire.  Then I discovered that MPICH does not support MPI
> >> intercommunicators.  These are part of the MPI-2 standard, and I
> >> thought that all of the MPI implementations were there by now?
> >> Apparently not.  Is there another MPI implementation that we can try
> >> on Jaguar or Edinburgh?
> >> 
> >> Currently all of the PIO commands are still synchronous calls - that
> >> is, the compute tasks cannot continue until the write has completed.
> >> My eventual plan is to relax this requirement to see whether there is
> >> a performance advantage - but if AIX POE is the only environment that
> >> supports this model, I may have to rethink the approach.
> >> 
> >> If you get a chance please have a look at the implementation in
> >> https://parallelio.googlecode.com/svn/branches/async_pio1_1_1/
> >> 
> >> If enough of you are interested, we can schedule a conference call to
> >> go over how it works and some of the things that still need to be
> >> done.
> >> 
> >> Jim
> >> 
> > 
> > -- 
> > Rob Latham
> > Mathematics and Computer Science Division
> > Argonne National Lab, IL USA

-- 
Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA

