[mpich-discuss] intercommunicator support in MPICH
Rob Latham
robl at mcs.anl.gov
Tue Jul 20 12:41:44 CDT 2010
(please keep Jim cc'ed on followups, thanks)
On Tue, Jul 20, 2010 at 11:32:16AM -0500, Dave Goodell wrote:
> Intercommunicators are definitely supported in MPICH2. You probably
> have MPICH installed instead, which does not support
> intercommunicators (nor is it supported in general).
Jim does explicitly mention the Cray. Any chance that Jaguar is
running some old version of MPICH2 with shoddy intercommunicator
support?
Jim is also coming from AIX: do you know of anything about IBM's
intercommunicator support that might make the transition to MPICH2
odd? (Due to, say, defects in either the IBM or MPICH2
implementation: as we know, the standard is one thing, but
implementations have varying degrees of "quality".)
> Point-to-point performance in intercommunicators should generally be
> identical to performance in intracommunicators. Collective
> communication routines for intercommunicators have not been
> extensively tuned, so they may not quite perform as well as they
> could, depending on the particular collective and way it is invoked.
Well there you have it, Jim: it's supposed to "just work". Perhaps
you can tell us a bit more about how you are creating the
intercommunicators and how you are using them?
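For reference, the usual recipe is to split MPI_COMM_WORLD by color and then bridge the two halves with MPI_Intercomm_create. The rank bookkeeping behind that is sketched below in plain Python (the actual MPI calls appear only in comments); names like num_io_tasks and the choice of which ranks become I/O tasks are illustrative assumptions, not anything from PIO:

```python
# Hypothetical sketch of the rank partitioning behind the usual
# intercommunicator recipe (MPI_Comm_split + MPI_Intercomm_create).
# num_io_tasks and the "last ranks are I/O" convention are made-up
# assumptions for illustration, not PIO's actual scheme.

def partition(world_size, num_io_tasks):
    """Assign each world rank a color: 0 = compute, 1 = I/O.

    The last num_io_tasks ranks form the disjoint I/O set.
    Returns (colors, comp_leader, io_leader), where the leaders
    are the world ranks each side would pass as remote_leader to
    MPI_Intercomm_create.
    """
    assert 0 < num_io_tasks < world_size
    colors = [0 if r < world_size - num_io_tasks else 1
              for r in range(world_size)]
    comp_leader = colors.index(0)   # first compute rank
    io_leader = colors.index(1)     # first I/O rank
    # Each rank would then call, in C:
    #   MPI_Comm_split(MPI_COMM_WORLD, color, rank, &local_comm);
    #   MPI_Intercomm_create(local_comm, 0, MPI_COMM_WORLD,
    #                        remote_leader, tag, &inter_comm);
    # where remote_leader is io_leader on compute ranks and
    # comp_leader on I/O ranks.
    return colors, comp_leader, io_leader
```

If something along these lines is failing, it would help to know which of the two steps breaks: the split, or the MPI_Intercomm_create that follows it.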
==rob
>
> On Jul 20, 2010, at 8:05 AM CDT, Rob Latham wrote:
>
> > Hi Jim. I'm interested in hearing more about how this async i/o
> > strategy plays out on other platforms.
> >
> > I'm moving this to the mpich-discuss list, because as far as I know
> > intercommunicators are supported on MPICH2, but the folks on the
> > mpich-discuss list will be able to speak with more authority on that
> > matter.
> >
> > What is it about intercommunicators that does not work for you? Are
> > you splitting up COMM_WORLD to form comp_comm and io_comm?
> >
> > There might be performance implications with intercommunicators. Could
> > the link between the two sets be a bottleneck here? I presume you
> > are transferring a lot of data to io_comm.
> >
> > MPICH guys, Jim's original email is below.
> > ==rob
> >
> > On Mon, Jul 19, 2010 at 04:44:50PM -0600, Jim Edwards wrote:
> >> Hi All,
> >>
> >> I have created a new repository branch and checked in the beginnings of a
> >> version of pio which allows the io tasks to be a disjoint set of tasks from
> >> those used for computation.
> >>
> >> The io_comm and the comp_comm are disjoint and pio_init
> >> is called with an intercommunicator which spans the two task sets. The
> >> compute task set returns while the io task set waits in a callback loop for
> >> further instructions.
> >>
> >> I have added three new tests in the pio test suite and all of them pass on
> >> bluefire. Then I discovered that MPICH does not support MPI
> >> intercommunicators. These are part of the MPI-2 standard, and I thought
> >> that all of the MPI implementations were there by now? Apparently not. Is
> >> there another MPI implementation that we can try on jaguar or edinburgh?
> >>
> >> Currently all of the pio commands are still synchronous calls - that is,
> >> the compute tasks cannot continue until the write has completed. My eventual
> >> plan is to relax this requirement to see if there is a performance advantage
> >> - but if AIX-POE is the only environment that supports this model I may have
> >> to rethink the approach.
> >>
> >> If you get a chance please have a look at the implementation in
> >> https://parallelio.googlecode.com/svn/branches/async_pio1_1_1/
> >>
> >> If enough of you are interested we can schedule a con-call to go over how it
> >> works and some of the things that still need to be done.
> >>
> >> Jim
> >>
> >
> > --
> > Rob Latham
> > Mathematics and Computer Science Division
> > Argonne National Lab, IL USA
> > _______________________________________________
> > mpich-discuss mailing list
> > mpich-discuss at mcs.anl.gov
> > https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss
>
--
Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA