[mpich-discuss] intercommunicator support in MPICH

Rob Latham robl at mcs.anl.gov
Tue Jul 20 08:05:38 CDT 2010


Hi Jim.  I'm interested in hearing more about how this async i/o
strategy plays out on other platforms.  

I'm moving this to the mpich-discuss list, because as far as I know
intercommunicators are supported on MPICH2, but the folks on the
mpich-discuss list will be able to speak with more authority on that
matter.

What is it about intercommunicators that does not work for you?  Are
you splitting up COMM_WORLD to form comp_comm and io_comm?
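
To make that question concrete, here is a minimal sketch of the rank
bookkeeping behind such a split. It is not taken from pio: the layout
(the last n_io ranks of COMM_WORLD become the io tasks) and the helper
names split_color and remote_leader are assumptions for illustration.
In a real program the color would go to MPI_Comm_split (with the world
rank as key), and the leader rank would go to MPI_Intercomm_create with
COMM_WORLD as the peer communicator.

```python
# Sketch only: the layout (last n_io world ranks do I/O) is an assumption,
# not taken from pio.
COLOR_COMP, COLOR_IO = 0, 1


def split_color(world_rank, world_size, n_io):
    """Color a rank would pass to MPI_Comm_split."""
    return COLOR_IO if world_rank >= world_size - n_io else COLOR_COMP


def remote_leader(color, world_size, n_io):
    """World rank of the other group's leader (its lowest world rank).

    This is the value MPI_Intercomm_create expects as remote_leader
    when the peer communicator is MPI_COMM_WORLD.
    """
    return 0 if color == COLOR_IO else world_size - n_io


if __name__ == "__main__":
    world_size, n_io = 8, 2  # hypothetical job: 6 compute tasks, 2 io tasks
    for rank in range(world_size):
        color = split_color(rank, world_size, n_io)
        name = "io_comm" if color == COLOR_IO else "comp_comm"
        print(rank, name, "remote leader:",
              remote_leader(color, world_size, n_io))
```

With the world rank used as the split key, local rank 0 in each new
communicator is that group's lowest world rank, so each side can pass 0
as its local leader.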

There might be performance implications with intercommunicators.  Could
the link between the two task sets be the bottleneck here?  I presume you
are transferring a lot of data to io_comm.

MPICH guys, Jim's original email is below. 
==rob

On Mon, Jul 19, 2010 at 04:44:50PM -0600, Jim Edwards wrote:
> Hi All,
> 
> I have created a new repository branch and checked in the beginnings of a
> version of pio which allows the io tasks to be a disjoint set of tasks from
> those used for computation.
> 
> The io_comm and the comp_comm are disjoint and pio_init
> is called with an intercommunicator which spans the two task sets.   The
> compute task set returns while the io task set waits in a call back loop for
> further instructions.
> 
> I have added three new tests in the pio test suite and all of them pass on
> bluefire.   Then I discovered that mpich does not support mpi
> intercommunicators.    These are part of the mpi-2 standard and I thought
> that all of the mpi implementations were there by now?  Apparently not.   Is
> there another mpi implementation that we can try on jaguar or edinburgh?
> 
> Currently all of the pio commands are still synchronous calls - that is, the
> compute tasks cannot continue until the write has completed.  My eventual
> plan is to relax this requirement to see if there is a performance advantage
> - but if AIX-POE is the only environment to support this model I may have to
> rethink the approach.
> 
> If you get a chance please have a look at the implementation in
> https://parallelio.googlecode.com/svn/branches/async_pio1_1_1/
> 
> If enough of you are interested we can schedule a con-call to go over how it
> works and some of the things that still need to be done.
> 
> Jim
> 

-- 
Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA

