[MPICH] Changing the comm size at runtime

Tue Apr 10 11:14:34 CDT 2007

I tried to "fold" the accepted intercomm objects into a single large
intracomm object as you suggested, but MPI_Intercomm_create aborts with
"Invalid buffer pointer". The code for a master "A" waiting for two slaves
"B" and "C" to connect looks like this:

------------------
MASTER (process A)

  // accept all slaves
  MPI::Intercomm inter[num_slaves];
  for(int i = 0; i < num_slaves; i++)
    inter[i] = MPI::COMM_WORLD.Accept(port,MPI::INFO_NULL,0);

  // collective merging over A and B
  MPI::Intracomm AB_intra = inter[0].Merge(false);

  // collective intercomm creation over A, B and C
  MPI::Intercomm AB_C_inter = AB_intra.Create_intercomm(0,inter[1],0,0);

-------------------
SLAVE 1 (process B)

  // connect to master
  MPI::Intercomm AB_inter = MPI::COMM_WORLD.Connect(port,MPI::INFO_NULL,0);

  // collective merge over A and B
  MPI::Intracomm AB_intra = AB_inter.Merge(true);

  // collective intercomm creation over A, B and C
  MPI::Intercomm AB_C_inter = AB_intra.Create_intercomm(0,AB_inter,0,0);

-------------------
SLAVE 2 (process C)

  // connect to master
  MPI::Intercomm AC_inter = MPI::COMM_WORLD.Connect(port,MPI::INFO_NULL,0);

  // collective intercomm creation over A, B and C
  MPI::Intercomm AB_C_inter =
    MPI::COMM_WORLD.Create_intercomm(0,AC_inter,0,0);
------------------

The MPI Forum's guide says:

"The function MPI_INTERCOMM_CREATE can be used to create an
inter-communicator from two existing intra-communicators"

I don't know whether this excludes inter-communicators like those I am
passing as peer_comm. Maybe the leader rank arguments are wrong, but I
could not find a solution.

Thanks
Patrick

Rajeev Thakur wrote:
> If the slaves can connect to the master with a single collective
> MPI_Comm_connect (i.e. the communicator includes all slaves), then it is
> easy: A single MPI_Intercomm_merge will create the giant intracommunicator.
> If the slaves are connecting one by one, it is much more difficult. Here is
> an example of how to do it:
> 
> Let's say Process A is the master and processes B, C, and D are slaves. B,
> C, D individually connect to A, resulting in 3 intercommunicators: AB_inter,
> AC_inter, and AD_inter. To merge them all into a single intracommunicator:
> 
> * begin by doing an MPI_Intercomm_merge on AB_inter, resulting in an
> intracommunicator AB_intra.
> 
> * then create an intercommunicator between AB on one side and C on the other
> by using MPI_Intercomm_create. Pass AB_intra as the local_comm on A and B,
> MPI_COMM_WORLD as the intracomm on C, and AC_inter as the peer_comm. This
> results in the intercommunicator AB_C_inter. 
> 
> * then call MPI_Intercomm_merge on it to create the intracommunicator
> ABC_intra. 
> 
> * then call MPI_Intercomm_create to create an intercommunicator between ABC
> and D just as you did with AB and C above. 
> 
> * then call MPI_Intercomm_merge to get a single intracommunicator containing
> all.
> 
> Rajeev 
> 
> 
>> -----Original Message-----
>> From: owner-mpich-discuss at mcs.anl.gov 
>> [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Patrick Gräbel
>> Sent: Sunday, March 18, 2007 2:02 PM
>> To: mpich-discuss at mcs.anl.gov
>> Subject: Re: [MPICH] Changing the comm size at runtime
>>
>> At the moment the master iterates over a set of Intracomm objects that
>> are created by calling MPI_Intercomm_merge on the accepted Intercomm
>> objects in order to communicate. This works but...
>>
>> ...is there a way to form a larger Intracomm (or Intercomm) object that
>> contains all slaves at once? This would make the program more elegant
>> and smaller.
>>
>> Greetings
>> Patrick
>>
>> Rajeev Thakur wrote:
>>> I should point out that you can also create an intracommunicator from an
>>> intercommunicator by using MPI_Intercomm_merge and then use the regular
>>> intracommunicator collectives if that's what you need.
>>>
>>> Rajeev
>>
> 
>
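
The folding procedure in Rajeev's quoted reply can be traced on paper before
touching an MPI runtime. The following is a minimal sketch of that
bookkeeping in plain C++: it makes no MPI calls and only tracks which
processes end up in which communicator and in what rank order. The names
FoldPlan, label, merge and plan_fold are hypothetical helpers invented for
this illustration. The sketch assumes the already-merged side always passes
high=false to MPI_Intercomm_merge and the newcomer passes high=true, as in
the Merge(false)/Merge(true) calls in the code above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy model only: a "communicator" is just the ordered list of process
// names in it. No MPI calls are made; every name below is hypothetical.
using Group = std::vector<std::string>;

struct FoldPlan {
    Group ranks;                     // rank order of the final intracommunicator
    std::vector<std::string> steps;  // MPI calls to issue, in order
};

// Concatenate the process names of a group, e.g. {"A","B"} -> "AB".
std::string label(const Group& g) {
    std::string s;
    for (const std::string& name : g) s += name;
    return s;
}

// Models the rank ordering of MPI_Intercomm_merge: the side that passes
// high=false is ranked before the side that passes high=true.
Group merge(const Group& low, const Group& high) {
    Group out(low);
    out.insert(out.end(), high.begin(), high.end());
    return out;
}

// Plans the fold for slaves that connected to the master one at a time.
FoldPlan plan_fold(const std::string& master,
                   const std::vector<std::string>& slaves) {
    assert(!slaves.empty());
    FoldPlan plan;
    plan.ranks = {master};
    for (std::size_t i = 0; i < slaves.size(); ++i) {
        // Intercommunicator obtained from accept/connect, e.g. "AC_inter".
        const std::string peer = master + slaves[i] + "_inter";
        std::string inter = peer;
        if (i > 0) {
            // Later slaves need an intercommunicator against the whole
            // merged group, built with the accept/connect one as peer_comm.
            inter = label(plan.ranks) + "_" + slaves[i] + "_inter";
            plan.steps.push_back("MPI_Intercomm_create with peer_comm " +
                                 peer + " -> " + inter);
        }
        plan.ranks = merge(plan.ranks, Group{slaves[i]});
        plan.steps.push_back("MPI_Intercomm_merge on " + inter + " -> " +
                             label(plan.ranks) + "_intra");
    }
    return plan;
}
```

Calling plan_fold("A", {"B", "C", "D"}) in this model yields five steps, one
per bullet in the quoted procedure, and the rank order A, B, C, D. Each
create/merge pair is collective over the processes already merged plus the
one newcomer, so every earlier slave has to take part in every later step.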