[mpich-discuss] MPI_Comm_Spawn question

Dave Goodell goodell at mcs.anl.gov
Thu Jun 30 11:21:35 CDT 2011


You have to disconnect from *all* communicators.  This includes the ParentComm in the children and the ChildComm in the parent.

Search your copy of the MPI standard for a formal definition of "connected".  The short version of it is that connectivity is a transitive relation between processes, and that all communicators that connect sets of processes must be disconnected before those sets may independently finalize.

-Dave

On Jun 30, 2011, at 10:02 AM CDT, Eric Hui wrote:

> Hi Dave,
> 
> Thanks for your explanation.
> 
> I was able to use MPI_Intercomm_merge to create the intracommunicator.
> 
> Is it possible to close the child processes one by one without affecting the other processes afterwards?
> 
> I tried the following and the child processes seem to wait for all processes to get to the same point even when I call MPI_Comm_disconnect before MPI_Finalize.
> 
> //main function
> 
> NumMPIHelpers = 3;
> 
> MPI_Comm_get_parent (&ParentComm);
> 
> if (ParentComm == MPI_COMM_NULL)
> {
>   MPI_Comm_spawn("myapp.exe", argv, NumMPIHelpers, MPI_INFO_NULL, 0, MPI_COMM_SELF, &ChildComm, MPI_ERRCODES_IGNORE);
> }
> 
> MPI_Intercomm_merge (ParentComm == MPI_COMM_NULL? ChildComm : ParentComm, 0, &AllProcesses);
> 
> //do their work...
> 
> MPI_Barrier (AllProcesses);
> 
> 
> 
> 
> //close function
> MPI_Comm_disconnect (&AllProcesses);
> 
> MPI_Finalize ();
> 
> Regards,
> Eric
> 
> -----Original Message-----
> From: Dave Goodell [mailto:goodell at mcs.anl.gov] 
> Sent: Wednesday, June 29, 2011 2:34 PM
> To: Eric Hui
> Cc: mpich-discuss at mcs.anl.gov
> Subject: Re: MPI_Comm_Spawn question
> 
> It does include both the parent processes and the child processes.  But it is still an *intercommunicator*.  From your earlier email it sounded like you wanted an *intracommunicator* instead.  To obtain that you will need to perform either an MPI_Intercomm_merge or some set of MPI_Group operations.
> 
> MPI_Comm_get_parent will return MPI_COMM_NULL in the parent process (assuming it does not itself have a separate parent), and in the child processes it will return the intercommunicator for communication with the parent.  That is, it is the "other half" of the intercomm that is returned by MPI_Comm_spawn in the parent processes.
> 
> -Dave
> 
> On Jun 29, 2011, at 12:53 PM CDT, Eric Hui wrote:
> 
>> Hi Dave,
>> 
>> I was reading section 10.3.2 in the MPI 2.2 spec and here is what it says for MPI_Comm_Spawn:
>> 
>> The intercommunicator returned by MPI_COMM_SPAWN contains the parent processes in the local group and the child processes in the remote group. The ordering of processes in the local and remote groups is the same as the ordering of the group of the comm in the parents and of MPI_COMM_WORLD of the children, respectively. This intercommunicator can be obtained in the children through the function MPI_COMM_GET_PARENT.
>> 
>> From the above paragraph, it sounds to me like this new intercommunicator should include both the parent and the child processes.  Did I misinterpret the spec?
>> 
>> When I call MPI_Comm_get_parent in the child, it only includes the child processes, but not the parent.  I am not sure how I can get the parent included if MPI_Comm_get_parent does not return it.
>> 
>> Actually the example code that I am using is based on the example I found on the MPI Forum site here:
>> http://www.mpi-forum.org/docs/mpi22-report/node210.htm#Node210
>> 
>> Regards,
>> Eric
>> 
>> From: Eric Hui 
>> Sent: Wednesday, June 29, 2011 11:27 AM
>> To: mpich-discuss at mcs.anl.gov
>> Subject: MPI_Comm_Spawn question
>> 
>> I started my MPI program with one process only and tried to use MPI_Comm_Spawn to launch three more copies like this:
>> 
>> //main
>>         MPI_Comm ParentComm;
>>         MPI_Comm InterComm;
>> 
>>         MPI_Comm_get_parent (&ParentComm);
>> 
>>         if (ParentComm == MPI_COMM_NULL)
>>         {
>>            MPI_Comm_spawn("myapp.exe", argv, NumMPIHelpers, MPI_INFO_NULL, 0, MPI_COMM_SELF, &InterComm, MPI_ERRCODES_IGNORE);
>> 
>>            ShowRank (InterComm);  // <-- this shows size = 1, rank = 0
>>         }
>>         else
>>         {
>>            ShowRank (ParentComm);  // this shows size = 3, rank = 0, 1 or 2 for the children
>>         }
>> 
>> //showrank function
>> void ShowRank (MPI_Comm Com)
>> {
>>   MPI_Comm_rank (Com, &MPIRank);      /* get current process id */
>>   MPI_Comm_size (Com, &MPISize);         /* get number of processes */   
>> 
>>   DWORD Pid = GetCurrentProcessId();
>> 
>>   CString Msg;
>> 
>>   Msg.Format ("MPI_Comm = %ld, Process ID = %ld, Size = %ld, Rank = %ld", Com, Pid, MPISize, MPIRank);
>> 
>>   AddMsg (Msg);
>> }
>> 
>> I am trying to join the parents and the children together into one group.
>> 
>> Do I need to call MPI_Intercomm_merge?  I thought the "InterComm" is already supposed to have size = 4 with rank = 0, 1, 2 and 3.
>> 
>> Thanks,
>> Eric
> 


