[mpich2-dev] Parent terminates when the spawned child terminates

Suraj Prabhakaran suraj.prabhakaran at gmail.com
Thu Dec 16 06:26:50 CST 2010



On 12/15/2010 11:58 PM, Lisandro Dalcin wrote:
> On 15 December 2010 19:11, Suraj Prabhakaran
> <suraj.prabhakaran at gmail.com>  wrote:
>> Thank you all for the replies. Thanks, Pavan, for hinting at the auto
>> cleanup option. It does what I wanted. Just adding to the discussion: I
>> agree with your comments about what a portable program may assume. The
>> standard states that one program's error may affect another only if the two
>> are connected through an intracommunicator or an intercommunicator. In my
>> case, the parent spawns the child and then disconnects the communicator. The
>> child, right after MPI_Init, gets the parent communicator and disconnects
>> from it as well. The sample code is given below.
>>
>> Parent:
>>
>> #include <mpi.h>
>> #include <stdio.h>
>> #include <unistd.h>   /* sleep() */
>>
>> int main (int argc, char *argv[])
>> {
>>     MPI_Init(&argc, &argv);
>>     /* Return error codes instead of aborting on errors. */
>>     MPI_Errhandler_set(MPI_COMM_WORLD, MPI_ERRORS_RETURN);
>>     MPI_Comm child_comm;
>>     MPI_Comm_spawn("./child", MPI_ARGV_NULL, 1, MPI_INFO_NULL, 0,
>>                    MPI_COMM_SELF, &child_comm, MPI_ERRCODES_IGNORE);
>>     MPI_Errhandler_set(child_comm, MPI_ERRORS_RETURN);
>>     printf("spawned a child\n");
>>     MPI_Comm_disconnect(&child_comm);
>>     printf("Disconnected from the child\n");
>>     /* Stay alive long enough to observe the effect of the child's exit. */
>>     sleep(5000);
>>     MPI_Finalize();
>>     return 0;
>> }
>>
>> Child:
>>
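>> #include <mpi.h>
>> #include <stdio.h>
>> #include <stdlib.h>   /* exit() */
>>
>> int main (int argc, char *argv[])
>> {
>>     MPI_Init(&argc, &argv);
>>     MPI_Comm parent, parent1;
>>     MPI_Comm_get_parent(&parent);
>>     /* MPI_Comm_disconnect sets the handle to MPI_COMM_NULL. */
>>     MPI_Comm_disconnect(&parent);
>>     if (parent == MPI_COMM_NULL)
>>         printf("Child: Disconnected from the parent, Exiting\n\n");
>>
>>     MPI_Comm_get_parent(&parent1);
>>
>>     if (parent1 != MPI_COMM_NULL)
>>         printf("Child: yes, i got my parent again\n");
>>
>>     /* Deliberate abrupt exit without calling MPI_Finalize. */
>>     exit(1);
>>
>>     MPI_Finalize();   /* never reached */
>>     return 0;
>> }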
>>
>> As you can see, the child's first message is displayed while the second
>> printf is not, which shows that the child can no longer get the parent.
>> So when exit() is called abruptly, it shouldn't affect the parent, right
>> (since they no longer share a communicator), even without any auto cleanup
>> option?
>> Please correct me if I am wrong.
>>
> I believe you are right.  For this program, at child exit() time
> parent and child are disconnected, and I understand the MPI standard
> says that the parent should not be affected. Of course, I'm not an MPI
> lawyer, and perhaps I misunderstood the wording...
Does that imply that it's a bug? I would benefit from a fix for this.

