[mpich-discuss] spawned processes do not shut down
Jonathan Bishop
jbishop.rwc at gmail.com
Tue Nov 1 00:56:27 CDT 2011
Just a follow-up.
I discovered that you only need to call MPI_Comm_disconnect on the master
side, not on both the worker and the master.
Jon
On Fri, Oct 28, 2011 at 8:53 PM, Jonathan Bishop <jbishop.rwc at gmail.com> wrote:
> Thanks!
>
> On Fri, Oct 28, 2011 at 11:17 AM, Darius Buntinas <buntinas at mcs.anl.gov>
> wrote:
>>
>> Hi Jon,
>>
>> I had to pull out my copy of the standard for this one :-). The standard
>> says (Section 10.5.4 on pg 330) that MPI_Finalize is collective over
>> "connected" processes. By the definition of "connected" (in the same
>> section), the master and worker processes in your application are still
>> connected, so the worker process might wait until the master process calls
>> MPI_Finalize (in the case of MPICH, it will wait if the two processes have
>> ever communicated). You can use MPI_Comm_disconnect to disconnect the
>> master and worker before the worker calls MPI_Finalize.
>>
>> I added an MPI_Comm_disconnect on line 37 and line 80 of your program and
>> it looks like it works as you intend.
>>
>> -d
>>
>>
>> On Oct 28, 2011, at 11:18 AM, Jonathan Bishop wrote:
>>
>> > I am using MPI_Comm_spawn to dynamically run workers. However, when the
>> > workers exit they get hung up on MPI_Finalize. Here is a short program which
>> > shows the issue...
>> >
>> > It responds to several commands...
>> >
>> > Do
>> >
>> > start
>> > stop
>> >
>> > and then check how many processes are running - it should be 1, not 2.
>> >
>> > I am using MPICH2 1.4.1-p1.
>> >
>> > Thanks,
>> >
>> > Jon
>> >
>> > #include <sys/types.h>
>> > #include <unistd.h>
>> > #include <iostream>
>> > #include "mpi.h"
>> >
>> > using namespace std;
>> >
>> >
>> > int main(int argc, char **argv)
>> > {
>> > MPI_Init(&argc, &argv);
>> > MPI_Comm parent;
>> > MPI_Comm_get_parent(&parent);
>> >
>> > // Master
>> > if (parent == MPI_COMM_NULL) {
>> > cout << getpid() << endl;
>> > MPI_Comm intercom = MPI_COMM_NULL;
>> > while (1) {
>> > cout << "Enter: ";
>> > string s;
>> > cin >> s;
>> > if (s == "start") {
>> > if (intercom != MPI_COMM_NULL) {
>> > cout << "already started" << endl;
>> > continue;
>> > }
>> > MPI_Comm_spawn(argv[0], MPI_ARGV_NULL, 1, MPI_INFO_NULL, 0,
>> > MPI_COMM_SELF, &intercom, MPI_ERRCODES_IGNORE);
>> > continue;
>> > }
>> > if (s == "stop") {
>> > if (intercom == MPI_COMM_NULL) {
>> > cout << "worker not running" << endl;
>> > continue;
>> > }
>> > MPI_Send(const_cast<char*>(s.c_str()), s.size(), MPI_CHAR, 0, 0,
>> > intercom);
>> > intercom = MPI_COMM_NULL;
>> > // MPI_Finalize(); // This will allow the workers to die, but then I cannot restart them.
>> > continue;
>> > }
>> > if (s == "exit") {
>> > if (intercom != MPI_COMM_NULL) {
>> > cout << "need to stop before exit" << endl;
>> > continue;
>> > }
>> > break;
>> > }
>> > if (intercom == MPI_COMM_NULL) {
>> > cout << "need to start" << endl;
>> > continue;
>> > }
>> > MPI_Send(const_cast<char*>(s.c_str()), s.size(), MPI_CHAR, 0, 0,
>> > intercom);
>> > char buf[1000];
>> > MPI_Status status;
>> > MPI_Recv(buf, 999, MPI_CHAR, MPI_ANY_SOURCE, MPI_ANY_TAG,
>> > intercom, &status);
>> > int count;
>> > MPI_Get_count(&status, MPI_CHAR, &count);
>> > buf[count] = 0;
>> > string t = buf;
>> > cout << "worker returned " << t << endl;
>> > }
>> > }
>> >
>> > // Worker
>> > if (parent != MPI_COMM_NULL) {
>> > while (1) {
>> > char buf[1000];
>> > MPI_Status status;
>> > MPI_Recv(buf, 999, MPI_CHAR, MPI_ANY_SOURCE, MPI_ANY_TAG, parent,
>> > &status);
>> > int count;
>> > MPI_Get_count(&status, MPI_CHAR, &count);
>> > buf[count] = 0;
>> > string s = buf;
>> > if (s == "stop") {
>> > cout << "worker stopping" << endl;
>> > break;
>> > }
>> > MPI_Send(const_cast<char*>(s.c_str()), s.size(), MPI_CHAR, 0, 0,
>> > parent);
>> > }
>> > }
>> >
>> > MPI_Finalize();
>> > }
>> >
>> > _______________________________________________
>> > mpich-discuss mailing list mpich-discuss at mcs.anl.gov
>> > To manage subscription options or unsubscribe:
>> > https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss
>>
>
>