[mpich-discuss] Howto use MPI_Comm_merge together with MPI_Comm_spawn

Nick Radcliffe nradclif at cray.com
Thu Feb 16 08:56:26 CST 2012


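One problem is that the spawned children never call MPI_Intercomm_merge. The merge is collective over both sides of the intercommunicator, so the parent blocks in it until the children's connection closes when they exit. Each child needs to call the merge function too, on the communicator returned by MPI_Comm_get_parent, in the 'else' part of your 'if (parentcomm == MPI_COMM_NULL)'.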

-Nick
________________________________
From: mpich-discuss-bounces at mcs.anl.gov [mpich-discuss-bounces at mcs.anl.gov] on behalf of Umit [umitcanyilmaz at gmail.com]
Sent: Thursday, February 16, 2012 7:06 AM
To: mpich-discuss at mcs.anl.gov
Subject: [mpich-discuss] Howto use MPI_Comm_merge together with MPI_Comm_spawn

Hello All,

Can anyone tell me what is wrong with this simple code:

#define NUM_SPAWNS 4
double timer;
int i;
char str[100];
int main( int argc, char *argv[] )
{
    MPI_Comm parentcomm, intercomm;
    MPI_Comm comm, scomm;
    MPI_Init( &argc, &argv );
    MPI_Comm_get_parent( &parentcomm );
    int np = NUM_SPAWNS;
    int size;
    MPI_Comm_size( MPI_COMM_WORLD , &size );
    if (parentcomm == MPI_COMM_NULL)
    {
       scomm = MPI_COMM_WORLD;
       int errcodes[np];
       MPI_Comm_spawn( "/home/test/Desktop/merge/./a.out", MPI_ARGV_NULL, np, MPI_INFO_NULL, 0, MPI_COMM_WORLD, &intercomm, errcodes );
       MPI_Intercomm_merge( intercomm, 1, &comm );
    }
    else
    {
        printf("I'm the spawned.\n");
    }
    MPI_Finalize();
    return 0;
}

I also tried calling MPI_Intercomm_merge outside of the if statement, but I got the same error.
The spawn itself is successful; I tested that separately. When I try to merge, I get the following error:

test@ubuntu:~/Desktop/merge$ mpirun -np 1 ./a.out
I'm the spawned.
I'm the spawned.
I'm the spawned.
I'm the spawned.
Fatal error in MPI_Intercomm_merge: Other MPI error, error stack:
MPI_Intercomm_merge(288)..........: MPI_Intercomm_merge(comm=0x84000000, high=1, newintracomm=0xbf97ff40) failed
MPI_Intercomm_merge(263)..........:
MPIR_Get_contextid(639)...........:
MPI_Allreduce(773)................: MPI_Allreduce(sbuf=MPI_IN_PLACE, rbuf=0xbf97fd18, count=64, MPI_INT, MPI_BAND, comm=0x84000002) failed
MPIR_Allreduce(289)...............:
MPIC_Sendrecv(161)................:
MPIC_Wait(513)....................:
MPIDI_CH3I_Progress(150)..........:
MPID_nem_mpich2_blocking_recv(948):
MPID_nem_tcp_connpoll(1720).......:
state_commrdy_handler(1556).......:
MPID_nem_tcp_recv_handler(1446)...: socket closed
rank 0 in job 21  ubuntu_38267   caused collective abort of all ranks
  exit status of rank 0: killed by signal 9


Thanks In Advance,

