[mpich-discuss] MPI_Comm_spawn_multiple() fails

Jayesh Krishna jayesh at mcs.anl.gov
Fri Apr 2 15:58:38 CDT 2010


Hi,
 I tried spawn() with the host info set to "localhost" and it worked fine
for me (however, I spawned using a single program; does that work for you?).
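 Something along these lines is what I mean by a single-program spawn (a
minimal, untested sketch; the program name "mpi_spawn", the child count of
2, and "localhost" are placeholders for my local setup):

#include <mpi.h>

int main(int argc, char *argv[])
{
    MPI_Comm parent, intercomm;
    MPI_Info info;
    int errcodes[2];

    MPI_Init(&argc, &argv);
    MPI_Comm_get_parent(&parent);
    if (parent == MPI_COMM_NULL) {
        /* Parent: spawn two copies of one single program on localhost */
        MPI_Info_create(&info);
        MPI_Info_set(info, "host", "localhost");
        MPI_Comm_spawn("mpi_spawn", MPI_ARGV_NULL, 2, info,
                       0, MPI_COMM_WORLD, &intercomm, errcodes);
        MPI_Info_free(&info);
    }
    MPI_Finalize();
    return 0;
}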

Regards,
Jayesh
----- Original Message -----
From: "Alex Bakkal" <Abakkal at LOANPERFORMANCE.com>
To: jayesh at mcs.anl.gov, mpich-discuss at mcs.anl.gov
Sent: Friday, April 2, 2010 3:24:21 PM GMT -06:00 US/Canada Central
Subject: RE: [mpich-discuss] MPI_Comm_spawn_multiple() fails

Hi Jayesh,

Thank you for your response. My test app, as well as the examples you
referred me to, works fine locally until I attempt to set the "host"
key in MPI_Info. Unfortunately, I need that key in order to spawn
remotely; that is my ultimate goal.

Thank you anyway.

Alex 

-----Original Message-----
From: jayesh at mcs.anl.gov [mailto:jayesh at mcs.anl.gov] 
Sent: Friday, April 02, 2010 8:07 AM
To: mpich-discuss at mcs.anl.gov
Cc: Bakkal, Alex
Subject: Re: [mpich-discuss] MPI_Comm_spawn_multiple() fails

Hi,
 A couple of suggestions:

# Does it work if you use the same command (same program) for all
instances of the spawned procs?
# Did you try setting the "path" info for the commands? (A rough sketch
combining both suggestions is included below.)

 You can find some examples at
https://svn.mcs.anl.gov/repos/mpi/mpich2/trunk/test/mpi/spawn (look at
spawnminfo1.c). Let us know the results.
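
 Here is roughly what I mean, as an untested sketch: the same program is
spawned for both commands, and a "path" info key points to the directory
that holds the executable instead of hard-coding it into the command name.
The program name "mpi_spawn", the host "localhost", and the "c:\\" path
are placeholders; adjust them for your setup.

#include <mpi.h>

#define NHOST 2

int main(int argc, char *argv[])
{
    MPI_Comm parent, intercomm;
    int i;
    int nproc[NHOST]   = { 1, 1 };
    char *progs[NHOST] = { "mpi_spawn", "mpi_spawn" };  /* same command twice */
    MPI_Info infos[NHOST];

    MPI_Init(&argc, &argv);
    MPI_Comm_get_parent(&parent);
    if (parent == MPI_COMM_NULL) {
        for (i = 0; i < NHOST; i++) {
            MPI_Info_create(&infos[i]);
            MPI_Info_set(infos[i], "host", "localhost");
            MPI_Info_set(infos[i], "path", "c:\\");  /* directory of mpi_spawn.exe */
        }
        MPI_Comm_spawn_multiple(NHOST, progs, MPI_ARGVS_NULL, nproc, infos,
                                0, MPI_COMM_WORLD, &intercomm,
                                MPI_ERRCODES_IGNORE);
        for (i = 0; i < NHOST; i++)
            MPI_Info_free(&infos[i]);
    }
    MPI_Finalize();
    return 0;
}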

Regards,
Jayesh
----- Original Message -----
From: "Alex Bakkal" <Abakkal at LOANPERFORMANCE.com>
To: mpich-discuss at mcs.anl.gov
Sent: Thursday, April 1, 2010 4:58:07 PM GMT -06:00 US/Canada Central
Subject: [mpich-discuss] MPI_Comm_spawn_multiple() fails



Hello, 

I have installed MPICH2 v1.2.1 on Windows XP Pro machines and was
trying to run the following test app on the "mt" channel: 



#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define NHOST 2

int main(int argc, char *argv[])
{
    int supported;
    int rank, size;
    MPI_Comm parent, intercomm;

    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &supported);
    if (supported != MPI_THREAD_MULTIPLE) {
        printf("The library does not support MPI_THREAD_MULTIPLE\n");
        exit(-1);
    }

    MPI_Comm_get_parent(&parent);
    if (parent == MPI_COMM_NULL) {
        /* Parent: spawn one process of each of the two executables */
        int i;
        int nproc[NHOST] = { 1, 1 };
        char *progs[NHOST] = { "c:\\mpi_spawn", "c:\\tm\\mpi_spawn" };
        MPI_Info infos[NHOST];

        for (i = 0; i < NHOST; i++) {
            MPI_Info_create(&infos[i]);
            MPI_Info_set(infos[i], "host", "localhost");
        }

        MPI_Comm_spawn_multiple(NHOST, progs, MPI_ARGVS_NULL, nproc, infos,
                                0, MPI_COMM_WORLD, &intercomm,
                                MPI_ERRCODES_IGNORE);

        for (i = 0; i < NHOST; i++) {
            MPI_Info_free(&infos[i]);
        }
    }
    else {
        /* Spawned child: use the parent intercommunicator */
        intercomm = parent;
    }

    MPI_Comm_rank(intercomm, &rank);
    MPI_Comm_size(intercomm, &size);
    printf("[%d/%d] Hello world\n", rank, size);
    fflush(stdout);

    MPI_Comm_free(&intercomm);
    MPI_Finalize();
    return 0;
}

The app hangs: I can see three instances of the program (as expected) in
the Task Manager, but MPI_Comm_spawn_multiple() never returns control to
my app. 

I reduced the number of spawned hosts to one and it ran successfully. It
also spawned two hosts successfully with the following line commented
out: 

// MPI_Info_set(infos[i], "host", "localhost"); 

I have tried it on two machines with the same result. 



Also, I have tried to spawn just one instance, but on the remote host, and
got the following error messages: 

ERROR:unable to read the cmd header on the pmi context, Error = -1 

ERROR:Error posting ready, An existing connection was forcibly closed by
the remote host. 

I am not sure whether that is related to the first issue. 



Any insight would be greatly appreciated. 

Thank you. 

Alex 

_______________________________________________
mpich-discuss mailing list
mpich-discuss at mcs.anl.gov
https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss

