[mpich-discuss] command line ordering of hosts matters?

David_Lowinger at ea.epson.com David_Lowinger at ea.epson.com
Wed Jun 30 15:42:22 CDT 2010


Firewall is turned off on both machines.
There is no error message; the MPI_Bcast call simply never completes.  I've 
left it running for 10 minutes, and the second printf ("completed 
MPI_Bcast") never appears when I use the second host ordering below 
("10.0.0.101 1 10.0.0.6 1").
When I run "smpd -status 10.0.0.6" from 10.0.0.101, I see the message 
"smpd running on 10.0.0.6".  When I run "smpd -status 10.0.0.101" from 
10.0.0.6, I see the message "smpd running on 10.0.0.101".
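
To make the difference between the two orderings concrete, here is a minimal sketch of how the host list would map to ranks. It assumes that mpiexec hands out ranks in the order the hosts appear after "-hosts", which is not confirmed anywhere in this thread. The helper names map_ranks and print_mapping are hypothetical and are not part of MPICH. Under that assumption, swapping the hosts moves rank 0 (the MPI_Bcast root in the code below) from 10.0.0.6 to 10.0.0.101.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper, not an MPICH API.
 * Expands the "host count" pairs that follow "-hosts N" into a
 * rank -> host table, ASSUMING ranks are assigned in command-line order.
 * Returns the number of ranks written, or -1 on bad input.
 */
int map_ranks(int npairs, const char *pairs[],
              const char *rank_host[], int max_ranks)
{
    int rank = 0;
    for (int i = 0; i < npairs; i++) {
        const char *host = pairs[2 * i];
        int count = atoi(pairs[2 * i + 1]);   /* processes on this host */
        if (count <= 0)
            return -1;
        for (int j = 0; j < count; j++) {
            if (rank >= max_ranks)
                return -1;
            rank_host[rank++] = host;
        }
    }
    return rank;
}

/* Print the assumed placement; rank 0 is the root of the MPI_Bcast. */
void print_mapping(int npairs, const char *pairs[])
{
    const char *rank_host[16];
    int n = map_ranks(npairs, pairs, rank_host, 16);
    for (int r = 0; r < n; r++)
        printf("rank %d -> %s%s\n", r, rank_host[r],
               r == 0 ? " (Bcast root)" : "");
}
```

Calling print_mapping(2, pairs) with pairs = {"10.0.0.101", "1", "10.0.0.6", "1"} would place the root on 10.0.0.101, so the failing case is the one where 10.0.0.6 has to receive the broadcast rather than send it.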




From: jayesh at mcs.anl.gov
Sent: 06/30/2010 10:50 AM
To: mpich-discuss at mcs.anl.gov
cc: David_Lowinger at ea.epson.com
Subject: Re: [mpich-discuss] command line ordering of hosts matters?

Hi,
 Do you have a firewall running on any of these machines? If so, can you 
try running your job after turning off the firewall?
 What is the error message that you get when you run your job?
 Can you try running "smpd -status REMOTE_MACHINE" from each of the 
machines and let us know the results ("smpd -status 10.0.0.6" from 
10.0.0.101 and "smpd -status 10.0.0.101" from 10.0.0.6)?

Regards,
Jayesh
----- Original Message -----
From: "David Lowinger" <David_Lowinger at ea.epson.com>
To: mpich-discuss at mcs.anl.gov
Sent: Tuesday, June 29, 2010 5:53:07 PM GMT -06:00 US/Canada Central
Subject: [mpich-discuss] command line ordering of hosts matters?



Hi, 
When running a very basic "hello world" app, I've found that the app's 
behavior depends on the order I use for hosts in the command line. For 
example, if I use: 

mpiexec -hosts 2 10.0.0.6 1 10.0.0.101 1 helloworld.exe 

The program executes flawlessly. But, if I use: 

mpiexec -hosts 2 10.0.0.101 1 10.0.0.6 1 helloworld.exe 

then the program never gets past the call to "MPI_Bcast()". Here is my 
code: 

------------------- 

#include <stdio.h>   /* printf, fflush */
#include "mpi.h"

#define MPI_FLUSH() fflush(stdout)

int main( int argc, char* argv[] )
{
    int g_Thread_ID, g_Num_Threads;
    int test = 0;

    /**************************************************\
    * MPI Initialization *
    \**************************************************/
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &g_Thread_ID);
    MPI_Comm_size(MPI_COMM_WORLD, &g_Num_Threads);

    printf("thread %d: main: About to execute MPI_Bcast\n", g_Thread_ID);
    MPI_FLUSH();

    // Broadcast integer from rank 0 to all other ranks
    int err = MPI_Bcast(&test, 1, MPI_INT, 0, MPI_COMM_WORLD);

    printf("thread %d: completed MPI_Bcast\n", g_Thread_ID);
    MPI_FLUSH();

    MPI_Finalize();
    return 0;
}

------------------ 

I am running Windows Vista on both machines. Has anyone seen this before? 
Thanks, 
David 

_______________________________________________
mpich-discuss mailing list
mpich-discuss at mcs.anl.gov
https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss
