[mpich-discuss] Cpi.exe hangs

Jayesh Krishna jayesh at mcs.anl.gov
Tue Jul 26 14:11:26 CDT 2011


Hi,
 Are you running your job on nodes with the same architecture/data model (i.e., both nodes are 32-bit or both nodes are 64-bit)? MPICH2 currently does not support heterogeneous nodes.
 If so, please provide more details on the nodes that you are running your job on.

-Jayesh

----- Original Message -----
From: "shridhar mohan" <shri.emi at gmail.com>
To: "Jayesh Krishna" <jayesh at mcs.anl.gov>
Sent: Tuesday, July 26, 2011 2:03:46 PM
Subject: Re: Cpi.exe hangs

Hi,

I am using MPICH2 version 1.4 and Windows Enterprise edition, version 6.1.
This is the first time I am trying to set up MPICH2.
I have the same problem with both the 32-bit and the 64-bit builds of MPICH2 version 1.4.

Yes, the firewall is turned off on all the nodes.
The behavior is the same regardless of which host I launch
the program from.

I have attached a screenshot showing the output.
http://i51.tinypic.com/2cgkh1t.jpg

First, it shows the result when I run cpi.exe on one host.
Second, it shows the result when I run it on two hosts.
But when I try to quit the second time, it hangs.

-shridhar 


On Tue, Jul 26, 2011 at 2:40 PM, Jayesh Krishna < jayesh at mcs.anl.gov > wrote: 


Hi, 

# Which version of MPICH2 are you using?
# Which version of Windows are you using?
# Is this the first time you are running MPICH2 (have you been able to run a previous version of MPICH2)?
# I am assuming that you have turned off the firewalls on both nodes. If not, please do so and re-run your job.
# Can you copy-paste the command and the output into your email?
# Does the behavior change if you launch the program from the second node (instead of the first node)?


-Jayesh 

----- Original Message ----- 
From: "shridhar mohan" < shri.emi at gmail.com > 
To: "Jayesh Krishna" < jayesh at mcs.anl.gov >
Sent: Tuesday, July 26, 2011 1:28:08 PM
Subject: Cpi.exe hangs 

Hi, 

Yes, I am able to ping from each node to every
other node. I tried running cpi.exe as you suggested.

The example works fine until I try to quit.
When I try to quit, it hangs.

-shridhar 


On Tue, Jul 26, 2011 at 2:02 PM, Jayesh Krishna < jayesh at mcs.anl.gov > wrote: 


Hi, 
Are you able to ping from each node to every other node? (Note that it is not enough to be able to ping all nodes from the node where you launch your job.) I would recommend using two hosts for your initial tests. Ping from machine A to machine B AND from machine B to machine A. If that works, try running cpi.exe on the two nodes (mpiexec -n 2 -machinefile mf.txt cpi.exe).

-Jayesh 


----- Original Message ----- 
From: "shridhar mohan" < shri.emi at gmail.com > 
To: "Jayesh Krishna" < jayesh at mcs.anl.gov >
Sent: Tuesday, July 26, 2011 12:20:32 PM
Subject: Re: MPI_Finalize hangs 

Hi, 

I can run the cpi example on a single host.
When I try to run it on multiple hosts,
it still works, but it does not terminate
properly. It hangs when I try to quit the cpi example.

-shridhar 


On Mon, Jul 25, 2011 at 9:34 PM, Jayesh Krishna < jayesh at mcs.anl.gov > wrote: 


Hi, 
Can you run the example program provided with MPICH2 (C:\Progra~1\MPICH2\examples\cpi.exe)? 

(PS: The MPI processes try to connect to each other to exchange data when MPI_Send() is called. So your problem could still be a network setup issue.) 
Regards, 
Jayesh 




----- Original Message ----- 
From: "shridhar mohan" < shri.emi at gmail.com > 
To: "Jayesh Krishna" < jayesh at mcs.anl.gov > 
Cc: mpich-discuss at mcs.anl.gov 
Sent: Monday, July 25, 2011 6:25:24 PM 
Subject: MPI_Finalize hangs 

Hi, 

I found the problem.
It was not the firewall.

The process hangs at MPI_Finalize.
When I abort the process, I get the correct results.

I think MPI_Send causes this problem.
I tried using a non-blocking send but ran into the same issue.
I have the same issue when I try running the cpi.c example
on multiple hosts.

The following is the code.
I don't see anything wrong with it.
What do you think causes this issue?

Program 

#include <mpi.h>
#include <iostream>
#include <stdio.h>  /* printf and BUFSIZ defined there */
#include <stdlib.h> /* exit defined there */

using namespace std;

/* Standard main(); _tmain would additionally require <tchar.h>. */
int main(int argc, char *argv[])
{
    int numtasks, rank, dest, source, rc, tag = 1;
    char inmsg, outmsg = 'F';
    MPI_Status Stat;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &numtasks);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        /* Rank 0 sends a single character to rank 1. */
        dest = 1;
        printf(" Process %d: testing \n", rank);

        rc = MPI_Send(&outmsg, 1, MPI_CHAR, dest, tag, MPI_COMM_WORLD);
        printf(" Process %d: sent \n", rank);
    }
    else if (rank == 1) {
        /* Rank 1 receives the character from rank 0. */
        source = 0;
        printf(" Process %d: testing \n", rank);
        rc = MPI_Recv(&inmsg, 1, MPI_CHAR, source, tag, MPI_COMM_WORLD, &Stat);

        printf("Process %d: Received %c char(s) from task %d with tag %d \n",
               rank, inmsg, Stat.MPI_SOURCE, Stat.MPI_TAG);
    }

    MPI_Finalize();
    return 0;
}

-Regards 
-shridhar 






