[mpich-discuss] networking problems with mpich2-1.3

Saurav Pathak saurav at sas.upenn.edu
Thu Nov 11 02:37:25 CST 2010


Hi,

I am trying to set up two computers for running MPI.  I have compiled 
mpich2-1.3 from source, and have run the following without any incident 
on both machines (Ubuntu 9.04):

mpiexec -n 2 ./cpi

But when I try to run it across the two computers (comp1 and comp2), I 
run into the following problems.

On comp1 when I run "mpiexec -f hosts -n 2 ./cpi", I get the following 
error.
----
[proxy:0:1 at comp2] HYDU_sock_connect 
(/home/saurav/local/src/mpich2-1.3/src/pm/hydra/utils/sock/sock.c:151): 
connect error (Connection timed out)
[proxy:0:1 at comp2] main 
(/home/saurav/local/src/mpich2-1.3/src/pm/hydra/pm/pmiserv/pmip.c:204): 
unable to connect to server comp1 at port 38203 (check for firewalls!)
----

I have checked for firewalls via "sudo /sbin/iptables -L" on both 
computers, and neither one lists any firewall rules.

When I execute "mpiexec -f hosts -n 2 ./cpi" on comp2, I get the 
following output:
----
Process 1 of 2 is on comp2
Process 0 of 2 is on comp1
Fatal error in PMPI_Bcast: Other MPI error, error stack:
PMPI_Bcast(1306)..................: MPI_Bcast(buf=0x7ffff6b7465c, 
count=1, MPI_INT, root=0, MPI_COMM_WORLD) failed
MPIR_Bcast_impl(1150).............:
MPIR_Bcast_intra(1021)............:
MPIR_Bcast_binomial(187)..........:
MPIC_Send(66).....................:
MPIC_Wait(528)....................:
MPIDI_CH3I_Progress(333)..........:
MPID_nem_mpich2_blocking_recv(906):
MPID_nem_tcp_connpoll(1861).......: Communication error with rank 1:
APPLICATION TERMINATED WITH THE EXIT STRING: Hangup (signal 1)
----

It looks like a networking issue, but I can't figure out what it is.  I 
can ssh and rsh from one machine to the other without the need for a 
password and run commands (e.g., I can run "ssh comp2 hostname" on comp1 
and vice versa).
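
Since ssh only exercises port 22, a check along the following lines 
might help tell whether one machine can open a TCP connection to the 
other on an arbitrary high port, which is what the proxy on comp2 is 
attempting with comp1.  This is only a minimal sketch in Python using 
the standard socket module; the function names are made up for 
illustration, and the port has to be one on which something is actually 
listening on the other machine.  It also checks whether a host name 
resolves to a loopback address, which, as far as I understand, could 
send a connection to the wrong place.

```python
import ipaddress
import socket


def resolves_to_loopback(name):
    """Return True if `name` resolves to a loopback address (127.x.x.x)."""
    addr = socket.gethostbyname(name)
    return ipaddress.ip_address(addr).is_loopback


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and name-resolution failures.
        return False
```

For example, with a listener started by hand on comp1 on some port, 
calling can_connect("comp1", port) on comp2 would show whether the 
connection goes through, and resolves_to_loopback("comp1") called on 
comp1 itself would show whether its own name maps to 127.x.x.x.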

I seem to be at a dead end.  Any help on this issue will be greatly 
appreciated.

Saurav



