[mpich2-dev] Problem making a TCP Connection

Cody R. Brown cody at cs.ubc.ca
Tue Nov 29 18:04:33 CST 2011


Hello,

I am trying to install MPICH2 on our department machines. I can run a
simple helloworld example (no MPI_Send). However, when I run an MPI program
that requires an MPI_Send (or any other TCP connection), it fails with the
error below. The example is a simple helloworld program that uses MPI_Send:

cody$ mpiexec -n 2 -hosts host1,host2 ./networld
Hello world (Rank: 0 / Host: host1)
Hello world (Rank: 1 / Host: host2)
Fatal error in MPI_Send: Other MPI error, error stack:
MPI_Send(173)..............: MPI_Send(buf=0x7fff26b4cb80, count=50, MPI_CHARACTER, dest=0, tag=1, MPI_COMM_WORLD) failed
MPID_nem_tcp_connpoll(1826): Communication error with rank 0: Connection refused


We have determined that there is no firewall between the machines, and
passwordless ssh is set up, etc. I can telnet into the hydra daemon from the
2nd host. Interestingly, I can install OpenMPI, and it works fine.
MPICH2 also runs fine on a single host (even if I run it purely on the remote
host2 from the local host1, it works). It fails only when we use 2+ hosts, so
that it needs to make the TCP connection.

For some reason, MPICH2 can't seem to get the TCP connection info it needs to
make the TCP connection between the machines.
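
To narrow down a "Connection refused" like the one above, a small check along
these lines might help. It is only a sketch, not part of MPICH2: the function
names are made up for illustration, and the idea that a host name resolving to
a loopback address could be involved is an assumption that has not been
confirmed for these machines.

```python
import ipaddress
import socket


def resolves_to_loopback(hostname):
    """Return True if any address the name resolves to is a loopback address.

    Hypothetical helper: if a host's own name maps to 127.x.x.x, it may
    advertise an address that other hosts cannot reach.
    """
    for info in socket.getaddrinfo(hostname, None):
        addr = info[4][0].split("%")[0]  # drop any IPv6 scope id
        if ipaddress.ip_address(addr).is_loopback:
            return True
    return False


def can_connect(host, port, timeout=3.0):
    """Return True if a plain TCP connection to (host, port) succeeds.

    Hypothetical helper: a refused or timed-out connection returns False.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, running resolves_to_loopback(socket.gethostname()) on each host
would show whether that host's own name maps to a loopback address, and
can_connect("host1", port) from host2 would test a listening port of your
choosing (the port number has to be supplied; none is known from the output
above).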

I am not sure how much info you can give based on this. I was just curious
whether you have seen or heard of this before. The system is "openSUSE 11.4
(x86_64)". The MPICH2 version is 1.4.1p1.

--
Cody R. Brown
  UBC Department of Computer Science
  201-2366 Main Mall, Vancouver, BC, V6T 1Z4
  Office: ICCS x409