<div>Hello,</div><div><br></div><div>I am trying to install MPICH2 on our department machines. I can run a simple hello-world example (no MPI_Send). However, when I run an MPI program that requires an MPI_Send (or anything else that needs a TCP connection), it fails with the error below. The example is a simple hello-world program using an MPI_Send:</div>
<div><br></div><div><div>cody$ mpiexec -n 2 -hosts host1,host2 ./networld</div><div>Hello world (Rank: 0 / Host: host1)</div><div>Hello world (Rank: 1 / Host: host2)</div></div><div>Fatal error in MPI_Send: Other MPI error, error stack:</div>
<div>MPI_Send(173)..............: MPI_Send(buf=0x7fff26b4cb80, count=50, MPI_CHARACTER, dest=0, tag=1, MPI_COMM_WORLD) failed</div><div>MPID_nem_tcp_connpoll(1826): Communication error with rank 0: Connection refused</div>
<div><br></div><div><br></div><div>We have determined there is no firewall between the machines, and passwordless ssh is set up, etc. I can telnet into the hydra daemon from the 2nd host. Interestingly, I can install OpenMPI, and it works fine.</div>
<div>It runs fine on a single host (even if I run it purely on the remote host2 from the local host1, it works). It only fails when we use two or more hosts, so that it needs to make a TCP connection between them.</div><div><br></div><div>For some reason MPICH2 can't seem to get the connection info it needs to make the TCP connection between the machines.</div>
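<div><br></div><div>One check that might help narrow this down is to see what address each machine's own hostname resolves to. This is only a sketch, and it assumes (not confirmed for this setup) that the refused connection could come from a hostname resolving to a loopback address such as 127.0.0.x, which other hosts cannot reach. The helper names <code>resolve</code> and <code>loopback_addresses</code> are hypothetical and are not part of MPICH2.</div>

```python
import ipaddress
import socket


def loopback_addresses(addresses):
    """Return the subset of the given IP address strings that are loopback."""
    result = []
    for addr in addresses:
        # Drop any IPv6 scope id (e.g. "fe80::1%eth0") before parsing.
        bare = addr.split("%")[0]
        if ipaddress.ip_address(bare).is_loopback:
            result.append(addr)
    return result


def resolve(hostname):
    """Return the sorted, de-duplicated IP addresses a hostname resolves to."""
    infos = socket.getaddrinfo(hostname, None)
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    # Run this on each host; it inspects only the local machine's own name.
    name = socket.gethostname()
    try:
        addrs = resolve(name)
    except OSError as exc:
        print(f"{name}: does not resolve ({exc})")
    else:
        print(f"{name} resolves to {addrs}")
        bad = loopback_addresses(addrs)
        if bad:
            print(f"warning: loopback address(es) {bad}; "
                  "other hosts cannot connect to these")
```

<div>If a warning is printed on either host, the name-to-address mapping on that machine (for example in /etc/hosts) would be the first thing to look at. If both hosts resolve to their real network addresses, this assumption does not explain the failure.</div>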
<div><br></div><div>I am not too sure if there is much info you guys can give. I was just curious if you have seen or heard of this before. The system is an "openSUSE 11.4 (x86_64)". The MPICH2 version is 1.4.1p1.</div>
<div><br></div><div>--</div>Cody R. Brown<br> UBC Department of Computer Science<br> 201-2366 Main Mall, Vancouver, BC, V6T 1Z4<br> Office: ICCS x409<br>