[mpich-discuss] MPID_nem_tcp_connpoll(1826): Communication error with rank 0: Connection refused

Rajeev Thakur thakur at mcs.anl.gov
Sat Oct 22 15:30:26 CDT 2011


Make sure the 5 machines can communicate with each other, i.e., there is no firewall preventing connections.
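
A rough way to check this is to probe TCP connectivity from each node to the others. The sketch below is only an illustration, not part of MPICH: the hostnames come from the mpiexec output quoted below, while PROBE_PORT is a hypothetical value, and something must be listening on that port on the target machine for the probe to succeed. As far as I know MPICH picks its TCP ports dynamically, so a firewall that only lets the process launcher through could allow `hostname` to run and still block MPI_Send traffic.

```python
import socket

# Hostnames taken from the mpiexec output in the quoted message.
HOSTS = ["mpi0", "mpi1", "mpi2", "mpi3", "mpi4"]

# Hypothetical port used only for this probe (an assumption, not an MPICH
# setting). Start a test listener on it on the target node beforehand.
PROBE_PORT = 5000


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    # Run this on every node so each direction gets checked.
    for host in HOSTS:
        status = "ok" if can_connect(host, PROBE_PORT) else "FAILED"
        print(f"{host}:{PROBE_PORT} {status}")
```

A FAILED result for a node where a listener is known to be running on that port suggests that something between the two machines is rejecting the connection.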

Rajeev

On Oct 22, 2011, at 12:36 PM, Miguel Angel Fernández wrote:

> Hello everybody
> 
> I'm trying to fix a problem that appears when I run one of the mpich2 example programs.
> As you can see, if I run an ordinary command there are no problems. The cluster works properly.
> 
> mpi@mpi0:~$ mpiexec -f ./mpich2-install/machinefile -n 5 hostname
> mpi0
> mpi2
> mpi3
> mpi1
> mpi4
> mpi@mpi0:~$
> 
> But when I try to run the program, the output looks like this:
> 
> mpi@mpi0:~$ mpiexec -f ./mpich2-install/machinefile -n 5 /home/mpi/mpich2-install/workspace/Prueba/Debug/Prueba
> Hello MPI World the original.
> Hello MPI World the original.
> Hello MPI World the original.
> Hello MPI World the original.
> Hello MPI World the original.
> From process 0: Num processes: 5
> Fatal error in MPI_Send: Other MPI error, error stack:
> MPI_Send(173)..............: MPI_Send(buf=0xbfcbe268, count=26, MPI_CHAR, dest=0, tag=0, MPI_COMM_WORLD) failed
> MPID_nem_tcp_connpoll(1826): Communication error with rank 0: Connection refused
> Fatal error in MPI_Send: Other MPI error, error stack:
> MPI_Send(173)..............: MPI_Send(buf=0xbfb32ca8, count=26, MPI_CHAR, dest=0, tag=0, MPI_COMM_WORLD) failed
> MPID_nem_tcp_connpoll(1826): Communication error with rank 0: Connection refused
> Fatal error in MPI_Send: Other MPI error, error stack:
> MPI_Send(173)..............: MPI_Send(buf=0xbfa49e98, count=26, MPI_CHAR, dest=0, tag=0, MPI_COMM_WORLD) failed
> MPID_nem_tcp_connpoll(1826): Communication error with rank 0: Connection refused
> Fatal error in MPI_Send: Other MPI error, error stack:
> MPI_Send(173)..............: MPI_Send(buf=0xbfa57538, count=26, MPI_CHAR, dest=0, tag=0, MPI_COMM_WORLD) failed
> MPID_nem_tcp_connpoll(1826): Communication error with rank 0: Connection refused
> 
> Do you have any idea what the problem could be?
> 
> Thank you in advance
> Miguel Angel


