[mpich-discuss] MPID_nem_tcp_connpoll(1826): Communication error with rank 0: Connection refused

Miguel Angel Fernández mafga74 at hotmail.com
Sat Oct 22 15:46:44 CDT 2011


Hi Rajeev

Thanks for your response, but there is no firewall between them.

mpi0 is Ubuntu
mpi1 to mpi4 are Debian (minimal installation without a GUI)

I connect the machines to each other over ssh, which is configured correctly and works properly; I tested it.
I am wondering: are you using a socket connection to communicate between the different processes? That could be the problem on the Ubuntu machine (mpi0).
If this is the case, can you tell me exactly which socket you are using?
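
One check that could narrow this down is shown below. It rests on an assumption that nothing in this thread confirms: that the other ranks are given an address for mpi0 that resolves to a loopback range (for example 127.0.1.1 in /etc/hosts), which would make them try to connect to themselves and get "Connection refused". The function names are hypothetical, and the script only reports how each hostname resolves on the node where it runs; it does not touch MPICH itself.

```python
import ipaddress
import socket
import sys


def is_loopback(address):
    """Return True if the address lies in a loopback range such as 127.0.0.0/8."""
    return ipaddress.ip_address(address).is_loopback


def resolve_ipv4(hostname):
    """Return the distinct IPv4 addresses the local resolver gives for hostname."""
    infos = socket.getaddrinfo(hostname, None, socket.AF_INET, socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})


def report(hostnames):
    """Build one line per resolved address, flagging loopback results."""
    lines = []
    for name in hostnames:
        try:
            addresses = resolve_ipv4(name)
        except socket.gaierror as exc:
            lines.append(f"{name}: does not resolve ({exc})")
            continue
        for address in addresses:
            if is_loopback(address):
                note = "loopback, not reachable from other nodes"
            else:
                note = "non-loopback"
            lines.append(f"{name}: {address} ({note})")
    return lines


if __name__ == "__main__":
    # Hostnames come from the command line; default to this machine's own name.
    names = sys.argv[1:] or [socket.gethostname()]
    for line in report(names):
        print(line)
```

Running it on every node with the names from the machinefile as arguments (mpi0 to mpi4) would show whether any node sees itself or mpi0 under a loopback address. A non-loopback result everywhere would rule this assumption out.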

Thanks
Miguel Angel


> From: thakur at mcs.anl.gov
> Date: Sat, 22 Oct 2011 15:30:26 -0500
> To: mpich-discuss at mcs.anl.gov
> Subject: Re: [mpich-discuss] MPID_nem_tcp_connpoll(1826): Communication error with rank 0: Connection refused
> 
> Make sure the 5 machines can communicate with each other, i.e., there is no firewall preventing connections.
> 
> Rajeev
> 
> On Oct 22, 2011, at 12:36 PM, Miguel Angel Fernández wrote:
> 
> > Hello everybody
> > 
> > I'm trying to fix a problem that appears when I execute one of the mpich2 example programs.
> > As you can see, if I execute a normal command there are no problems. The cluster works properly.
> > 
> > mpi at mpi0:~$ mpiexec -f ./mpich2-install/machinefile -n 5 hostname
> > mpi0
> > mpi2
> > mpi3
> > mpi1
> > mpi4
> > mpi at mpi0:~$
> > 
> > But when I try to execute the program, the result is something like this:
> > 
> > mpi at mpi0:~$ mpiexec -f ./mpich2-install/machinefile -n 5 /home/mpi/mpich2-install/workspace/Prueba/Debug/Prueba
> > Hello MPI World the original.
> > Hello MPI World the original.
> > Hello MPI World the original.
> > Hello MPI World the original.
> > Hello MPI World the original.
> > From process 0: Num processes: 5
> > Fatal error in MPI_Send: Other MPI error, error stack:
> > MPI_Send(173)..............: MPI_Send(buf=0xbfcbe268, count=26, MPI_CHAR, dest=0, tag=0, MPI_COMM_WORLD) failed
> > MPID_nem_tcp_connpoll(1826): Communication error with rank 0: Connection refused
> > Fatal error in MPI_Send: Other MPI error, error stack:
> > MPI_Send(173)..............: MPI_Send(buf=0xbfb32ca8, count=26, MPI_CHAR, dest=0, tag=0, MPI_COMM_WORLD) failed
> > MPID_nem_tcp_connpoll(1826): Communication error with rank 0: Connection refused
> > Fatal error in MPI_Send: Other MPI error, error stack:
> > MPI_Send(173)..............: MPI_Send(buf=0xbfa49e98, count=26, MPI_CHAR, dest=0, tag=0, MPI_COMM_WORLD) failed
> > MPID_nem_tcp_connpoll(1826): Communication error with rank 0: Connection refused
> > Fatal error in MPI_Send: Other MPI error, error stack:
> > MPI_Send(173)..............: MPI_Send(buf=0xbfa57538, count=26, MPI_CHAR, dest=0, tag=0, MPI_COMM_WORLD) failed
> > MPID_nem_tcp_connpoll(1826): Communication error with rank 0: Connection refused
> > 
> > Do you have any idea what the problem could be?
> > 
> > Thank you in advance
> > Miguel Angel
> > 
> > _______________________________________________
> > mpich-discuss mailing list     mpich-discuss at mcs.anl.gov
> > To manage subscription options or unsubscribe:
> > https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss