[mpich-discuss] MPID_nem_tcp_connpoll(1826): Communication error with rank 0: Connection refused

Pavan Balaji balaji at mcs.anl.gov
Thu Nov 3 18:40:24 CDT 2011


Hi,

You never responded to the previous email I sent:

Is every machine able to connect to every other machine (not just mpi0
to every other machine)?

By the way, unlike mpd, Hydra doesn't need an additional password, but
you need to make sure that you can ssh to every machine and that every
machine can connect to every other machine.
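
A minimal sketch of that all-pairs check, assuming passwordless ssh is already set up and that the hosts are the five names shown in the `hostname` output quoted below (mpi0 through mpi4). The helper names host_pairs, check_command and run_checks are made up for illustration. It only exercises ssh logins, so a firewall could still block the other TCP ports that the MPI processes use to talk to each other.

```python
import itertools
import subprocess

# Assumption: these match the entries in your machinefile.
HOSTS = ["mpi0", "mpi1", "mpi2", "mpi3", "mpi4"]


def host_pairs(hosts):
    """Return every ordered (source, destination) pair of distinct hosts."""
    return list(itertools.permutations(hosts, 2))


def check_command(src, dst):
    """Build a command that logs in to src and, from there, logs in to dst."""
    # BatchMode=yes makes ssh fail instead of prompting for a password.
    return ["ssh", "-o", "BatchMode=yes", src,
            "ssh", "-o", "BatchMode=yes", dst, "hostname"]


def run_checks(hosts, timeout=15):
    """Try every pair and return the (src, dst) pairs that failed."""
    failures = []
    for src, dst in host_pairs(hosts):
        try:
            result = subprocess.run(check_command(src, dst),
                                    capture_output=True, text=True,
                                    timeout=timeout)
            ok = result.returncode == 0
        except (subprocess.TimeoutExpired, OSError):
            # Timed out, or the local ssh client could not be started.
            ok = False
        if not ok:
            failures.append((src, dst))
    return failures
```

Calling run_checks(HOSTS) from the machine where you launch mpiexec returns a list of failing (source, destination) pairs; an empty list means every machine could ssh to every other machine.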

  -- Pavan

On 11/03/2011 02:53 AM, Miguel Angel Fernández wrote:
> Hello again
>
> I'm still trying to fix the problem I told you about days ago.
> I configured all the cluster machines to run mpiexec as root in every
> case, and the problem is the same.
> Is it necessary to configure a password for Hydra, as I had to do for mpd?
>
> Thank you in advance
> Miguel Ángel
>
>  > From: thakur at mcs.anl.gov
>  > Date: Sat, 22 Oct 2011 15:30:26 -0500
>  > To: mpich-discuss at mcs.anl.gov
>  > Subject: Re: [mpich-discuss] MPID_nem_tcp_connpoll(1826): Communication error with rank 0: Connection refused
>  >
>  > Make sure the 5 machines can communicate with each other, i.e., there
>  > is no firewall preventing connections.
>  >
>  > Rajeev
>  >
>  > On Oct 22, 2011, at 12:36 PM, Miguel Angel Fernández wrote:
>  >
>  > > Hello everybody
>  > >
>  > > I'm trying to fix a problem that appears when I execute one of the
>  > > mpich2 example programs.
>  > > As you can see, if I execute a normal command there are no
>  > > problems. The cluster works properly.
>  > >
>  > > mpi@mpi0:~$ mpiexec -f ./mpich2-install/machinefile -n 5 hostname
>  > > mpi0
>  > > mpi2
>  > > mpi3
>  > > mpi1
>  > > mpi4
>  > > mpi@mpi0:~$
>  > >
>  > > But when I try to execute the program, the results are something
>  > > like this:
>  > >
>  > > mpi@mpi0:~$ mpiexec -f ./mpich2-install/machinefile -n 5 /home/mpi/mpich2-install/workspace/Prueba/Debug/Prueba
>  > > Hello MPI World the original.
>  > > Hello MPI World the original.
>  > > Hello MPI World the original.
>  > > Hello MPI World the original.
>  > > Hello MPI World the original.
>  > > From process 0: Num processes: 5
>  > > Fatal error in MPI_Send: Other MPI error, error stack:
>  > > MPI_Send(173)..............: MPI_Send(buf=0xbfcbe268, count=26, MPI_CHAR, dest=0, tag=0, MPI_COMM_WORLD) failed
>  > > MPID_nem_tcp_connpoll(1826): Communication error with rank 0: Connection refused
>  > > Fatal error in MPI_Send: Other MPI error, error stack:
>  > > MPI_Send(173)..............: MPI_Send(buf=0xbfb32ca8, count=26, MPI_CHAR, dest=0, tag=0, MPI_COMM_WORLD) failed
>  > > MPID_nem_tcp_connpoll(1826): Communication error with rank 0: Connection refused
>  > > Fatal error in MPI_Send: Other MPI error, error stack:
>  > > MPI_Send(173)..............: MPI_Send(buf=0xbfa49e98, count=26, MPI_CHAR, dest=0, tag=0, MPI_COMM_WORLD) failed
>  > > MPID_nem_tcp_connpoll(1826): Communication error with rank 0: Connection refused
>  > > Fatal error in MPI_Send: Other MPI error, error stack:
>  > > MPI_Send(173)..............: MPI_Send(buf=0xbfa57538, count=26, MPI_CHAR, dest=0, tag=0, MPI_COMM_WORLD) failed
>  > > MPID_nem_tcp_connpoll(1826): Communication error with rank 0: Connection refused
>  > >
>  > > Do you have any idea what the problem could be?
>  > >
>  > > Thank you in advance
>  > > Miguel Angel
>  > >
>  > > _______________________________________________
>  > > mpich-discuss mailing list mpich-discuss at mcs.anl.gov
>  > > To manage subscription options or unsubscribe:
>  > > https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss

-- 
Pavan Balaji
http://www.mcs.anl.gov/~balaji
