[mpich-discuss] FW: MPID_nem_tcp_connpoll(1826): Communication error with rank 0: Connection refused
Miguel Angel Fernández
mafga74 at hotmail.com
Fri Nov 4 03:57:23 CDT 2011
Hi Pavan,
I thought I had answered your email...
anyway, I'm doing too many things at the same time ;-)
Yes, all machines can communicate with each other.
Attached you will find the output of the
commands "configure", "make" and "make install"
for users "mpi" and "root".
I sent one document describing the configuration I am using to your
personal email, "balaji at mcs.anl.gov". Maybe you can find
what I am doing wrong.
Thanks for your time,
Miguel Angel
> Date: Thu, 3 Nov 2011 18:40:24 -0500
> From: balaji at mcs.anl.gov
> To: mpich-discuss at mcs.anl.gov
> CC: mafga74 at hotmail.com
> Subject: Re: [mpich-discuss] MPID_nem_tcp_connpoll(1826): Communication error with rank 0: Connection refused
>
> Hi,
>
> You never responded to the previous email I sent:
>
> Is every machine able to connect to every other machine (not just mpi0
> to every other machine)?
>
> Btw, Hydra doesn't need any additional password like mpd, but you need
> to make sure that you can ssh to every machine and that every machine
> can connect to every other machine.
>
> -- Pavan
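One way to check the "every machine to every other machine" condition is to try a nested ssh for each ordered pair of hosts. The sketch below is only an illustration under some assumptions: the function names are hypothetical, the host names are the ones printed by the `hostname` run quoted further down, and passwordless ssh is assumed to be set up for the user running it. A successful ssh only shows that the ssh port is reachable; a firewall could still block the other TCP ports that the MPI processes use to talk to each other.

```python
import itertools
import subprocess


def ordered_pairs(hosts):
    """Every (source, destination) pair with source != destination."""
    return list(itertools.permutations(hosts, 2))


def ssh_hop_command(src, dst):
    """Command that logs in to src and, from there, runs hostname on dst."""
    # BatchMode makes ssh fail instead of prompting for a password.
    opts = ["-o", "BatchMode=yes", "-o", "ConnectTimeout=5"]
    return ["ssh", *opts, src, "ssh", *opts, dst, "hostname"]


def failing_pairs(hosts, run=subprocess.run):
    """Return the (src, dst) pairs whose nested ssh exits with a non-zero status."""
    failed = []
    for src, dst in ordered_pairs(hosts):
        result = run(ssh_hop_command(src, dst), capture_output=True, text=True)
        if result.returncode != 0:
            failed.append((src, dst))
    return failed
```

Calling `failing_pairs(["mpi0", "mpi1", "mpi2", "mpi3", "mpi4"])` from the head node would try all 20 ordered pairs and return the ones that did not work; an empty list means every nested ssh succeeded.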
>
> On 11/03/2011 02:53 AM, Miguel Angel Fernández wrote:
> > Hello again
> >
> > I'm still trying to fix the problem I told you about days ago.
> > I configured all cluster machines to execute mpiexec as root in all
> > cases, and the problem is the same.
> > Is it necessary to configure a password for hydra as I had to do for mpd?
> >
> > Thank you in advance
> > Miguel Ángel
> >
> > > From: thakur at mcs.anl.gov
> > > Date: Sat, 22 Oct 2011 15:30:26 -0500
> > > To: mpich-discuss at mcs.anl.gov
> > > Subject: Re: [mpich-discuss] MPID_nem_tcp_connpoll(1826): Communication error with rank 0: Connection refused
> > >
> > > Make sure the 5 machines can communicate with each other, i.e., there is no firewall preventing connections.
> > >
> > > Rajeev
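A plain TCP connection attempt can help tell "refused" apart from "filtered". "Connection refused" usually means the target host answered but nothing accepted the connection on that address and port (or a firewall actively rejected it), while a firewall that silently drops packets tends to show up as a timeout. The following is a minimal sketch; the function name and the status labels are hypothetical, and the port is whatever test listener is started on the target, since the ports the MPI processes pick at run time are not known in advance.

```python
import socket


def tcp_status(host, port, timeout=3.0):
    """Classify one TCP connection attempt as 'open', 'refused', or 'unreachable'."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host replied, but nothing accepted the connection on this port.
        return "refused"
    except OSError:
        # Covers timeouts and name-resolution or routing failures.
        return "unreachable"
```

For example, with a test listener started on some port on mpi0, running `tcp_status("mpi0", port)` from each of the other nodes would show whether that node can open a TCP connection to it.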
> > >
> > > On Oct 22, 2011, at 12:36 PM, Miguel Angel Fernández wrote:
> > >
> > > > Hello everybody
> > > >
> > > > > I'm trying to fix a problem that appears when I execute one of the mpich2 example programs.
> > > > > As you can see, if I execute a normal command there are no problems. The cluster works properly.
> > > >
> > > > > mpi@mpi0:~$ mpiexec -f ./mpich2-install/machinefile -n 5 hostname
> > > > > mpi0
> > > > > mpi2
> > > > > mpi3
> > > > > mpi1
> > > > > mpi4
> > > > > mpi@mpi0:~$
> > > >
> > > > > But when I try to execute the program, the result is something like this:
> > > > >
> > > > > mpi@mpi0:~$ mpiexec -f ./mpich2-install/machinefile -n 5 /home/mpi/mpich2-install/workspace/Prueba/Debug/Prueba
> > > > Hello MPI World the original.
> > > > Hello MPI World the original.
> > > > Hello MPI World the original.
> > > > Hello MPI World the original.
> > > > Hello MPI World the original.
> > > > From process 0: Num processes: 5
> > > > > Fatal error in MPI_Send: Other MPI error, error stack:
> > > > > MPI_Send(173)..............: MPI_Send(buf=0xbfcbe268, count=26, MPI_CHAR, dest=0, tag=0, MPI_COMM_WORLD) failed
> > > > > MPID_nem_tcp_connpoll(1826): Communication error with rank 0: Connection refused
> > > > > Fatal error in MPI_Send: Other MPI error, error stack:
> > > > > MPI_Send(173)..............: MPI_Send(buf=0xbfb32ca8, count=26, MPI_CHAR, dest=0, tag=0, MPI_COMM_WORLD) failed
> > > > > MPID_nem_tcp_connpoll(1826): Communication error with rank 0: Connection refused
> > > > > Fatal error in MPI_Send: Other MPI error, error stack:
> > > > > MPI_Send(173)..............: MPI_Send(buf=0xbfa49e98, count=26, MPI_CHAR, dest=0, tag=0, MPI_COMM_WORLD) failed
> > > > > MPID_nem_tcp_connpoll(1826): Communication error with rank 0: Connection refused
> > > > > Fatal error in MPI_Send: Other MPI error, error stack:
> > > > > MPI_Send(173)..............: MPI_Send(buf=0xbfa57538, count=26, MPI_CHAR, dest=0, tag=0, MPI_COMM_WORLD) failed
> > > > > MPID_nem_tcp_connpoll(1826): Communication error with rank 0: Connection refused
> > > >
> > > > > Do you have any idea what the problem could be?
> > > >
> > > > Thank you in advance
> > > > Miguel Angel
> > > >
> > > > _______________________________________________
> > > > mpich-discuss mailing list mpich-discuss at mcs.anl.gov
> > > > To manage subscription options or unsubscribe:
> > > > https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss
>
> --
> Pavan Balaji
> http://www.mcs.anl.gov/~balaji
-------------- next part --------------
A non-text attachment was scrubbed...
Name: configure.out
Type: application/octet-stream
Size: 115008 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/mpich-discuss/attachments/20111104/6909f143/attachment-0006.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: configure_root.out
Type: application/octet-stream
Size: 115792 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/mpich-discuss/attachments/20111104/6909f143/attachment-0007.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: make.out
Type: application/octet-stream
Size: 115274 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/mpich-discuss/attachments/20111104/6909f143/attachment-0008.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: make_install.out
Type: application/octet-stream
Size: 89159 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/mpich-discuss/attachments/20111104/6909f143/attachment-0009.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: make_install_root.out
Type: application/octet-stream
Size: 103421 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/mpich-discuss/attachments/20111104/6909f143/attachment-0010.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: make_root.out
Type: application/octet-stream
Size: 45525 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/mpich-discuss/attachments/20111104/6909f143/attachment-0011.obj>