[MPICH] ssh failed to connect
Jorge Gonzalez
jag2kn at gmail.com
Wed Jul 25 16:56:01 CDT 2007
On 7/25/07, Reuti <reuti at staff.uni-marburg.de> wrote:
>
> Hi,
>
> What about removing the 127.0.0.2 entry from /etc/hosts and giving
> server2 a sensible address therein? Maybe the mpd started on server4
> tries to connect to the sender, i.e. 127.0.0.2, which will fail.
I removed this line, but I can't see any change :S
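
The /etc/hosts printed in the quoted mpdcheck -pc output below maps server2 to 127.0.0.2 on two separate lines, so removing only one of them might leave the name resolving to a loopback address. A minimal Python sketch for checking this is given here; the function names loopback_entries and report are hypothetical helpers, not part of MPICH2 or mpdcheck, and whether the second entry is the actual cause is only an assumption.

```python
import ipaddress
import socket


def loopback_entries(hosts_text, name):
    """Return (line_number, address) pairs in hosts-file text that map
    `name` to a loopback address (127.0.0.0/8 or ::1)."""
    hits = []
    for lineno, line in enumerate(hosts_text.splitlines(), start=1):
        # Drop trailing comments, then split into address + names.
        fields = line.split("#", 1)[0].split()
        if len(fields) < 2:
            continue  # blank line, comment, or address without names
        address, names = fields[0], fields[1:]
        try:
            is_loopback = ipaddress.ip_address(address).is_loopback
        except ValueError:
            continue  # not a parseable IP address (e.g. a masked one)
        if is_loopback and name in names:
            hits.append((lineno, address))
    return hits


def report(hosts_path="/etc/hosts"):
    """Print loopback mappings for this host and what the resolver returns.
    Hypothetical helper; the path is an assumption for a typical Linux box."""
    host = socket.gethostname()
    with open(hosts_path) as f:
        for lineno, address in loopback_entries(f.read(), host):
            print(f"{hosts_path} line {lineno}: {host} -> {address}")
    # Same call that mpdcheck shows in its output: gethostbyname_ex
    print(host, "resolves to", socket.gethostbyname_ex(host)[2])
```

Calling report() on server2 after editing /etc/hosts would show whether any loopback mapping for the host name is still present and what address the resolver now returns.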
> -- Reuti
>
> Am 25.07.2007 um 02:12 schrieb Jorge Gonzalez:
>
> >
> >
> > On 7/24/07, Anthony Chan <chan at mcs.anl.gov> wrote:
> >
> > Did you try using "mpdcheck" to check whether the other network
> > settings are OK, as described in the MPICH2 User's Guide?
> >
> > Hi, thanks for the answer.
> > Sorry for the long mail :P
> >
> > I ran the check on the "server2" machine. The .mpd.hosts file contains:
> > server4
> > server2
> >
> > This is the output:
> >
> > administrador at server2:~> mpdcheck
> > *** first ipaddr for this host (via server2) is: 127.0.0.2
> >
> > administrador at server2:~> mpdcheck -pc
> > --- print results of: gethostbyname_ex(gethostname())
> > ('server2', [], ['127.0.0.2'])
> > --- try to run /bin/hostname
> > server2
> > --- try to run uname -a
> > Linux server2 2.6.18.2-34-default #1 SMP Mon Nov 27 11:46:27 UTC 2006 x86_64 x86_64 x86_64 GNU/Linux
> > --- try to print /etc/hosts
> > #
> > # hosts This file describes a number of hostname-to-address
> > # mappings for the TCP/IP subsystem. It is mostly
> > # used at boot time, when no name servers are running.
> > # On small systems, this file can be used instead of a
> > # "named" name server.
> > # Syntax:
> > #
> > # IP-Address Full-Qualified-Hostname Short-Hostname
> > #
> >
> > 127.0.0.2 server2
> > 127.0.0.1 localhost
> > XXX.XXX.123.25 server4
> > XXX.XXX.122.50 server1
> >
> > # special IPv6 addresses
> > ::1 localhost ipv6-localhost ipv6-loopback
> >
> > fe00::0 ipv6-localnet
> >
> > ff00::0 ipv6-mcastprefix
> > ff02::1 ipv6-allnodes
> > ff02::2 ipv6-allrouters
> > ff02::3 ipv6-allhosts
> > 127.0.0.2 server2 server2
> >
> > --- try to print /etc/resolv.conf
> > ### BEGIN INFO
> > #
> > # Modified_by: dhcpcd
> > # Backup: /etc/resolv.conf.saved.by.dhcpcd.eth0
> > # Process: dhcpcd
> > # Process_id: 3847
> > # Script: /sbin/modify_resolvconf
> > # Saveto:
> > # Info: This is a temporary resolv.conf created by service dhcpcd.
> > # The previous file has been saved and will be restored later.
> > #
> > # If you don't like your resolv.conf to be changed, you
> > # can set MODIFY_{RESOLV,NAMED}_CONF_DYNAMICALLY=no. This
> > # variables are placed in /etc/sysconfig/network/config.
> > #
> > # You can also configure service dhcpcd not to modify it.
> > #
> > # If you don't like dhcpcd to change your nameserver
> > # settings
> > # then either set DHCLIENT_MODIFY_RESOLV_CONF=no
> > # in /etc/sysconfig/network/dhcp, or
> > # set MODIFY_RESOLV_CONF_DYNAMICALLY=no in
> > # /etc/sysconfig/network/config or (manually) use dhcpcd
> > # with -R. If you only want to keep your searchlist, set
> > # DHCLIENT_KEEP_SEARCHLIST=yes in /etc/sysconfig/network/dhcp or
> > # (manually) use the -K option.
> > #
> > ### END INFO
> > search XXX.XXX.160.17 XXX.XXX.18 XXX.XXX.160.22 XXX.XXX.160.23
> > nameserver XXX.XXX.160.17
> > nameserver XXX.XXX.160.18
> > nameserver XXX.XXX.160.22
> > nameserver XXX.XXX.160.23
> > --- try to run /sbin/ifconfig -a
> > eth0 Link encap:Ethernet HWaddr 00:18:8B:1E:1F:D6
> > inet addr:XXX.XXX.123.136 Bcast:XXX.XXX.123.255 Mask:255.255.254.0
> > inet6 addr: fe80::218:8bff:fe1e:1fd6/64 Scope:Link
> > UP BROADCAST NOTRAILERS RUNNING MULTICAST MTU:1500 Metric:1
> > RX packets:18300 errors:3 dropped:0 overruns:0 frame:4
> > TX packets:890 errors:0 dropped:0 overruns:0 carrier:0
> > collisions:0 txqueuelen:1000
> > RX bytes:2217723 (2.1 Mb) TX bytes:134240 (131.0 Kb)
> > Interrupt:169
> >
> > lo Link encap:Local Loopback
> > inet addr:127.0.0.1 Mask: 255.0.0.0
> > inet6 addr: ::1/128 Scope:Host
> > UP LOOPBACK RUNNING MTU:16436 Metric:1
> > RX packets:166 errors:0 dropped:0 overruns:0 frame:0
> > TX packets:166 errors:0 dropped:0 overruns:0 carrier:0
> > collisions:0 txqueuelen:0
> > RX bytes:20311 (19.8 Kb) TX bytes:20311 (19.8 Kb)
> >
> > sit0 Link encap:IPv6-in-IPv4
> > NOARP MTU:1480 Metric:1
> > RX packets:0 errors:0 dropped:0 overruns:0 frame:0
> > TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
> > collisions:0 txqueuelen:0
> > RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)
> >
> > --- try to print /etc/nsswitch.conf
> > #
> > # /etc/nsswitch.conf
> > #
> > # An example Name Service Switch config file. This file should be
> > # sorted with the most-used services at the beginning.
> > #
> > # The entry '[NOTFOUND=return]' means that the search for an
> > # entry should stop if the search in the previous entry turned
> > # up nothing. Note that if the search failed due to some other reason
> > # (like no NIS server responding) then the search continues with the
> > # next entry.
> > #
> > # Legal entries are:
> > #
> > # compat Use compatibility setup
> > # nisplus Use NIS+ (NIS version 3)
> > # nis Use NIS (NIS version 2), also called YP
> > # dns Use DNS (Domain Name Service)
> > # files Use the local files
> > # [NOTFOUND=return] Stop searching if not found so far
> > #
> > # For more information, please read the nsswitch.conf.5 manual page.
> > #
> >
> > # passwd: files nis
> > # shadow: files nis
> > # group: files nis
> >
> > passwd: compat
> > group: compat
> >
> > hosts: files dns
> > networks: files dns
> >
> > services: files
> > protocols: files
> > rpc: files
> > ethers: files
> > netmasks: files
> > netgroup: files nis
> > publickey: files
> >
> > bootparams: files
> > automount: files nis
> > aliases: files
> >
> >
> > administrador at server2:~> mpdcheck -f .mpd.hosts -ssh -v
> > obtaining hostname via gethostname and getfqdn
> > gethostname gives server2
> > getfqdn gives server2
> > checking out unqualified hostname; make sure is not "localhost", etc.
> > checking out qualified hostname; make sure is not "localhost", etc.
> > obtain IP addrs via qualified and unqualified hostnames; make sure other than 127.0.0.1
> > gethostbyname_ex: ('server2', [], ['127.0.0.2'])
> > *** first ipaddr for this host (via server2) is: 127.0.0.2
> > gethostbyname_ex: ('server2', [], ['127.0.0.2'])
> > checking that IP addrs resolve to same host
> > now do some gethostbyaddr and gethostbyname_ex for machines in hosts file
> > checking gethostbyXXX for unqualified server4
> > gethostbyname_ex: ('server4', [], ['XXX.XXX.123.25'])
> > checking gethostbyXXX for qualified server4
> > gethostbyname_ex: ('server4', [], ['XXX.XXX.123.25'])
> > checking gethostbyXXX for unqualified server2
> > gethostbyname_ex: ('server2', [], ['127.0.0.2'])
> > checking gethostbyXXX for qualified server2
> > gethostbyname_ex: ('server2', [], ['127.0.0.2'])
> > trying: ssh server4 -x -n /bin/echo hello
> > trying: ssh server2 -x -n /bin/echo hello
> > starting server: /usr/local/bin/mpdcheck.py -s
> > starting client: ssh server4 -x -n /usr/local/bin/mpdcheck.py -c server2 25734
> > ** timed out waiting for client on server4 to produce output
> > client on server4 failed to access the server
> >
> > Afterwards I tried
> > administrador at server2:~> ssh server4 -x -n /bin/echo helloJorge
> >
> > and the output is
> > helloJorge
> >
> >
> > administrador at server2:~> mpdboot -f .mpd.hosts -n 2
> > mpdboot_server2 (handle_mpd_output 383): failed to connect to mpd on server4
> > administrador at server2:~> mpdboot -f .mpd.hosts -n 2 -v
> > running mpdallexit on server2
> > LAUNCHED mpd on server2 via
> > RUNNING: mpd on server2
> > LAUNCHED mpd on server4 via server2
> > mpdboot_server2 (handle_mpd_output 383): failed to connect to mpd on server4
> >
> >
> > I don't know why the access failed :S
> >
> > thanks
> >
> > On Mon, 23 Jul 2007, Jorge Gonzalez wrote:
> >
> > > Hi all
> > >
> > > I'm configuring a cluster of two PCs using SUSE 10.2 x64,
> > > MPICH2-1.0.5p4 and OpenSSH_4.4p1.
> > >
> > > I have successfully configured the ssh server on each machine.
> > > I have also configured the ssh clients so that these commands work:
> > > ssh server1 (without password)
> > > ssh server2 (without password)
> > >
> > > However, when I tried to bring up a ring of these two machines with
> > > the command
> > > mpdboot -n 2 -f .mpd.hosts
> > >
> > > I get the following message:
> > > mpdboot_server1 (handle_mpd_output 383): failed to connect to mpd on server2
> > >
> > > Can somebody tell me what I am doing wrong?
> > >
> > > The file .mpd.hosts contains the following two lines:
> > > server1
> > > server2
> > >
> > > I have already read these:
> > > http://www-unix.mcs.anl.gov/web-mail-archive/lists/mpich-discuss/2006/08/msg00009.html
> > > http://www-unix.mcs.anl.gov/web-mail-archive/lists/mpich-discuss/2006/04/msg00037.html
> > >
> > >
> > > Thanks for all
> > >
> > > --
> > > Jorge Andres Gonzalez
> > > jag2kn (at) gmail.com
> > > jagonalezce (at) unal.edu.co
> > > Universidad Nacional de Colombia
> > > Cel: 301 217 78 60
> > > Linux Counter 345082
> > > Bogotá - Colombia - Sur América
> > >
> >
> >
> >
> > --
> > JAG
> > jag2kn (at) gmail.com
> > Cel: 301 217 78 60
> > Linux Counter 345082
> > Bogotá - Colombia - Sur América
>
>
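
In the quoted mpdcheck run, a server is started on server2 and a client on server4 is given the name "server2" and a port (25734) to connect back to, and it is that connect-back step that times out. The following Python sketch imitates this kind of test under stated assumptions: it is not mpdcheck's actual code, and start_server, serve_once and client_check are hypothetical names chosen for illustration.

```python
import socket
import threading


def start_server(bind_addr=""):
    """Listen on an OS-chosen port; return (listening socket, port)."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind((bind_addr, 0))  # port 0 lets the OS pick a free port
    srv.listen(1)
    return srv, srv.getsockname()[1]


def serve_once(srv, timeout=10.0):
    """Accept a single client, send a short greeting, then close it."""
    srv.settimeout(timeout)
    conn, _peer = srv.accept()
    with conn:
        conn.sendall(b"hello\n")


def client_check(host, port, timeout=10.0):
    """Resolve `host` on the machine running this function and connect
    to host:port; return whatever the server sends first."""
    with socket.create_connection((host, port), timeout=timeout) as c:
        return c.recv(64)
```

On the first machine one would call start_server() and serve_once(), then run client_check("server2", port) on the other machine. The client resolves the name on its own side, so the test only succeeds if that name maps to an address the client can actually reach. If the client instead ends up with a loopback address such as 127.0.0.2, as suggested earlier in this thread, the connection would be expected to fail.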
--
JAG
jag2kn (at) gmail.com
Cel: 301 217 78 60
Linux Counter 345082
Bogotá - Colombia - Sur América