[MPICH] ssh failed to connect

Jorge Gonzalez jag2kn at gmail.com
Wed Jul 25 16:56:01 CDT 2007


On 7/25/07, Reuti <reuti at staff.uni-marburg.de> wrote:
>
> Hi,
>
> What about removing the 127.0.0.2 entry from /etc/hosts and giving
> server2 a sensible address there? Maybe the mpd started on server4
> tries to connect back to the sender, i.e. 127.0.0.2, which will fail.



I removed this line, but I can't see any change :S
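
A note on checking the edit: the /etc/hosts dump quoted further down lists "127.0.0.2 server2" twice (once near the top and once after the IPv6 block), so removing a single line may leave the name still resolving to loopback. The following is a minimal Python sketch, not part of mpdcheck, that scans hosts-file text for loopback entries of a given name. The function name loopback_entries and the sample address 192.0.2.25 are illustrative assumptions (the real address is masked in the thread).

```python
import ipaddress

def loopback_entries(hosts_text, hostname):
    """Return (line_number, address) pairs where hostname maps to a loopback address."""
    hits = []
    for lineno, raw in enumerate(hosts_text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding blanks
        if not line:
            continue
        fields = line.split()
        addr, names = fields[0], fields[1:]
        if hostname not in names:
            continue
        try:
            if ipaddress.ip_address(addr).is_loopback:
                hits.append((lineno, addr))
        except ValueError:
            # Masked or malformed addresses (e.g. XXX.XXX.123.25) are skipped.
            continue
    return hits

# Sample modelled on the quoted /etc/hosts; 192.0.2.25 is a placeholder address.
SAMPLE = """\
127.0.0.2       server2
127.0.0.1       localhost
192.0.2.25      server4
127.0.0.2        server2 server2
"""

print(loopback_entries(SAMPLE, "server2"))
```

On the real machine the text could be read from /etc/hosts, and the result cross-checked with socket.gethostbyname_ex(socket.gethostname()), which is the call whose result "mpdcheck -pc" prints in the quoted output.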

> -- Reuti
>
> On 25.07.2007 at 02:12, Jorge Gonzalez wrote:
>
> >
> >
> > On 7/24/07, Anthony Chan <chan at mcs.anl.gov> wrote:
> >
> > Did you try using "mpdcheck" to check whether the other network settings
> > are OK, as described in the MPICH2 user's guide?
> >
> > Hi, thanks for the answer.
> > Sorry for the long mail :P
> >
> > I ran the check on the "server2" machine. The .mpd.hosts file contains:
> >   server4
> >   server2
> >
> > this is the output:
> >
> > administrador at server2:~> mpdcheck
> > *** first ipaddr for this host (via server2) is: 127.0.0.2
> >
> > administrador at server2:~> mpdcheck -pc
> > --- print results of: gethostbyname_ex(gethostname())
> > ('server2', [], ['127.0.0.2'])
> > --- try to run /bin/hostname
> > server2
> > --- try to run uname -a
> > Linux server2 2.6.18.2-34-default #1 SMP Mon Nov 27 11:46:27 UTC 2006 x86_64 x86_64 x86_64 GNU/Linux
> > --- try to print /etc/hosts
> > #
> > # hosts         This file describes a number of hostname-to-address
> > #               mappings for the TCP/IP subsystem.  It is mostly
> > #               used at boot time, when no name servers are running.
> > #               On small systems, this file can be used instead of a
> > #               "named" name server.
> > # Syntax:
> > #
> > # IP-Address  Full-Qualified-Hostname  Short-Hostname
> > #
> >
> > 127.0.0.2       server2
> > 127.0.0.1       localhost
> > XXX.XXX.123.25  server4
> > XXX.XXX.122.50  server1
> >
> > # special IPv6 addresses
> > ::1             localhost ipv6-localhost ipv6-loopback
> >
> > fe00::0         ipv6-localnet
> >
> > ff00::0         ipv6-mcastprefix
> > ff02::1         ipv6-allnodes
> > ff02::2         ipv6-allrouters
> > ff02::3         ipv6-allhosts
> > 127.0.0.2        server2 server2
> >
> > --- try to print /etc/resolv.conf
> > ### BEGIN INFO
> > #
> > # Modified_by:  dhcpcd
> > # Backup:       /etc/resolv.conf.saved.by.dhcpcd.eth0
> > # Process:      dhcpcd
> > # Process_id:   3847
> > # Script:       /sbin/modify_resolvconf
> > # Saveto:
> > # Info:         This is a temporary resolv.conf created by service dhcpcd.
> > #               The previous file has been saved and will be restored later.
> > #
> > #               If you don't like your resolv.conf to be changed, you
> > #               can set MODIFY_{RESOLV,NAMED}_CONF_DYNAMICALLY=no. This
> > #               variables are placed in /etc/sysconfig/network/config.
> > #
> > #               You can also configure service dhcpcd not to modify it.
> > #
> > #               If you don't like dhcpcd to change your nameserver
> > #               settings
> > #               then either set DHCLIENT_MODIFY_RESOLV_CONF=no
> > #               in /etc/sysconfig/network/dhcp, or
> > #               set MODIFY_RESOLV_CONF_DYNAMICALLY=no in
> > #               /etc/sysconfig/network/config or (manually) use dhcpcd
> > #               with -R.  If you only want to keep your searchlist, set
> > #               DHCLIENT_KEEP_SEARCHLIST=yes in /etc/sysconfig/network/dhcp or
> > #               (manually) use the -K option.
> > #
> > ### END INFO
> > search XXX.XXX.160.17 XXX.XXX.18 XXX.XXX.160.22 XXX.XXX.160.23
> > nameserver XXX.XXX.160.17
> > nameserver XXX.XXX.160.18
> > nameserver XXX.XXX.160.22
> > nameserver XXX.XXX.160.23
> > --- try to run /sbin/ifconfig -a
> > eth0      Link encap:Ethernet  HWaddr 00:18:8B:1E:1F:D6
> >           inet addr:XXX.XXX.123.136  Bcast:XXX.XXX.123.255  Mask:255.255.254.0
> >           inet6 addr: fe80::218:8bff:fe1e:1fd6/64 Scope:Link
> >           UP BROADCAST NOTRAILERS RUNNING MULTICAST  MTU:1500  Metric:1
> >           RX packets:18300 errors:3 dropped:0 overruns:0 frame:4
> >           TX packets:890 errors:0 dropped:0 overruns:0 carrier:0
> >           collisions:0 txqueuelen:1000
> >           RX bytes:2217723 (2.1 Mb)  TX bytes:134240 (131.0 Kb)
> >           Interrupt:169
> >
> > lo        Link encap:Local Loopback
> >           inet addr:127.0.0.1  Mask: 255.0.0.0
> >           inet6 addr: ::1/128 Scope:Host
> >           UP LOOPBACK RUNNING  MTU:16436  Metric:1
> >           RX packets:166 errors:0 dropped:0 overruns:0 frame:0
> >           TX packets:166 errors:0 dropped:0 overruns:0 carrier:0
> >           collisions:0 txqueuelen:0
> >           RX bytes:20311 (19.8 Kb)  TX bytes:20311 (19.8 Kb)
> >
> > sit0      Link encap:IPv6-in-IPv4
> >           NOARP  MTU:1480  Metric:1
> >           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
> >           TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
> >           collisions:0 txqueuelen:0
> >           RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
> >
> > --- try to print /etc/nsswitch.conf
> > #
> > # /etc/nsswitch.conf
> > #
> > # An example Name Service Switch config file. This file should be
> > # sorted with the most-used services at the beginning.
> > #
> > # The entry '[NOTFOUND=return]' means that the search for an
> > # entry should stop if the search in the previous entry turned
> > # up nothing. Note that if the search failed due to some other reason
> > # (like no NIS server responding) then the search continues with the
> > # next entry.
> > #
> > # Legal entries are:
> > #
> > #       compat                  Use compatibility setup
> > #       nisplus                 Use NIS+ (NIS version 3)
> > #       nis                     Use NIS (NIS version 2), also called YP
> > #       dns                     Use DNS (Domain Name Service)
> > #       files                   Use the local files
> > #       [NOTFOUND=return]       Stop searching if not found so far
> > #
> > # For more information, please read the nsswitch.conf.5 manual page.
> > #
> >
> > # passwd: files nis
> > # shadow: files nis
> > # group:  files nis
> >
> > passwd: compat
> > group:  compat
> >
> > hosts:          files dns
> > networks:       files dns
> >
> > services:       files
> > protocols:      files
> > rpc:            files
> > ethers:         files
> > netmasks:       files
> > netgroup:       files nis
> > publickey:      files
> >
> > bootparams:     files
> > automount:      files nis
> > aliases:        files
> >
> >
> > administrador at server2:~> mpdcheck -f .mpd.hosts -ssh -v
> > obtaining hostname via gethostname and getfqdn
> > gethostname gives  server2
> > getfqdn gives  server2
> > checking out unqualified hostname; make sure is not "localhost", etc.
> > checking out qualified hostname; make sure is not "localhost", etc.
> > obtain IP addrs via qualified and unqualified hostnames;  make sure other than 127.0.0.1
> > gethostbyname_ex:  ('server2', [], ['127.0.0.2'])
> > *** first ipaddr for this host (via server2) is: 127.0.0.2
> > gethostbyname_ex:  ('server2', [], ['127.0.0.2'])
> > checking that IP addrs resolve to same host
> > now do some gethostbyaddr and gethostbyname_ex for machines in hosts file
> > checking gethostbyXXX for unqualified server4
> > gethostbyname_ex:  ('server4', [], ['XXX.XXX.123.25'])
> > checking gethostbyXXX for qualified server4
> > gethostbyname_ex:  ('server4', [], ['XXX.XXX.123.25'])
> > checking gethostbyXXX for unqualified server2
> > gethostbyname_ex:  ('server2', [], ['127.0.0.2'])
> > checking gethostbyXXX for qualified server2
> > gethostbyname_ex:  ('server2', [], ['127.0.0.2'])
> > trying: ssh server4 -x -n /bin/echo hello
> > trying: ssh server2 -x -n /bin/echo hello
> > starting server: /usr/local/bin/mpdcheck.py -s
> > starting client: ssh server4 -x -n /usr/local/bin/mpdcheck.py -c server2 25734
> > ** timed out waiting for client on server4 to produce output
> > client on server4 failed to access the server
> >
> > After that I tried
> >    administrador at server2:~> ssh server4 -x -n /bin/echo helloJorge
> >
> > and the output was
> >    helloJorge
> >
> >
> > administrador at server2:~> mpdboot -f .mpd.hosts -n 2
> > mpdboot_server2 (handle_mpd_output 383): failed to connect to mpd on server4
> > administrador at server2:~> mpdboot -f .mpd.hosts -n 2 -v
> > running mpdallexit on server2
> > LAUNCHED mpd on server2  via
> > RUNNING: mpd on server2
> > LAUNCHED mpd on server4  via  server2
> > mpdboot_server2 (handle_mpd_output 383): failed to connect to mpd on server4
> >
> >
> > I don't know why the access failed :S
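
In the mpdcheck run above, ssh from server2 to server4 works, but the client started on server4 times out connecting back to "server2 25734". That points at the reverse direction (server4 to server2): how server4 resolves the name server2, or something blocking the port, could both produce this, though the thread does not show which. A minimal Python sketch of a plain TCP reachability test follows. It is not mpdcheck's own code, and the names can_connect and local_listener_check are hypothetical.

```python
import socket

def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False

def local_listener_check():
    """Sanity-check can_connect against a throwaway listener on loopback."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
        srv.listen(1)
        port = srv.getsockname()[1]
        return can_connect("127.0.0.1", port, timeout=2.0)

print(local_listener_check())
```

To mirror the failing step, one would run can_connect("server2", 25734) on server4 while "mpdcheck.py -s" is listening on server2; the port number is taken from the quoted run and will likely differ on each run.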
> >
> > thanks
> >
> > On Mon, 23 Jul 2007, Jorge Gonzalez wrote:
> >
> > > Hi all
> > >
> > > I'm configuring a cluster of two PCs using Suse 10.2 x64,
> > > Mpich2-1.0.5p4 and OpenSSH_4.4p1.
> > >
> > > I have successfully configured the ssh server on each machine.
> > > I have also configured the ssh clients, so these commands work:
> > > ssh server1  (without password)
> > > ssh server2  (without password)
> > >
> > > However, when I try to bring up a ring of these two machines with the command
> > > mpdboot -n 2 -f .mpd.hosts
> > >
> > > I get the following message:
> > > mpdboot_server1 (handle_mpd_output 383): failed to connect to mpd on server2
> > >
> > > Can somebody tell me what I am doing wrong?
> > >
> > > The file .mpd.hosts contains the following two lines:
> > > server1
> > > server2
> > >
> > > I have already read these:
> > > http://www-unix.mcs.anl.gov/web-mail-archive/lists/mpich-discuss/2006/08/msg00009.html
> > > http://www-unix.mcs.anl.gov/web-mail-archive/lists/mpich-discuss/2006/04/msg00037.html
> > >
> > >
> > > Thanks for all
> > >
> > > --
> > > Jorge Andres Gonzalez
> > > jag2kn (at) gmail.com
> > > jagonalezce (at) unal.edu.co
> > > Universidad Nacional de Colombia
> > > Cel: 301 217 78 60
> > > Linux Counter 345082
> > > Bogotá     -    Colombia    -     Sur América
> > >
> >
> >
> >
> > --
> > JAG
> > jag2kn (at) gmail.com
> > Cel: 301 217 78 60
> > Linux Counter 345082
> > Bogotá     -    Colombia    -     Sur América
>
>


-- 
JAG
jag2kn (at) gmail.com
Cel: 301 217 78 60
Linux Counter 345082
Bogotá     -    Colombia    -     Sur América