[MPICH] ssh failed to connect

Jorge Gonzalez jag2kn at gmail.com
Tue Jul 24 19:12:45 CDT 2007


On 7/24/07, Anthony Chan <chan at mcs.anl.gov> wrote:

> Did you try using "mpdcheck" to check if other network settings are OK as
> described in MPICH2 user's guide?


Hi, thanks for the answer.
Sorry for the long mail :P

I ran the check on the "server2" machine. The .mpd.hosts file contains:
  server4
  server2

This is the output:

administrador@server2:~> mpdcheck
*** first ipaddr for this host (via server2) is: 127.0.0.2
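
The address reported here can be looked up again outside mpdcheck with a few lines of Python. This is only a minimal sketch, not mpdcheck's own code; the function names `loopback_addrs` and `check_local_hostname` are made up for illustration. It resolves the machine's own hostname and flags any result inside the loopback range, since mpdcheck's verbose output asks for an address other than 127.0.0.1.

```python
import ipaddress
import socket


def loopback_addrs(addrs):
    """Return the addresses in addrs that are loopback (127.0.0.0/8 or ::1)."""
    return [a for a in addrs if ipaddress.ip_address(a).is_loopback]


def check_local_hostname():
    """Resolve this machine's own hostname and list any loopback results."""
    name, _aliases, addrs = socket.gethostbyname_ex(socket.gethostname())
    return name, addrs, loopback_addrs(addrs)


if __name__ == "__main__":
    try:
        name, addrs, bad = check_local_hostname()
    except OSError as exc:  # resolver errors are subclasses of OSError
        print("could not resolve own hostname:", exc)
    else:
        print(name, addrs)
        if bad:
            print("warning: hostname resolves to loopback:", bad)
```

Applied to the result shown above, `loopback_addrs(['127.0.0.2'])` returns `['127.0.0.2']`, because 127.0.0.2 lies inside 127.0.0.0/8.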

administrador@server2:~> mpdcheck -pc
--- print results of: gethostbyname_ex(gethostname())
('server2', [], ['127.0.0.2'])
--- try to run /bin/hostname
server2
--- try to run uname -a
Linux server2 2.6.18.2-34-default #1 SMP Mon Nov 27 11:46:27 UTC 2006 x86_64 x86_64 x86_64 GNU/Linux
--- try to print /etc/hosts
#
# hosts         This file describes a number of hostname-to-address
#               mappings for the TCP/IP subsystem.  It is mostly
#               used at boot time, when no name servers are running.
#               On small systems, this file can be used instead of a
#               "named" name server.
# Syntax:
#
# IP-Address  Full-Qualified-Hostname  Short-Hostname
#

127.0.0.2       server2
127.0.0.1       localhost
XXX.XXX.123.25  server4
XXX.XXX.122.50  server1

# special IPv6 addresses
::1             localhost ipv6-localhost ipv6-loopback

fe00::0         ipv6-localnet

ff00::0         ipv6-mcastprefix
ff02::1         ipv6-allnodes
ff02::2         ipv6-allrouters
ff02::3         ipv6-allhosts
127.0.0.2       server2 server2

--- try to print /etc/resolv.conf
### BEGIN INFO
#
# Modified_by:  dhcpcd
# Backup:       /etc/resolv.conf.saved.by.dhcpcd.eth0
# Process:      dhcpcd
# Process_id:   3847
# Script:       /sbin/modify_resolvconf
# Saveto:
# Info:         This is a temporary resolv.conf created by service dhcpcd.
#               The previous file has been saved and will be restored later.
#
#               If you don't like your resolv.conf to be changed, you
#               can set MODIFY_{RESOLV,NAMED}_CONF_DYNAMICALLY=no. This
#               variables are placed in /etc/sysconfig/network/config.
#
#               You can also configure service dhcpcd not to modify it.
#
#               If you don't like dhcpcd to change your nameserver
#               settings
#               then either set DHCLIENT_MODIFY_RESOLV_CONF=no
#               in /etc/sysconfig/network/dhcp, or
#               set MODIFY_RESOLV_CONF_DYNAMICALLY=no in
#               /etc/sysconfig/network/config or (manually) use dhcpcd
#               with -R.  If you only want to keep your searchlist, set
#               DHCLIENT_KEEP_SEARCHLIST=yes in /etc/sysconfig/network/dhcp or
#               (manually) use the -K option.
#
### END INFO
search XXX.XXX.160.17 XXX.XXX.18 XXX.XXX.160.22 XXX.XXX.160.23
nameserver XXX.XXX.160.17
nameserver XXX.XXX.160.18
nameserver XXX.XXX.160.22
nameserver XXX.XXX.160.23
--- try to run /sbin/ifconfig -a
eth0      Link encap:Ethernet  HWaddr 00:18:8B:1E:1F:D6
          inet addr:XXX.XXX.123.136  Bcast:XXX.XXX.123.255  Mask:255.255.254.0
          inet6 addr: fe80::218:8bff:fe1e:1fd6/64 Scope:Link
          UP BROADCAST NOTRAILERS RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:18300 errors:3 dropped:0 overruns:0 frame:4
          TX packets:890 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:2217723 (2.1 Mb)  TX bytes:134240 (131.0 Kb)
          Interrupt:169

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:166 errors:0 dropped:0 overruns:0 frame:0
          TX packets:166 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:20311 (19.8 Kb)  TX bytes:20311 (19.8 Kb)

sit0      Link encap:IPv6-in-IPv4
          NOARP  MTU:1480  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)

--- try to print /etc/nsswitch.conf
#
# /etc/nsswitch.conf
#
# An example Name Service Switch config file. This file should be
# sorted with the most-used services at the beginning.
#
# The entry '[NOTFOUND=return]' means that the search for an
# entry should stop if the search in the previous entry turned
# up nothing. Note that if the search failed due to some other reason
# (like no NIS server responding) then the search continues with the
# next entry.
#
# Legal entries are:
#
#       compat                  Use compatibility setup
#       nisplus                 Use NIS+ (NIS version 3)
#       nis                     Use NIS (NIS version 2), also called YP
#       dns                     Use DNS (Domain Name Service)
#       files                   Use the local files
#       [NOTFOUND=return]       Stop searching if not found so far
#
# For more information, please read the nsswitch.conf.5 manual page.
#

# passwd: files nis
# shadow: files nis
# group:  files nis

passwd: compat
group:  compat

hosts:          files dns
networks:       files dns

services:       files
protocols:      files
rpc:            files
ethers:         files
netmasks:       files
netgroup:       files nis
publickey:      files

bootparams:     files
automount:      files nis
aliases:        files
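
Since nsswitch.conf lists "files" before "dns" for hosts, entries in /etc/hosts are consulted first. The sketch below scans hosts-file text for names other than localhost that are bound to a loopback address. It is an illustration only: `HOSTS_TEXT` is an abridged copy of the file printed above, and `names_on_loopback` is a hypothetical helper name.

```python
import ipaddress

# Abridged copy of the /etc/hosts printed above. The redacted entries are
# kept verbatim; the parser skips them because they are not valid addresses.
HOSTS_TEXT = """\
127.0.0.2       server2
127.0.0.1       localhost
XXX.XXX.123.25  server4
XXX.XXX.122.50  server1
::1             localhost ipv6-localhost ipv6-loopback
127.0.0.2       server2 server2
"""


def names_on_loopback(hosts_text):
    """Map each non-localhost name to the loopback address it is bound to."""
    found = {}
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        addr, *names = line.split()
        try:
            ip = ipaddress.ip_address(addr)
        except ValueError:
            continue  # redacted or malformed address
        if not ip.is_loopback:
            continue
        for name in names:
            if "localhost" not in name and "loopback" not in name:
                found[name] = addr
    return found


print(names_on_loopback(HOSTS_TEXT))
```

For the file above this prints `{'server2': '127.0.0.2'}`, which matches what gethostbyname_ex returned for server2.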


administrador@server2:~> mpdcheck -f .mpd.hosts -ssh -v
obtaining hostname via gethostname and getfqdn
gethostname gives  server2
getfqdn gives  server2
checking out unqualified hostname; make sure is not "localhost", etc.
checking out qualified hostname; make sure is not "localhost", etc.
obtain IP addrs via qualified and unqualified hostnames;  make sure other than 127.0.0.1
gethostbyname_ex:  ('server2', [], ['127.0.0.2'])
*** first ipaddr for this host (via server2) is: 127.0.0.2
gethostbyname_ex:  ('server2', [], ['127.0.0.2'])
checking that IP addrs resolve to same host
now do some gethostbyaddr and gethostbyname_ex for machines in hosts file
checking gethostbyXXX for unqualified server4
gethostbyname_ex:  ('server4', [], ['XXX.XXX.123.25'])
checking gethostbyXXX for qualified server4
gethostbyname_ex:  ('server4', [], ['XXX.XXX.123.25'])
checking gethostbyXXX for unqualified server2
gethostbyname_ex:  ('server2', [], ['127.0.0.2'])
checking gethostbyXXX for qualified server2
gethostbyname_ex:  ('server2', [], ['127.0.0.2'])
trying: ssh server4 -x -n /bin/echo hello
trying: ssh server2 -x -n /bin/echo hello
starting server: /usr/local/bin/mpdcheck.py -s
starting client: ssh server4 -x -n /usr/local/bin/mpdcheck.py -c server2 25734
** timed out waiting for client on server4 to produce output
client on server4 failed to access the server
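
The last step of this run appears to start a listener on server2 and then ask server4 to connect back to it using the hostname and port it was given (server2 25734). A minimal sketch of that kind of connect-back test follows. It is not the code of mpdcheck.py, and `start_server`, `serve_once` and `try_client` are illustrative names. The client can only succeed if the name it is given resolves, on the client's side, to an address where the server is actually reachable.

```python
import socket
import threading


def start_server(bind_addr="127.0.0.1"):
    """Listen on an ephemeral port and return (socket, port)."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind((bind_addr, 0))  # port 0: let the OS pick a free port
    srv.listen(1)
    return srv, srv.getsockname()[1]


def serve_once(srv, timeout=5.0):
    """Accept one client and send a greeting; return False on timeout."""
    srv.settimeout(timeout)
    try:
        conn, _peer = srv.accept()
    except socket.timeout:
        return False
    with conn:
        conn.sendall(b"hello\n")
    return True


def try_client(host, port, timeout=5.0):
    """Connect to host:port and return the greeting, or None on failure."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as c:
            return c.recv(64)
    except OSError:  # refused, unreachable, timed out, or unresolvable
        return None


if __name__ == "__main__":
    # Local demonstration only: server and client run on the same machine.
    srv, port = start_server()
    worker = threading.Thread(target=serve_once, args=(srv,))
    worker.start()
    print(try_client("127.0.0.1", port))
    worker.join()
    srv.close()
```

On two machines, the server half would run on one host and `try_client` on the other, with the host argument set to the name under test.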

After that I tried:
   administrador@server2:~> ssh server4 -x -n /bin/echo helloJorge

and the output was:
   helloJorge


administrador@server2:~> mpdboot -f .mpd.hosts -n 2
mpdboot_server2 (handle_mpd_output 383): failed to connect to mpd on server4
administrador@server2:~> mpdboot -f .mpd.hosts -n 2 -v
running mpdallexit on server2
LAUNCHED mpd on server2  via
RUNNING: mpd on server2
LAUNCHED mpd on server4  via  server2
mpdboot_server2 (handle_mpd_output 383): failed to connect to mpd on server4


I don't know why the access failed :S

thanks

On Mon, 23 Jul 2007, Jorge Gonzalez wrote:
>
> > Hi all
> >
> > I'm configuring a cluster of two PCs using SUSE 10.2 x64, MPICH2-1.0.5p4,
> > and OpenSSH_4.4p1.
> >
> > I had successfully configured the ssh server on each machine.
> > I had also configured the ssh clients with the commands
> > ssh server1  (without password)
> > ssh server2  (without password)
> >
> > However, when I tried to bring up a ring of these two machines with the
> > command
> > mpdboot -n 2 -f .mpd.hosts
> >
> > the following message is printed:
> > mpdboot_server1 (handle_mpd_output 383): failed to connect to mpd on server2
> >
> > Can somebody tell me what I am doing wrong?
> >
> > The file .mpd.hosts contains the following two lines:
> > server1
> > server2
> >
> > I have read these:
> >
> > http://www-unix.mcs.anl.gov/web-mail-archive/lists/mpich-discuss/2006/08/msg00009.html
> >
> > http://www-unix.mcs.anl.gov/web-mail-archive/lists/mpich-discuss/2006/04/msg00037.html
> >
> >
> > Thanks for all
> >
> > --
> > Jorge Andres Gonzalez
> > jag2kn (at) gmail.com
> > jagonalezce (at) unal.edu.co
> > Universidad Nacional de Colombia
> > Cel: 301 217 78 60
> > Linux Counter 345082
> > Bogotá     -    Colombia    -     Sur América
> >
>



-- 
JAG
jag2kn (at) gmail.com
Cel: 301 217 78 60
Linux Counter 345082
Bogotá     -    Colombia    -     Sur América