[mpich-discuss] mpdboot failure

Cesar Covarrubias cesar at uci.edu
Tue Nov 18 13:14:58 CST 2008


Hello,

We are still having trouble getting mpd running on a large number of our
nodes. The error we get is:

mpdboot_bduc-sched.nacs.uci.edu (handle_mpd_output 392): failed to
handshake with mpd on bduc-i32-6; recvd output={}

The node it fails on is random; it is not always the same one. All nodes
have the /etc/mpd.conf file chmod'ed to 600, and mpd is not running on
any of the nodes. mpdcheck returns no errors (as you can see from the
output below). All the nodes are on a private subnet.
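
For anyone wanting to re-verify the permission claim on each node, here is
a minimal Python sketch. The helper name is_owner_only is hypothetical, and
it tests for exactly mode 600 as described in this message; mpd's own check
may be looser or stricter than this.

```python
import os
import stat


def is_owner_only(path):
    """Return True if the file's permission bits are exactly 0600."""
    # S_IMODE strips the file-type bits and keeps only the permission bits.
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode == 0o600
```

Running is_owner_only("/etc/mpd.conf") on a node should return True if the
file is set up as described here.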

Any ideas as to why this is failing?

Very Respectfully,
Cesar Covarrubias

-bash-3.2# mpdcheck -f mpd.hosts -ssh -v
obtaining hostname via gethostname and getfqdn
gethostname gives  bduc-sched.nacs.uci.edu
getfqdn gives  bduc-sched.nacs.uci.edu
checking out unqualified hostname; make sure is not "localhost", etc.
checking out qualified hostname; make sure is not "localhost", etc.
obtain IP addrs via qualified and unqualified hostnames;  make sure
other than 127.0.0.1
gethostbyname_ex:  ('bduc-sched.nacs.uci.edu', ['bduc-sched'],
['128.200.15.19'])
gethostbyname_ex:  ('bduc-sched.nacs.uci.edu', ['bduc-sched'],
['128.200.15.19'])
checking that IP addrs resolve to same host
now do some gethostbyaddr and gethostbyname_ex for machines in hosts file
checking gethostbyXXX for unqualified bduc-i32-1
gethostbyname_ex:  ('bduc-i32-1', [], ['192.168.0.10'])
checking gethostbyXXX for qualified bduc-i32-1
gethostbyname_ex:  ('bduc-i32-1', [], ['192.168.0.10'])
checking gethostbyXXX for unqualified bduc-i32-3
gethostbyname_ex:  ('bduc-i32-3', [], ['192.168.0.12'])
checking gethostbyXXX for qualified bduc-i32-3
gethostbyname_ex:  ('bduc-i32-3', [], ['192.168.0.12'])
checking gethostbyXXX for unqualified bduc-i32-4
gethostbyname_ex:  ('bduc-i32-4', [], ['192.168.0.13'])
checking gethostbyXXX for qualified bduc-i32-4
gethostbyname_ex:  ('bduc-i32-4', [], ['192.168.0.13'])
checking gethostbyXXX for unqualified bduc-i32-5
gethostbyname_ex:  ('bduc-i32-5', [], ['192.168.0.14'])
checking gethostbyXXX for qualified bduc-i32-5
gethostbyname_ex:  ('bduc-i32-5', [], ['192.168.0.14'])
checking gethostbyXXX for unqualified bduc-i32-6
gethostbyname_ex:  ('bduc-i32-6', [], ['192.168.0.15'])
checking gethostbyXXX for qualified bduc-i32-6
gethostbyname_ex:  ('bduc-i32-6', [], ['192.168.0.15'])
checking gethostbyXXX for unqualified bduc-i32-7
gethostbyname_ex:  ('bduc-i32-7', [], ['192.168.0.16'])
checking gethostbyXXX for qualified bduc-i32-7
gethostbyname_ex:  ('bduc-i32-7', [], ['192.168.0.16'])
checking gethostbyXXX for unqualified bduc-i32-8
gethostbyname_ex:  ('bduc-i32-8', [], ['192.168.0.17'])
checking gethostbyXXX for qualified bduc-i32-8
gethostbyname_ex:  ('bduc-i32-8', [], ['192.168.0.17'])
checking gethostbyXXX for unqualified bduc-i32-9
gethostbyname_ex:  ('bduc-i32-9', [], ['192.168.0.18'])
checking gethostbyXXX for qualified bduc-i32-9
gethostbyname_ex:  ('bduc-i32-9', [], ['192.168.0.18'])
checking gethostbyXXX for unqualified bduc-i32-10
gethostbyname_ex:  ('bduc-i32-10', [], ['192.168.0.19'])
checking gethostbyXXX for qualified bduc-i32-10
gethostbyname_ex:  ('bduc-i32-10', [], ['192.168.0.19'])
checking gethostbyXXX for unqualified bduc-i32-11
gethostbyname_ex:  ('bduc-i32-11', [], ['192.168.0.20'])
checking gethostbyXXX for qualified bduc-i32-11
gethostbyname_ex:  ('bduc-i32-11', [], ['192.168.0.20'])
checking gethostbyXXX for unqualified bduc-i32-12
gethostbyname_ex:  ('bduc-i32-12', [], ['192.168.0.21'])
checking gethostbyXXX for qualified bduc-i32-12
gethostbyname_ex:  ('bduc-i32-12', [], ['192.168.0.21'])
checking gethostbyXXX for unqualified bduc-i32-13
gethostbyname_ex:  ('bduc-i32-13', [], ['192.168.0.22'])
checking gethostbyXXX for qualified bduc-i32-13
gethostbyname_ex:  ('bduc-i32-13', [], ['192.168.0.22'])
checking gethostbyXXX for unqualified bduc-i32-14
gethostbyname_ex:  ('bduc-i32-14', [], ['192.168.0.23'])
checking gethostbyXXX for qualified bduc-i32-14
gethostbyname_ex:  ('bduc-i32-14', [], ['192.168.0.23'])
checking gethostbyXXX for unqualified bduc-i32-15
gethostbyname_ex:  ('bduc-i32-15', [], ['192.168.0.24'])
checking gethostbyXXX for qualified bduc-i32-15
gethostbyname_ex:  ('bduc-i32-15', [], ['192.168.0.24'])
checking gethostbyXXX for unqualified bduc-i32-16
gethostbyname_ex:  ('bduc-i32-16', [], ['192.168.0.25'])
checking gethostbyXXX for qualified bduc-i32-16
gethostbyname_ex:  ('bduc-i32-16', [], ['192.168.0.25'])
checking gethostbyXXX for unqualified bduc-i32-17
gethostbyname_ex:  ('bduc-i32-17', [], ['192.168.0.26'])
checking gethostbyXXX for qualified bduc-i32-17
gethostbyname_ex:  ('bduc-i32-17', [], ['192.168.0.26'])
checking gethostbyXXX for unqualified bduc-i32-18
gethostbyname_ex:  ('bduc-i32-18', [], ['192.168.0.27'])
checking gethostbyXXX for qualified bduc-i32-18
gethostbyname_ex:  ('bduc-i32-18', [], ['192.168.0.27'])
checking gethostbyXXX for unqualified bduc-i32-19
gethostbyname_ex:  ('bduc-i32-19', [], ['192.168.0.28'])
checking gethostbyXXX for qualified bduc-i32-19
gethostbyname_ex:  ('bduc-i32-19', [], ['192.168.0.28'])
checking gethostbyXXX for unqualified bduc-i32-20
gethostbyname_ex:  ('bduc-i32-20', [], ['192.168.0.29'])
checking gethostbyXXX for qualified bduc-i32-20
gethostbyname_ex:  ('bduc-i32-20', [], ['192.168.0.29'])
checking gethostbyXXX for unqualified bduc-i32-21
gethostbyname_ex:  ('bduc-i32-21', [], ['192.168.0.30'])
checking gethostbyXXX for qualified bduc-i32-21
gethostbyname_ex:  ('bduc-i32-21', [], ['192.168.0.30'])
checking gethostbyXXX for unqualified bduc-i32-22
gethostbyname_ex:  ('bduc-i32-22', [], ['192.168.0.31'])
checking gethostbyXXX for qualified bduc-i32-22
gethostbyname_ex:  ('bduc-i32-22', [], ['192.168.0.31'])
checking gethostbyXXX for unqualified bduc-i32-24
gethostbyname_ex:  ('bduc-i32-24', [], ['192.168.0.33'])
checking gethostbyXXX for qualified bduc-i32-24
gethostbyname_ex:  ('bduc-i32-24', [], ['192.168.0.33'])
checking gethostbyXXX for unqualified bduc-i32-25
gethostbyname_ex:  ('bduc-i32-25', [], ['192.168.0.34'])
checking gethostbyXXX for qualified bduc-i32-25
gethostbyname_ex:  ('bduc-i32-25', [], ['192.168.0.34'])
checking gethostbyXXX for unqualified bduc-i32-27
gethostbyname_ex:  ('bduc-i32-27', [], ['192.168.0.36'])
checking gethostbyXXX for qualified bduc-i32-27
gethostbyname_ex:  ('bduc-i32-27', [], ['192.168.0.36'])
checking gethostbyXXX for unqualified bduc-i32-28
gethostbyname_ex:  ('bduc-i32-28', [], ['192.168.0.37'])
checking gethostbyXXX for qualified bduc-i32-28
gethostbyname_ex:  ('bduc-i32-28', [], ['192.168.0.37'])
checking gethostbyXXX for unqualified bduc-i32-29
gethostbyname_ex:  ('bduc-i32-29', [], ['192.168.0.38'])
checking gethostbyXXX for qualified bduc-i32-29
gethostbyname_ex:  ('bduc-i32-29', [], ['192.168.0.38'])
checking gethostbyXXX for unqualified bduc-i32-30
gethostbyname_ex:  ('bduc-i32-30', [], ['192.168.0.39'])
checking gethostbyXXX for qualified bduc-i32-30
gethostbyname_ex:  ('bduc-i32-30', [], ['192.168.0.39'])
checking gethostbyXXX for unqualified bduc-i32-31
gethostbyname_ex:  ('bduc-i32-31', [], ['192.168.0.40'])
checking gethostbyXXX for qualified bduc-i32-31
gethostbyname_ex:  ('bduc-i32-31', [], ['192.168.0.40'])
checking gethostbyXXX for unqualified bduc-i32-32
gethostbyname_ex:  ('bduc-i32-32', [], ['192.168.0.41'])
checking gethostbyXXX for qualified bduc-i32-32
gethostbyname_ex:  ('bduc-i32-32', [], ['192.168.0.41'])
checking gethostbyXXX for unqualified bduc-i32-33
gethostbyname_ex:  ('bduc-i32-33', [], ['192.168.0.42'])
checking gethostbyXXX for qualified bduc-i32-33
gethostbyname_ex:  ('bduc-i32-33', [], ['192.168.0.42'])
checking gethostbyXXX for unqualified bduc-i32-34
gethostbyname_ex:  ('bduc-i32-34', [], ['192.168.0.43'])
checking gethostbyXXX for qualified bduc-i32-34
gethostbyname_ex:  ('bduc-i32-34', [], ['192.168.0.43'])
checking gethostbyXXX for unqualified bduc-i32-35
gethostbyname_ex:  ('bduc-i32-35', [], ['192.168.0.44'])
checking gethostbyXXX for qualified bduc-i32-35
gethostbyname_ex:  ('bduc-i32-35', [], ['192.168.0.44'])
checking gethostbyXXX for unqualified bduc-i32-36
gethostbyname_ex:  ('bduc-i32-36', [], ['192.168.0.45'])
checking gethostbyXXX for qualified bduc-i32-36
gethostbyname_ex:  ('bduc-i32-36', [], ['192.168.0.45'])
checking gethostbyXXX for unqualified bduc-i32-37
gethostbyname_ex:  ('bduc-i32-37', [], ['192.168.0.46'])
checking gethostbyXXX for qualified bduc-i32-37
gethostbyname_ex:  ('bduc-i32-37', [], ['192.168.0.46'])
checking gethostbyXXX for unqualified bduc-i32-38
gethostbyname_ex:  ('bduc-i32-38', [], ['192.168.0.47'])
checking gethostbyXXX for qualified bduc-i32-38
gethostbyname_ex:  ('bduc-i32-38', [], ['192.168.0.47'])
checking gethostbyXXX for unqualified bduc-i32-39
gethostbyname_ex:  ('bduc-i32-39', [], ['192.168.0.48'])
checking gethostbyXXX for qualified bduc-i32-39
gethostbyname_ex:  ('bduc-i32-39', [], ['192.168.0.48'])
checking gethostbyXXX for unqualified bduc-i32-40
gethostbyname_ex:  ('bduc-i32-40', [], ['192.168.0.49'])
checking gethostbyXXX for qualified bduc-i32-40
gethostbyname_ex:  ('bduc-i32-40', [], ['192.168.0.49'])
trying: ssh bduc-i32-1 -x -n /bin/echo hello
trying: ssh bduc-i32-3 -x -n /bin/echo hello
trying: ssh bduc-i32-4 -x -n /bin/echo hello
trying: ssh bduc-i32-5 -x -n /bin/echo hello
trying: ssh bduc-i32-6 -x -n /bin/echo hello
trying: ssh bduc-i32-7 -x -n /bin/echo hello
trying: ssh bduc-i32-8 -x -n /bin/echo hello
trying: ssh bduc-i32-9 -x -n /bin/echo hello
trying: ssh bduc-i32-10 -x -n /bin/echo hello
trying: ssh bduc-i32-11 -x -n /bin/echo hello
trying: ssh bduc-i32-12 -x -n /bin/echo hello
trying: ssh bduc-i32-13 -x -n /bin/echo hello
trying: ssh bduc-i32-14 -x -n /bin/echo hello
trying: ssh bduc-i32-15 -x -n /bin/echo hello
trying: ssh bduc-i32-16 -x -n /bin/echo hello
trying: ssh bduc-i32-17 -x -n /bin/echo hello
trying: ssh bduc-i32-18 -x -n /bin/echo hello
trying: ssh bduc-i32-19 -x -n /bin/echo hello
trying: ssh bduc-i32-20 -x -n /bin/echo hello
trying: ssh bduc-i32-21 -x -n /bin/echo hello
trying: ssh bduc-i32-22 -x -n /bin/echo hello
trying: ssh bduc-i32-24 -x -n /bin/echo hello
trying: ssh bduc-i32-25 -x -n /bin/echo hello
trying: ssh bduc-i32-27 -x -n /bin/echo hello
trying: ssh bduc-i32-28 -x -n /bin/echo hello
trying: ssh bduc-i32-29 -x -n /bin/echo hello
trying: ssh bduc-i32-30 -x -n /bin/echo hello
trying: ssh bduc-i32-31 -x -n /bin/echo hello
trying: ssh bduc-i32-32 -x -n /bin/echo hello
trying: ssh bduc-i32-33 -x -n /bin/echo hello
trying: ssh bduc-i32-34 -x -n /bin/echo hello
trying: ssh bduc-i32-35 -x -n /bin/echo hello
trying: ssh bduc-i32-36 -x -n /bin/echo hello
trying: ssh bduc-i32-37 -x -n /bin/echo hello
trying: ssh bduc-i32-38 -x -n /bin/echo hello
trying: ssh bduc-i32-39 -x -n /bin/echo hello
trying: ssh bduc-i32-40 -x -n /bin/echo hello
starting server: /sge62/mpich2/bin/mpdcheck.py -s
starting client: ssh bduc-i32-1 -x -n /sge62/mpich2/bin/mpdcheck.py -c
bduc-sched.nacs.uci.edu 43076
starting server: /sge62/mpich2/bin/mpdcheck.py -s
starting client: ssh bduc-i32-3 -x -n /sge62/mpich2/bin/mpdcheck.py -c
bduc-sched.nacs.uci.edu 38649
starting server: /sge62/mpich2/bin/mpdcheck.py -s
starting client: ssh bduc-i32-4 -x -n /sge62/mpich2/bin/mpdcheck.py -c
bduc-sched.nacs.uci.edu 40831
starting server: /sge62/mpich2/bin/mpdcheck.py -s
starting client: ssh bduc-i32-5 -x -n /sge62/mpich2/bin/mpdcheck.py -c
bduc-sched.nacs.uci.edu 57153
starting server: /sge62/mpich2/bin/mpdcheck.py -s
starting client: ssh bduc-i32-6 -x -n /sge62/mpich2/bin/mpdcheck.py -c
bduc-sched.nacs.uci.edu 60407
starting server: /sge62/mpich2/bin/mpdcheck.py -s
starting client: ssh bduc-i32-7 -x -n /sge62/mpich2/bin/mpdcheck.py -c
bduc-sched.nacs.uci.edu 52646
starting server: /sge62/mpich2/bin/mpdcheck.py -s
starting client: ssh bduc-i32-8 -x -n /sge62/mpich2/bin/mpdcheck.py -c
bduc-sched.nacs.uci.edu 47523
starting server: /sge62/mpich2/bin/mpdcheck.py -s
starting client: ssh bduc-i32-9 -x -n /sge62/mpich2/bin/mpdcheck.py -c
bduc-sched.nacs.uci.edu 50172
starting server: /sge62/mpich2/bin/mpdcheck.py -s
starting client: ssh bduc-i32-10 -x -n /sge62/mpich2/bin/mpdcheck.py -c
bduc-sched.nacs.uci.edu 51821
starting server: /sge62/mpich2/bin/mpdcheck.py -s
starting client: ssh bduc-i32-11 -x -n /sge62/mpich2/bin/mpdcheck.py -c
bduc-sched.nacs.uci.edu 36082
starting server: /sge62/mpich2/bin/mpdcheck.py -s
starting client: ssh bduc-i32-12 -x -n /sge62/mpich2/bin/mpdcheck.py -c
bduc-sched.nacs.uci.edu 57250
starting server: /sge62/mpich2/bin/mpdcheck.py -s
starting client: ssh bduc-i32-13 -x -n /sge62/mpich2/bin/mpdcheck.py -c
bduc-sched.nacs.uci.edu 47474
starting server: /sge62/mpich2/bin/mpdcheck.py -s
starting client: ssh bduc-i32-14 -x -n /sge62/mpich2/bin/mpdcheck.py -c
bduc-sched.nacs.uci.edu 51971
starting server: /sge62/mpich2/bin/mpdcheck.py -s
starting client: ssh bduc-i32-15 -x -n /sge62/mpich2/bin/mpdcheck.py -c
bduc-sched.nacs.uci.edu 54481
starting server: /sge62/mpich2/bin/mpdcheck.py -s
starting client: ssh bduc-i32-16 -x -n /sge62/mpich2/bin/mpdcheck.py -c
bduc-sched.nacs.uci.edu 49851
starting server: /sge62/mpich2/bin/mpdcheck.py -s
starting client: ssh bduc-i32-17 -x -n /sge62/mpich2/bin/mpdcheck.py -c
bduc-sched.nacs.uci.edu 33898
starting server: /sge62/mpich2/bin/mpdcheck.py -s
starting client: ssh bduc-i32-18 -x -n /sge62/mpich2/bin/mpdcheck.py -c
bduc-sched.nacs.uci.edu 56621
starting server: /sge62/mpich2/bin/mpdcheck.py -s
starting client: ssh bduc-i32-19 -x -n /sge62/mpich2/bin/mpdcheck.py -c
bduc-sched.nacs.uci.edu 37119
starting server: /sge62/mpich2/bin/mpdcheck.py -s
starting client: ssh bduc-i32-20 -x -n /sge62/mpich2/bin/mpdcheck.py -c
bduc-sched.nacs.uci.edu 38601
starting server: /sge62/mpich2/bin/mpdcheck.py -s
starting client: ssh bduc-i32-21 -x -n /sge62/mpich2/bin/mpdcheck.py -c
bduc-sched.nacs.uci.edu 43570
starting server: /sge62/mpich2/bin/mpdcheck.py -s
starting client: ssh bduc-i32-22 -x -n /sge62/mpich2/bin/mpdcheck.py -c
bduc-sched.nacs.uci.edu 42130
starting server: /sge62/mpich2/bin/mpdcheck.py -s
starting client: ssh bduc-i32-24 -x -n /sge62/mpich2/bin/mpdcheck.py -c
bduc-sched.nacs.uci.edu 41582
starting server: /sge62/mpich2/bin/mpdcheck.py -s
starting client: ssh bduc-i32-25 -x -n /sge62/mpich2/bin/mpdcheck.py -c
bduc-sched.nacs.uci.edu 50515
starting server: /sge62/mpich2/bin/mpdcheck.py -s
starting client: ssh bduc-i32-27 -x -n /sge62/mpich2/bin/mpdcheck.py -c
bduc-sched.nacs.uci.edu 48864
starting server: /sge62/mpich2/bin/mpdcheck.py -s
starting client: ssh bduc-i32-28 -x -n /sge62/mpich2/bin/mpdcheck.py -c
bduc-sched.nacs.uci.edu 47361
starting server: /sge62/mpich2/bin/mpdcheck.py -s
starting client: ssh bduc-i32-29 -x -n /sge62/mpich2/bin/mpdcheck.py -c
bduc-sched.nacs.uci.edu 44604
starting server: /sge62/mpich2/bin/mpdcheck.py -s
starting client: ssh bduc-i32-30 -x -n /sge62/mpich2/bin/mpdcheck.py -c
bduc-sched.nacs.uci.edu 41417
starting server: /sge62/mpich2/bin/mpdcheck.py -s
starting client: ssh bduc-i32-31 -x -n /sge62/mpich2/bin/mpdcheck.py -c
bduc-sched.nacs.uci.edu 48330
starting server: /sge62/mpich2/bin/mpdcheck.py -s
starting client: ssh bduc-i32-32 -x -n /sge62/mpich2/bin/mpdcheck.py -c
bduc-sched.nacs.uci.edu 47243
starting server: /sge62/mpich2/bin/mpdcheck.py -s
starting client: ssh bduc-i32-33 -x -n /sge62/mpich2/bin/mpdcheck.py -c
bduc-sched.nacs.uci.edu 35428
starting server: /sge62/mpich2/bin/mpdcheck.py -s
starting client: ssh bduc-i32-34 -x -n /sge62/mpich2/bin/mpdcheck.py -c
bduc-sched.nacs.uci.edu 59147
starting server: /sge62/mpich2/bin/mpdcheck.py -s
starting client: ssh bduc-i32-35 -x -n /sge62/mpich2/bin/mpdcheck.py -c
bduc-sched.nacs.uci.edu 54136
starting server: /sge62/mpich2/bin/mpdcheck.py -s
starting client: ssh bduc-i32-36 -x -n /sge62/mpich2/bin/mpdcheck.py -c
bduc-sched.nacs.uci.edu 56116
starting server: /sge62/mpich2/bin/mpdcheck.py -s
starting client: ssh bduc-i32-37 -x -n /sge62/mpich2/bin/mpdcheck.py -c
bduc-sched.nacs.uci.edu 59475
starting server: /sge62/mpich2/bin/mpdcheck.py -s
starting client: ssh bduc-i32-38 -x -n /sge62/mpich2/bin/mpdcheck.py -c
bduc-sched.nacs.uci.edu 52834
starting server: /sge62/mpich2/bin/mpdcheck.py -s
starting client: ssh bduc-i32-39 -x -n /sge62/mpich2/bin/mpdcheck.py -c
bduc-sched.nacs.uci.edu 58080
starting server: /sge62/mpich2/bin/mpdcheck.py -s
starting client: ssh bduc-i32-40 -x -n /sge62/mpich2/bin/mpdcheck.py -c
bduc-sched.nacs.uci.edu 57115
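
With output this long it is easy to miss a host that dropped out of one of
the stages. The following is a rough Python sketch, not part of mpdcheck,
that collects the host names seen in each stage of the verbose output above
and reports any expected host that is absent. The function names and the
three stage labels are made up for illustration, and the patterns assume
the line formats shown in this message. Note that a host appearing in a
stage only means the step was attempted, not that it succeeded.

```python
import re


def hosts_by_stage(log_text):
    """Collect the host names seen in each stage of mpdcheck -v output."""
    # Patterns are based only on the line formats visible in this post.
    stages = {
        "resolved": re.compile(r"^gethostbyname_ex:\s+\('([^']+)'"),
        "ssh_tried": re.compile(r"^trying: ssh (\S+)"),
        "client_started": re.compile(r"^starting client: ssh (\S+)"),
    }
    seen = {name: set() for name in stages}
    for line in log_text.splitlines():
        stripped = line.strip()
        for name, pattern in stages.items():
            match = pattern.match(stripped)
            if match:
                seen[name].add(match.group(1))
    return seen


def missing_hosts(log_text, expected):
    """Map each stage to the sorted expected hosts that never appear in it."""
    seen = hosts_by_stage(log_text)
    return {stage: sorted(set(expected) - hosts)
            for stage, hosts in seen.items()}
```

Here expected would be the host names read from mpd.hosts, and log_text the
captured output of the mpdcheck command shown above. An empty list for
every stage means each listed host was at least attempted at that stage.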
