[MPICH] mpdboot error : fail to ping

Ralph Butler rbutler at mtsu.edu
Mon Apr 10 12:10:12 CDT 2006


Cool!

On Apr 10, 2006, at 12:06 PM, Misora Itsumo wrote:

> Thanks, Ralph. I have just gotten it running successfully. I did read
> about the assumption of a shared file system or copying the files, but I
> thought it was good enough to create two identical .mpd.conf files with vi.
>
> Anyway, thanks to everybody.
> Tiep.
>
> On 4/10/06, Ralph M. Butler <rbutler at mtsu.edu> wrote:
> Yes, this might be a bit difficult to pick up on.
> The install guide discusses an assumption of shared file
> systems, e.g. via NFS, or the need to copy files.
> I can duplicate this problem only by having different secretwords
> on 2 machines.  Perhaps you can simply copy one file to the other
> to verify that the 2 secretwords are identical.
>
> > Date: Mon, 10 Apr 2006 05:47:35 +0700
> > From: Misora Itsumo <mitsuru.adachi at gmail.com>
> > To: Ralph Butler <rbutler at mtsu.edu>
> > Cc: mpich-discuss at mcs.anl.gov
> > Subject: Re: [MPICH] mpdboot error : fail to ping
> >
> > Ah, because I didn't know whether the secretwords on both nodes must be
> > the same or not, I had already tried making them the same, but it
> > produced the same error.
> >
> > Thanks,
> > Tiep.
> >
> > On 4/10/06, Misora Itsumo <mitsuru.adachi at gmail.com> wrote:
> >>
> >>
> >> Yes, as Ralph guessed, the secretwords on both of my nodes were integers.
> >> I changed my secretwords and the previous error no longer appears,
> >> but sadly it produced a new error.
> >>
> >> On node hewonty I ran:
> >> [hewonty at hewonty doc]$ mpd &
> >> [1] 4293
> >> [hewonty at hewonty doc]$ mpdtrace -l
> >> hewonty.homelinux.org_32800 (192.168.2.2)
> >> [hewonty at hewonty doc]$ hewonty.homelinux.org_32800 (handle_rhs_challenge_response 788): INVALID response in rhs response
> >> msg=:{'ifhn': '192.168.2.3', 'cmd': 'challenge_response', 'port': 32774, 'response': 'X\xc5\x8f\x9ccfS\x8e\xaa\r\xde6$Y+\x81'}:
> >>
> >> On node vm1:
> >> [hewonty at vm1 ~]$ mpd -h hewonty -p 32800
> >> vm1_32774 (connect_lhs 635): NOT OK to enter ring; one likely cause: mismatched secretwords
> >> vm1_32774 (enter_ring 566): lhs connect failed
> >> vm1_32774 (run 233): failed to enter ring
> >>
> >> For testing, my secretword on hewonty is asdfghjkl1 and on vm1 it is
> >> qwertyuiop1.
> >>
> >> Thanks,
> >> Tiep.
> >>
> >>
> >> On 4/9/06, Ralph Butler <rbutler at mtsu.edu> wrote:
> >>>
> >>> This seems to be a new bug.  I do not want to ask your secretword,
> >>> but will guess that it is an integer.   If so,
> >>> please make it a non-integer.  It's OK to have digits in there, but
> >>> not to have the secretword be all digits.
> >>> Let me know if this fixes the problem and I will fix it in mpd for
> >>> the next release.
> >>>
> >>> Thanks.
> >>> --ralph
> >>>
> >>> On Apr 8, 2006, at 1:31 PM, Misora Itsumo wrote:
> >>>
> >>>> I have already tried mpdcheck.
> >>>>
> >>>> [hewonty at hewonty ~]$ mpdcheck -s
> >>>> server listening at INADDR_ANY on: hewonty 32775
> >>>> server has conn on <socket._socketobject object at 0xb7f7838c> from ('192.168.2.3', 56366)
> >>>> server successfully recvd msg from client: hello_from_client_to_server
> >>>> [hewonty at vm1 ~]$ mpdcheck -c hewonty 32775
> >>>> client successfully recvd ack from server: ack_from_server_to_client
> >>>>
> >>>> [hewonty at vm1 ~]$ mpdcheck -s
> >>>> server listening at INADDR_ANY on: vm1 32771
> >>>> server has conn on <socket._socketobject object at 0xb7f5920c> from ('192.168.2.2', 33169)
> >>>> server successfully recvd msg from client: hello_from_client_to_server
> >>>> [hewonty at hewonty ~]$ mpdcheck -c vm1 32771
> >>>> client successfully recvd ack from server: ack_from_server_to_client
> >>>>
> >>>> Next, I tried to run mpd by hand, but got the same error as in my
> >>>> last post.
> >>>>
> >>>> [hewonty at vm1 ~]$ mpd &
> >>>> [1] 2056
> >>>> [hewonty at vm1 ~]$ mpdtrace -l
> >>>> vm1_32772 (192.168.2.3)
> >>>>
> >>>>
> >>>> [hewonty at hewonty ~]$ mpd -h vm1 -p 32772
> >>>> hewonty_32846: mpd_uncaught_except_tb handling:
> >>>>   exceptions.TypeError: sequence item 0: expected string, int found
> >>>>     /usr/local/mpich2/bin/mpdlib.py  627  connect_lhs
> >>>>         response = md5new(''.join([self.secretword,msg['randnum']])).digest()
> >>>>     /usr/local/mpich2/bin/mpdlib.py  564  enter_ring
> >>>>         numTries=ntries)
> >>>>     /usr/local/mpich2/bin/mpd  231  run
> >>>>         rhsHandler=self.handle_rhs_input)
> >>>>     /usr/local/mpich2/bin/mpd  1344  ?
> >>>>         mpd.run()
> >>>>
> >>>> If I run mpd on hewonty first, I get:
> >>>> [hewonty at hewonty ~]$ mpd &
> >>>> [1] 4051
> >>>> [hewonty at hewonty ~]$ mpdtrace -l
> >>>> hewonty_32781 (192.168.2.2)
> >>>>
> >>>> [hewonty at vm1 ~]$ mpd -h hewonty -p 32781
> >>>> vm1_32776 (connect_lhs 621): invalid challenge from hewonty 32781: {}
> >>>> vm1_32776 (enter_ring 566): lhs connect failed
> >>>> vm1_32776 (run 233): failed to enter ring
> >>>>
> >>>> and on hewonty I get the error:
> >>>>
> >>>> hewonty.homelinux.org_32781: mpd_uncaught_except_tb handling:
> >>>>   exceptions.TypeError: sequence item 0: expected string, int found
> >>>>     /usr/local/mpich2/bin/mpdlib.py  733  handle_ring_listener_connection
> >>>>         newsock.correctChallengeResponse = \
> >>>>     /usr/local/mpich2/bin/mpdlib.py  488  handle_active_streams
> >>>>         handler(stream,*args)
> >>>>     /usr/local/mpich2/bin/mpd  266  runmainloop
> >>>>         rv = self.streamHandler.handle_active_streams(timeout=8.0)
> >>>>     /usr/local/mpich2/bin/mpd  240  run
> >>>>         self.runmainloop()
> >>>>     /usr/local/mpich2/bin/mpd  1344  ?
> >>>>         mpd.run()
> >>>>
> >>>> [1]+  Exit 1                  mpd
> >>>>
> >>>>
> >>>> Regards,
> >>>> Tiep.
> >>>>
> >>>> On 4/8/06, Rajeev Thakur <thakur at mcs.anl.gov> wrote:
> >>>> Try running the mpdcheck troubleshooting utility as described in
> >>>> the installer's guide.
> >>>>
> >>>> Rajeev
> >>>> From: owner-mpich-discuss at mcs.anl.gov [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Misora Itsumo
> >>>> Sent: Friday, April 07, 2006 5:59 PM
> >>>> To: mpich-discuss at mcs.anl.gov
> >>>>
> >>>> Subject: [MPICH] mpdboot error : fail to ping
> >>>>
> >>>> Hi,
> >>>> I'm new to MPICH2 and have just installed it, but I can't make it
> >>>> run on a set of machines.
> >>>>
> >>>> I run MPICH2 on 2 nodes; the hostnames are hewonty and vm1.
> >>>> Here is some information:
> >>>>
> >>>> [hewonty at hewonty ~]$ cat mpd.hosts
> >>>> hewonty
> >>>> vm1
> >>>>
> >>>> [hewonty at hewonty ~]$ cat /etc/hosts
> >>>> 127.0.0.1       localhost.localdomain   localhost
> >>>> 192.168.2.2     hewonty.homelinux.org   hewonty.vmnet1.org   hewonty
> >>>> 192.168.2.3     vm1.hewonty.homelinux.org       vm1
> >>>> 192.168.2.2     svn_server
> >>>>
> >>>> [hewonty at hewonty ~]$ mpdboot -n 2 -f mpd.hosts
> >>>> mpdboot_hewonty (handle_mpd_output 359): failed to ping mpd on hewonty; recvd output={}
> >>>>
> >>>> I can ssh to both hewonty and vm1.
> >>>>
> >>>> I tried to start mpd manually, and here is what I got:
> >>>>
> >>>> [hewonty at vm1 ~]$ mpd &
> >>>> [1] 2056
> >>>> [hewonty at vm1 ~]$ mpdtrace -l
> >>>> vm1_32772 (192.168.2.3)
> >>>>
> >>>>
> >>>> [hewonty at hewonty ~]$ mpd -h vm1 -p 32772
> >>>> hewonty_32846: mpd_uncaught_except_tb handling:
> >>>>   exceptions.TypeError: sequence item 0: expected string, int found
> >>>>     /usr/local/mpich2/bin/mpdlib.py  627  connect_lhs
> >>>>         response = md5new(''.join([self.secretword,msg['randnum']])).digest()
> >>>>     /usr/local/mpich2/bin/mpdlib.py  564  enter_ring
> >>>>         numTries=ntries)
> >>>>     /usr/local/mpich2/bin/mpd  231  run
> >>>>         rhsHandler=self.handle_rhs_input)
> >>>>     /usr/local/mpich2/bin/mpd  1344  ?
> >>>>         mpd.run()
> >>>>
> >>>> Thanks in advance.
> >>>> Tiep
> >>>>
> >>>
> >>>
> >>
> >
>
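
A rough illustration of the failure discussed in this thread. The traceback
shows the challenge response being built with
md5new(''.join([self.secretword, msg['randnum']])). The Python 3 sketch below
mimics that single line with hashlib to show why a secretword that has become
an integer raises a TypeError, and why two different secretwords cannot
produce the same response. The function names are hypothetical, treating
randnum as a string is an assumption, and the str() coercion is only one
possible fix, not necessarily the one adopted in mpd.

```python
import hashlib


def challenge_response(secretword, randnum):
    """Digest of secretword + randnum, modeled on the join in the traceback."""
    # str.join accepts only strings, so an int secretword raises TypeError here.
    joined = ''.join([secretword, randnum])
    return hashlib.md5(joined.encode('utf-8')).digest()


def safe_challenge_response(secretword, randnum):
    """Hypothetical workaround: coerce both parts to str before joining."""
    return challenge_response(str(secretword), str(randnum))


if __name__ == '__main__':
    # Identical secretwords on both nodes give identical digests.
    same = challenge_response('asdfghjkl1', '42') == challenge_response('asdfghjkl1', '42')
    print('same secretword matches:', same)

    # Different secretwords (as on hewonty and vm1) give different digests.
    differ = challenge_response('asdfghjkl1', '42') != challenge_response('qwertyuiop1', '42')
    print('different secretwords differ:', differ)

    # An all-digit secretword held as an int reproduces the TypeError.
    try:
        challenge_response(12345, '42')
    except TypeError as exc:
        print('TypeError:', exc)
```

With the coercion in place, safe_challenge_response(12345, '42') returns the
same digest as challenge_response('12345', '42'), so an all-digit secretword
would behave like any other string.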

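Ralph's suggestion to verify that the two secretwords are identical can also
be checked mechanically. The sketch below is a minimal example, assuming each
.mpd.conf contains a line of the form secretword=<value>; the helper names
are hypothetical, and the whitespace stripping is a choice made here that
mpd's own parser may or may not share.

```python
def read_secretword(conf_text):
    """Return the value of the first secretword=... line, or None if absent."""
    for line in conf_text.splitlines():
        key, sep, value = line.partition('=')
        if sep and key.strip() == 'secretword':
            return value.strip()
    return None


def secretwords_match(conf_a, conf_b):
    """True only if both texts define a secretword and the values are equal."""
    word_a = read_secretword(conf_a)
    word_b = read_secretword(conf_b)
    return word_a is not None and word_a == word_b


if __name__ == '__main__':
    # The two test secretwords quoted in the thread do not match.
    hewonty_conf = 'secretword=asdfghjkl1\n'
    vm1_conf = 'secretword=qwertyuiop1\n'
    print(secretwords_match(hewonty_conf, vm1_conf))
    # A copied file matches itself.
    print(secretwords_match(hewonty_conf, hewonty_conf))
```

In practice the two texts would be read from the .mpd.conf file on each
machine; copying one file to the other, as suggested above, sidesteps the
comparison entirely.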


