Thanks Ralph, I have just got it to run successfully. I did read about
the assumption of a shared file system and the need to copy files, but I thought that
it was good enough to create two identical .mpd.conf files by hand with vi.<br>
<br>
Anyway, thanks to everybody.<br>
Tiep.<br><br><div><span class="gmail_quote">On 4/10/06, <b class="gmail_sendername">Ralph M. Butler</b> <<a href="mailto:rbutler@mtsu.edu">rbutler@mtsu.edu</a>> wrote:</span><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
Yes this might be a bit difficult to pick up on.<br>The install guide discusses an assumption of shared file<br>systems, e.g. via NFS, or the need to copy files.<br>I can duplicate this problem only by having different secretwords
<br>on 2 machines. Perhaps you can simply copy one file to the other<br>to verify that the 2 secretwords are identical.<br><br>> Date: Mon, 10 Apr 2006 05:47:35 +0700<br>> From: Misora Itsumo <<a href="mailto:mitsuru.adachi@gmail.com">
mitsuru.adachi@gmail.com</a>><br>> To: Ralph Butler <<a href="mailto:rbutler@mtsu.edu">rbutler@mtsu.edu</a>><br>> Cc: <a href="mailto:mpich-discuss@mcs.anl.gov">mpich-discuss@mcs.anl.gov</a><br>> Subject: Re: [MPICH] mpdboot error : fail to ping
<br>><br>> Ah, because I didn't know whether the secretwords on both nodes must be the same or not,<br>> I have already tried making them the same, but it produced the same error.<br>><br>> Thanks,<br>> Tiep.<br>>
<br>> On 4/10/06, Misora Itsumo <<a href="mailto:mitsuru.adachi@gmail.com">mitsuru.adachi@gmail.com</a>> wrote:<br>>><br>>><br>>> Yes, as Ralph said, the secretwords on both of my nodes are integers.
<br>>> I changed my secretword and it no longer reports the previous error,<br>>> but sadly it produces a new error.<br>>><br>>> On node hewonty I ran:<br>>> [hewonty@hewonty doc]$ mpd &<br>>> [1] 4293
<br>>> [hewonty@hewonty doc]$ mpdtrace -l<br>>> hewonty.homelinux.org_32800 (<a href="http://192.168.2.2">192.168.2.2</a>)<br>>> [hewonty@hewonty doc]$ hewonty.homelinux.org_32800(handle_rhs_challenge_response 788): INVALID response in rhs response
<br>>> msg=:{'ifhn': '<a href="http://192.168.2.3">192.168.2.3</a>', 'cmd': 'challenge_response', 'port': 32774,<br>>> 'response': 'X\xc5\x8f\x9ccfS\x8e\xaa\r\xde6$Y+\x81'}:<br>>><br>>> in node vm1
<br>>> [hewonty@vm1 ~]$ mpd -h hewonty -p 32800<br>>> vm1_32774 (connect_lhs 635): NOT OK to enter ring; one likely cause:<br>>> mismatched secretwords<br>>> vm1_32774 (enter_ring 566): lhs connect failed
<br>>> vm1_32774 (run 233): failed to enter ring<br>>><br>>> For testing, my secretword on hewonty is asdfghjkl1 and on vm1 it is<br>>> qwertyuiop1<br>>><br>>> Thanks,<br>>> Tiep.<br>
>><br>>><br>>> On 4/9/06, Ralph Butler <<a href="mailto:rbutler@mtsu.edu">rbutler@mtsu.edu</a>> wrote:<br>>>><br>>>> This seems to be a new bug. I do not want to ask your secretword,
<br>>>> but will guess that it is an integer. If so,<br>>>> please make it a non-integer. It's OK to have digits in there, but<br>>>> not to have the secretword be all digits.<br>>>> Let me know if this fixes the problem and I will fix it in mpd for
<br>>>> the next release.<br>>>><br>>>> Thanks.<br>>>> --ralph<br>>>><br>>>> On Apr 8, 2006, at 1:31 PM, Misora Itsumo wrote:<br>>>><br>>>>> I have already tried mpdcheck.
<br>>>>><br>>>>> [hewonty@hewonty ~]$ mpdcheck -s<br>>>>> server listening at INADDR_ANY on: hewonty 32775<br>>>>> server has conn on <socket._socketobject object at 0xb7f7838c> from
<br>>>>> (' <a href="http://192.168.2.3">192.168.2.3</a>', 56366)<br>>>>> server successfully recvd msg from client: hello_from_client_to_server<br>>>>> [hewonty@vm1 ~]$ mpdcheck -c hewonty 32775
<br>>>>> client successfully recvd ack from server: ack_from_server_to_client<br>>>>><br>>>>> [hewonty@vm1 ~]$ mpdcheck -s<br>>>>> server listening at INADDR_ANY on: vm1 32771
<br>>>>> server has conn on <socket._socketobject object at 0xb7f5920c> from<br>>>>> ('<a href="http://192.168.2.2">192.168.2.2</a> ', 33169)<br>>>>> server successfully recvd msg from client: hello_from_client_to_server
<br>>>>> [hewonty@hewonty ~]$ mpdcheck -c vm1 32771<br>>>>> client successfully recvd ack from server: ack_from_server_to_client<br>>>>><br>>>>> Next, I tried to run mpd by hand, but got the same error as in
<br>>>>> my previous post.<br>>>>><br>>>>> [hewonty@vm1 ~]$ mpd &<br>>>>> [1] 2056<br>>>>> [hewonty@vm1 ~]$ mpdtrace -l<br>>>>> vm1_32772 ( <a href="http://192.168.2.3">
192.168.2.3</a>)<br>>>>><br>>>>><br>>>>> [hewonty@hewonty ~]$ mpd -h vm1 -p 32772<br>>>>> hewonty_32846: mpd_uncaught_except_tb handling:<br>>>>> exceptions.TypeError
: sequence item 0: expected string, int found<br>>>>> /usr/local/mpich2/bin/mpdlib.py 627 connect_lhs<br>>>>> response = md5new(''.join([self.secretword,msg<br>>>>> ['randnum']])).digest()
<br>>>>> /usr/local/mpich2/bin/mpdlib.py 564 enter_ring<br>>>>> numTries=ntries)<br>>>>> /usr/local/mpich2/bin/mpd 231 run<br>>>>> rhsHandler=self.handle_rhs_input
)<br>>>>> /usr/local/mpich2/bin/mpd 1344 ?<br>>>>> mpd.run()<br>>>>><br>>>>> If I run mpd on hewonty first, I get:<br>>>>> [hewonty@hewonty ~]$ mpd &
<br>>>>> [1] 4051<br>>>>> [hewonty@hewonty ~]$ mpdtrace -l<br>>>>> hewonty_32781 (<a href="http://192.168.2.2">192.168.2.2</a>)<br>>>>><br>>>>> [hewonty@vm1 ~]$mpd -h hewonty -p 32781
<br>>>>> vm1_32776 (connect_lhs 621): invalid challenge from hewonty 32781: {}<br>>>>> vm1_32776 (enter_ring 566): lhs connect failed<br>>>>> vm1_32776 (run 233): failed to enter ring<br>
>>>><br>>>>> and on hewonty I get the error:<br>>>>><br>>>>> hewonty.homelinux.org_32781: mpd_uncaught_except_tb handling:<br>>>>> exceptions.TypeError: sequence item 0: expected string, int found
<br>>>>> /usr/local/mpich2/bin/mpdlib.py 733<br>>>>> handle_ring_listener_connection<br>>>>> newsock.correctChallengeResponse = \<br>>>>> /usr/local/mpich2/bin/mpdlib.py 488 handle_active_streams
<br>>>>> handler(stream,*args)<br>>>>> /usr/local/mpich2/bin/mpd 266 runmainloop<br>>>>> rv = self.streamHandler.handle_active_streams (timeout=8.0)<br>>>>> /usr/local/mpich2/bin/mpd 240 run
<br>>>>> self.runmainloop()<br>>>>> /usr/local/mpich2/bin/mpd 1344 ?<br>>>>> mpd.run()<br>>>>><br>>>>>
[1]+ Exit 1 mpd<br>>>>><br>>>>><br>>>>> Regards,<br>>>>> Tiep.<br>>>>><br>>>>> On 4/8/06, Rajeev Thakur <<a href="mailto:thakur@mcs.anl.gov">
thakur@mcs.anl.gov</a>> wrote:<br>>>>> Try running the mpdcheck troubleshooting utility as described in<br>>>>> the installer's guide.<br>>>>><br>>>>> Rajeev<br>>>>> From:
<a href="mailto:owner-mpich-discuss@mcs.anl.gov">owner-mpich-discuss@mcs.anl.gov</a> [mailto:<a href="mailto:owner-mpich-">owner-mpich-</a><br>>>>> <a href="mailto:discuss@mcs.anl.gov">discuss@mcs.anl.gov</a>] On Behalf Of Misora Itsumo
<br>>>>> Sent: Friday, April 07, 2006 5:59 PM<br>>>>> To: <a href="mailto:mpich-discuss@mcs.anl.gov">mpich-discuss@mcs.anl.gov</a><br>>>>><br>>>>> Subject: [MPICH] mpdboot error : fail to ping
<br>>>>><br>>>>> Hi,<br>>>>> I'm new to MPICH2 and I have just installed it, but I can't make it<br>>>>> run on a set of machines.<br>>>>><br>>>>> I run MPICH2 on 2 nodes; the hostnames are hewonty and vm1.
<br>>>>> Here is some info:<br>>>>><br>>>>> [hewonty@hewonty ~]$ cat mpd.hosts<br>>>>> hewonty<br>>>>> vm1<br>>>>><br>>>>> [hewonty@hewonty
~]$ cat /etc/hosts<br>>>>> <a href="http://127.0.0.1">127.0.0.1</a> localhost.localdomain localhost<br>>>>> <a href="http://192.168.2.2">192.168.2.2</a> <a href="http://hewonty.homelinux.org">
hewonty.homelinux.org</a> <a href="http://hewonty.vmnet1.org">hewonty.vmnet1.org</a><br>>>>> hewonty<br>>>>> <a href="http://192.168.2.3">192.168.2.3</a> <a href="http://vm1.hewonty.homelinux.org">
vm1.hewonty.homelinux.org</a> vm1<br>>>>> <a href="http://192.168.2.2">192.168.2.2</a> svn_server<br>>>>><br>>>>> [hewonty@hewonty ~]$ mpdboot -n 2 -f mpd.hosts<br>>>>> mpdboot_hewonty (handle_mpd_output 359): failed to ping mpd on
<br>>>>> hewonty; recvd output={}<br>>>>><br>>>>> I can ssh to either hewonty or vm1.<br>>>>><br>>>>> I tried to run mpd manually, and here is what I got:<br>>>>>
<br>>>>> [hewonty@vm1 ~]$ mpd &<br>>>>> [1] 2056<br>>>>> [hewonty@vm1 ~]$ mpdtrace -l<br>>>>> vm1_32772 (<a href="http://192.168.2.3">192.168.2.3</a>)<br>>>>>
<br>>>>><br>>>>> [hewonty@hewonty ~]$ mpd -h vm1 -p 32772<br>>>>> hewonty_32846: mpd_uncaught_except_tb handling:<br>>>>> exceptions.TypeError: sequence item 0: expected string, int found
<br>>>>> /usr/local/mpich2/bin/mpdlib.py 627 connect_lhs<br>>>>> response = md5new(''.join([ self.secretword,msg<br>>>>> ['randnum']])).digest()<br>>>>> /usr/local/mpich2/bin/mpdlib.py 564 enter_ring
<br>>>>> numTries=ntries)<br>>>>> /usr/local/mpich2/bin/mpd 231 run<br>>>>> rhsHandler= self.handle_rhs_input)<br>>>>> /usr/local/mpich2/bin/mpd 1344 ?
<br>>>>> mpd.run()<br>>>>><br>>>>> Thanks in advance.<br>>>>> Tiep<br>>>>><br>>>><br>>>><br>>><br>><br></blockquote></div><br>
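The two failures in this thread (the TypeError when the secretword is all digits, and the "mismatched secretwords" rejection) can be illustrated with a minimal Python sketch of an MD5 challenge-response. This is not mpd's source: the function names are hypothetical, only the join-then-MD5 shape is taken from the traceback above, and coercing to str is just one possible workaround, not necessarily the fix adopted in mpd.

```python
import hashlib


def naive_response(secretword, randnum):
    """Join secretword and randnum, then hash (shape taken from the traceback).

    Hypothetical helper: raises TypeError if secretword is not a str,
    for example when an all-digit secretword was parsed as an int.
    """
    return hashlib.md5("".join([secretword, randnum]).encode("utf-8")).digest()


def challenge_response(secretword, randnum):
    """Same computation, but coerce both parts to str first (assumed workaround)."""
    joined = str(secretword) + str(randnum)
    return hashlib.md5(joined.encode("utf-8")).digest()


def responses_match(secret_a, secret_b, randnum):
    """True only if both sides would compute the same response to one challenge."""
    return challenge_response(secret_a, randnum) == challenge_response(secret_b, randnum)


if __name__ == "__main__":
    randnum = "123456"  # stand-in for the random challenge sent by the peer

    # An integer secretword breaks the naive join.
    try:
        naive_response(12345, randnum)
    except TypeError as exc:
        print("integer secretword fails:", exc)

    # Different secretwords on the two nodes never produce matching responses.
    print("mismatched:", responses_match("asdfghjkl1", "qwertyuiop1", randnum))

    # Identical secretwords on both nodes do.
    print("identical:", responses_match("asdfghjkl1", "asdfghjkl1", randnum))
```

Under these assumptions, both nodes must hold exactly the same secretword, which is why copying one .mpd.conf to the other machine, as Ralph suggested, resolves the ring-entry failure.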