[mpich-discuss] mpdboot fails

Ralph Butler rbutler at mtsu.edu
Wed Aug 27 16:22:51 CDT 2008


It looks like this may be getting pretty close.  But I notice in the
output from mpdcheck -v on the iMac:
         gethostname gives  iMac.local
         getfqdn gives  iMac.local
but the name iMac.local does not appear in either /etc/hosts file.
That name almost certainly needs to be resolvable, don't you think?
That host will identify itself to the other by that name.
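
A rough way to check the same things mpdcheck reports is sketched below in
Python 3. It is an illustration only, not mpdcheck's own code (the mpd in
this thread runs under python2.3), and the function name check_hostname is
made up for the example. It flags a name that does not resolve, an address
in the loopback range (like the 127.0.1.1 that wlaptop shows in the quoted
output), and an address whose reverse lookup fails (like 192.168.2.1 on the
iMac).

```python
import ipaddress
import socket


def check_hostname(name=None):
    """Return a list of problems found when resolving `name`.

    `name` defaults to this machine's fully qualified name, which is
    the kind of name a host would announce to its peers.
    """
    name = name or socket.getfqdn()
    try:
        # Forward lookup: name -> list of IPv4 addresses.
        _, _, addrs = socket.gethostbyname_ex(name)
    except OSError as err:
        return [f"{name} does not resolve: {err}"]

    problems = []
    for addr in addrs:
        if ipaddress.ip_address(addr).is_loopback:
            # Other machines cannot reach a 127.x.x.x address.
            problems.append(f"{name} resolves to loopback address {addr}")
            continue
        try:
            # Reverse lookup: address -> name.
            socket.gethostbyaddr(addr)
        except OSError as err:
            problems.append(f"reverse lookup failed for {addr}: {err}")
    return problems


if __name__ == "__main__":
    for line in check_hostname() or ["no problems found"]:
        print(line)
```

Calling check_hostname("iMac.local") on wlaptop, and check_hostname("wlaptop")
on the iMac, would be expected to return an empty list once both /etc/hosts
files carry the names each machine actually announces.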
--ralph

On Aug 27, 2008, at 2:17 PM, Robert Kubrick wrote:

> This is what I have:
>
> iMac
> ====
>
> $ mpdcheck -v
> obtaining hostname via gethostname and getfqdn
> gethostname gives  iMac.local
> getfqdn gives  iMac.local
> checking out unqualified hostname; make sure is not "localhost", etc.
> checking out qualified hostname; make sure is not "localhost", etc.
> obtain IP addrs via qualified and unqualified hostnames;  make sure  
> other than 127.0.0.1
> gethostbyname_ex:  ('imac.local', [], ['192.168.2.1',  
> '169.254.51.159'])
> *** gethostbyaddr failed for this hosts's IP 192.168.2.1
> gethostbyname_ex:  ('imac.local', [], ['192.168.2.1',  
> '169.254.51.159'])
> *** gethostbyaddr failed for this hosts's IP 192.168.2.1
> checking that IP addrs resolve to same host
> now do some gethostbyaddr and gethostbyname_ex for machines in hosts  
> file
> iMac:$ mpdcheck -l
> *** gethostbyaddr failed for this hosts's IP 192.168.2.1
> *** gethostbyaddr failed for this hosts's IP 192.168.2.1
> iMac:$ cat /etc/hosts
> ##
> # Host Database
> #
> # localhost is used to configure the loopback interface
> # when the system is booting.  Do not change this entry.
> ##
> 127.0.0.1       localhost
> 255.255.255.255 broadcasthost
> ::1             localhost
>
> 192.168.2.1     iMac imac
> 192.168.2.2     wlaptop
>
>
> wlaptop
> ======
>
> $ mpdcheck -v
> obtaining hostname via gethostname and getfqdn
> gethostname gives  wlaptop
> getfqdn gives  wlaptop
> checking out unqualified hostname; make sure is not "localhost", etc.
> checking out qualified hostname; make sure is not "localhost", etc.
> obtain IP addrs via qualified and unqualified hostnames;  make sure  
> other than 127.0.0.1
> gethostbyname_ex:  ('wlaptop', [], ['127.0.1.1', '192.168.2.2'])
> *** first ipaddr for this host (via wlaptop) is: 127.0.1.1
> gethostbyname_ex:  ('wlaptop', [], ['127.0.1.1', '192.168.2.2'])
> checking that IP addrs resolve to same host
> now do some gethostbyaddr and gethostbyname_ex for machines in hosts  
> file
> wlaptop:$ mpdcheck -l
>
>    **********
>    Your unqualified hostname resolves to 127.0.0.1, which is
>    the IP address reserved for localhost. This likely means that
>    you have a line similar to this one in your /etc/hosts file:
>    127.0.0.1   $uqhn
>    This should perhaps be changed to the following:
>    127.0.0.1   localhost.localdomain localhost
>    **********
>
> wlaptop:$ cat /etc/hosts
> 127.0.0.1       localhost
> 127.0.1.1       wlaptop
>
> 192.168.2.1     imac iMac
> 192.168.2.2     wlaptop
>
> # The following lines are desirable for IPv6 capable hosts
> ::1     ip6-localhost ip6-loopback
> fe00::0 ip6-localnet
> ff00::0 ip6-mcastprefix
> ff02::1 ip6-allnodes
> ff02::2 ip6-allrouters
> ff02::3 ip6-allhosts
>
>
>
> On Aug 27, 2008, at 12:23 PM, Rajeev Thakur wrote:
>
>> Did you run all the mpdcheck tests as described in Appendix A.2 of
>> the installation guide?
>>
>> Rajeev
>>
>>> -----Original Message-----
>>> From: owner-mpich-discuss at mcs.anl.gov
>>> [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Robert Kubrick
>>> Sent: Wednesday, August 27, 2008 10:10 AM
>>> To: mpich-discuss at mcs.anl.gov
>>> Subject: Re: [mpich-discuss] mpdboot fails
>>>
>>> I tried mpdcheck -s/-c and it works between the 2 machines both  
>>> ways:
>>>
>>> iMac:$ mpdcheck -s
>>> server listening at INADDR_ANY on: iMac.local 65357
>>> server has conn on <socket._socketobject object at 0x8c030> from
>>> ('192.168.2.2', 43331)
>>> server successfully recvd msg from client:  
>>> hello_from_client_to_server
>>>
>>> wlaptop:$ mpdcheck -c iMac 65357
>>> client successfully recvd ack from server: ack_from_server_to_client
>>>
>>> I run the mpd ring manually:
>>>
>>> On iMac:
>>>
>>> iMac:$ mpd&
>>> [1] 13755
>>> iMac:$ mpdtrace -l
>>> iMac.local_65501 (192.168.2.1)
>>>
>>> On wlaptop:
>>>
>>> wlaptop:$ mpd -h iMac -p 65501&
>>>
>>> Then I run a test on the local machine:
>>>
>>> iMac:$ mpiexec -n 2 /bin/hostname
>>> iMac.local
>>> wlaptop
>>>
>>> So far so good. Now I try to run 4 hostname processes, and this is
>>> when I start having problems:
>>>
>>> iMac:$ mpiexec -n 4 hostname
>>> iMac.local_mpdman_2: conn error in connect_lhs: Operation timed out
>>>
>>> The command hangs for a while, then it prints the timeout message.
>>> Let's take a look at the mpd processes:
>>>
>>> $ ps -A|fgrep mpd
>>> 13755  p1  S      0:00.28 python2.3 /opt/local/mpich2/bin/mpd
>>> 13805  p1  S      0:00.01 python2.3 /opt/local/mpich2/bin/mpd
>>> 13806  p1  S      0:00.01 python2.3 /opt/local/mpich2/bin/mpd
>>> iMac:$ mpdtrace -l
>>> iMac.local_65501 (192.168.2.1)
>>> wlaptop_53313 (127.0.1.1)
>>>
>>> I have to manually kill the extra mpd on the local machine.
>>>
>>> On Aug 26, 2008, at 7:11 PM, Rajeev Thakur wrote:
>>>
>>>> --mpd just specifies the path for mpd on remote hosts (the same path
>>>> for all remote hosts). If it still doesn't work, there may be some
>>>> problem with the networking configuration on the two machines. To
>>>> debug the problem, you can use the mpdcheck utility as described in
>>>> the installation guide (all steps).
>>>>
>>>> Rajeev
>>>>
>>>>
>>>>> -----Original Message-----
>>>>> From: owner-mpich-discuss at mcs.anl.gov
>>>>> [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Robert Kubrick
>>>>> Sent: Tuesday, August 26, 2008 6:05 PM
>>>>> To: mpich-discuss at mcs.anl.gov
>>>>> Subject: Re: [mpich-discuss] mpdboot fails
>>>>>
>>>>> I'm not clear on how --mpd works: I want to specify two
>>>>> different paths, one for the local machine and one for the
>>>>> remote host (or possibly a number of different remote hosts).
>>>>> Here it looks like I can only set one path for all the hosts:
>>>>>
>>>>> $ mpdboot --verbose --totalnum=2
>>>>> --mpd=/usr/local/mpich2/bin/mpd --debug
>>>>> debug: starting
>>>>> running mpdallexit on iMac.local
>>>>> LAUNCHED mpd on iMac.local  via
>>>>> debug: launch cmd= /usr/local/mpich2/bin/mpd   --ncpus=1 -e -d
>>>>> debug: mpd on iMac.local  on port no_port mpdboot_iMac.local
>>>>> (handle_mpd_output 406): from mpd on iMac.local, invalid port  
>>>>> info:
>>>>> no_port
>>>>>
>>>>>
>>>>> $ mpdboot --verbose --totalnum=2 --debug
>>>>> debug: starting
>>>>> running mpdallexit on iMac.local
>>>>> LAUNCHED mpd on iMac.local  via
>>>>> debug: launch cmd= /opt/local/mpich2/bin/mpd.py   --ncpus=1 -e -d
>>>>> debug: mpd on iMac.local  on port 62629
>>>>> RUNNING: mpd on iMac.local
>>>>> debug: info for running mpd: {'ncpus': 1, 'list_port': 62629,
>>>>> 'entry_port': '', 'host': 'iMac.local', 'entry_host': '',
>>>>> 'ifhn': ''}
>>>>> LAUNCHED mpd on wlaptop  via  iMac.local
>>>>> debug: launch cmd= ssh -x -n -q wlaptop
>>>>> '/opt/local/mpich2/bin/ mpd.py  -h iMac.local -p 62629
>>>>> --ncpus=1 -e -d'
>>>>> debug: mpd on wlaptop  on port no_port
>>>>> mpdboot_iMac.local (handle_mpd_output 406): from mpd on
>>>>> wlaptop, invalid port info:
>>>>> no_port
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Aug 25, 2008, at 2:20 PM, Rajeev Thakur wrote:
>>>>>
>>>>>>> The path to mpd on 'wlaptop' is actually a different one than on
>>>>>>> iMac.local. How can I specify a different remote path for mpd?
>>>>>>
>>>>>> With the --mpd option to mpdboot.
>>>>>>
>>>>>> Rajeev
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: owner-mpich-discuss at mcs.anl.gov
>>>>>>> [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Robert Kubrick
>>>>>>> Sent: Sunday, August 24, 2008 5:19 PM
>>>>>>> To: mpich-discuss at mcs.anl.gov
>>>>>>> Subject: [mpich-discuss] mpdboot fails
>>>>>>>
>>>>>>> mpdboot fails when I try to run with this command:
>>>>>>>
>>>>>>> # mpdboot --totalnum=2 --verbose -d
>>>>>>> debug: starting
>>>>>>> running mpdallexit on iMac.local
>>>>>>> LAUNCHED mpd on iMac.local  via
>>>>>>> debug: launch cmd= /opt/local/mpich2/bin/mpd.py   --ncpus=1 -e  
>>>>>>> -d
>>>>>>> debug: mpd on iMac.local  on port 61785
>>>>>>> RUNNING: mpd on iMac.local
>>>>>>> debug: info for running mpd: {'ncpus': 1, 'list_port': 61785,
>>>>>>> 'entry_port': '', 'host': 'iMac.local', 'entry_host': '',
>>>>>>> 'ifhn': ''}
>>>>>>> LAUNCHED mpd on wlaptop  via  iMac.local
>>>>>>> debug: launch cmd= ssh -x -n -q wlaptop '/opt/local/mpich2/bin/
>>>>>>> mpd.py  -h iMac.local -p 61785  --ncpus=1 -e -d'
>>>>>>> debug: mpd on wlaptop  on port no_port mpdboot_iMac.local
>>>>>>> (handle_mpd_output 406): from mpd on wlaptop, invalid port info:
>>>>>>> no_port
>>>>>>>
>>>>>>>
>>>>>>> The path to mpd on 'wlaptop' is actually a different one than on
>>>>>>> iMac.local. How can I specify a different remote path for mpd?
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>
>



