[mpich-discuss] problems with mpdboot

Pavan Balaji balaji at mcs.anl.gov
Tue Apr 7 10:11:36 CDT 2009


Can you try mpdcheck to make sure there are no network infrastructure 
issues (e.g., firewalls or errors in /etc/hosts)?
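
As a rough complement to mpdcheck (not a replacement for it), the sketch
below looks for one common /etc/hosts problem: a host name that does not
resolve, or that resolves to a loopback address. The function names are
made up for illustration, and the host file format handled here (one host
per line, an optional ":ncpus" suffix, "#" comments) is an assumption.

```python
import ipaddress
import socket


def parse_host_file(text):
    """Return the host names found in mpd.hosts-style text.

    Assumed format: one host per line, optional ':ncpus' suffix,
    '#' starts a comment.
    """
    hosts = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        # Keep only the name: drop extra fields and any ':ncpus' suffix.
        hosts.append(line.split()[0].split(":", 1)[0])
    return hosts


def is_loopback(address):
    """True if the address is in a loopback range such as 127.0.0.0/8."""
    return ipaddress.ip_address(address).is_loopback


def check_hosts(hosts, resolve=socket.gethostbyname):
    """Map each host to a problem description, or None if none was found."""
    problems = {}
    for host in hosts:
        try:
            address = resolve(host)
        except OSError as exc:  # socket.gaierror is a subclass of OSError
            problems[host] = f"does not resolve: {exc}"
            continue
        if is_loopback(address):
            problems[host] = "resolves to loopback " + address
        else:
            problems[host] = None
    return problems
```

Running check_hosts(parse_host_file(open("mpd.hosts").read())) on each
machine, not just the master, shows how every node sees the others. It
only covers name resolution; it says nothing about firewalls.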

Another quick check is to make sure each host can ssh to every other host 
using the name given in the host file. For example, try:

  $ ssh c4labpc12.csee.usf.edu -t "ssh c4labpc19.csee.usf.edu hostname"
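
With more than two hosts, the same check can be repeated for every ordered
pair. Here is a minimal sketch that builds the command above for each pair
and runs it; the function names and the 30-second timeout are arbitrary
choices for illustration, not part of MPICH2.

```python
import itertools
import subprocess


def pairwise_ssh_commands(hosts):
    """Build one nested ssh command per ordered (source, target) pair."""
    return [
        ["ssh", source, "-t", f"ssh {target} hostname"]
        for source, target in itertools.permutations(hosts, 2)
    ]


def run_checks(hosts, timeout=30):
    """Run every command and return the ones that failed or timed out."""
    failed = []
    for command in pairwise_ssh_commands(hosts):
        try:
            result = subprocess.run(
                command, capture_output=True, text=True, timeout=timeout
            )
            ok = result.returncode == 0
        except (subprocess.TimeoutExpired, OSError):
            # A hang (for example, a password prompt) or a missing ssh
            # binary counts as a failure.
            ok = False
        if not ok:
            failed.append(command)
    return failed
```

Any pair that ends up in the returned list is worth trying by hand.
mpdboot launches the remote mpd with "ssh -x -n -q" (see the debug output
quoted below), so a login that stops to ask for a password or a host-key
confirmation is likely to fail there as well.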

  -- Pavan

bjday wrote:
> Pavan,
> 
> Yes, the names returned by "hostname" and the names in mpd.hosts are the 
> fully qualified names.
> 
> Thank you,
> Brian
> 
> 
> Pavan Balaji wrote:
>>
>> Check if your host file contains the same name as what is returned by 
>> the "hostname" command (e.g., "foo" is different from 
>> "foo.domain.edu"). Otherwise, mpd can't find the local hostname in 
>> your host file.
>>
>>  -- Pavan
>>
>> bjday wrote:
>>> Hello MPICH2 Gurus
>>>
>>> I am installing MPICH2 on some lab computers at the request of a 
>>> professor.  I have run into a problem during testing.  When I run 
>>> mpdboot, I receive this error:
>>>
>>> mpdboot -n 2 -f mpd.hosts -v -d
>>> debug: starting
>>> running mpdallexit on c4labpc19.csee.usf.edu
>>> LAUNCHED mpd on c4labpc19.csee.usf.edu  via
>>> debug: launch cmd= /usr/local/mpich2/bin/mpd.py   --ncpus=1 -e -d
>>> debug: mpd on c4labpc19.csee.usf.edu  on port 37116
>>> RUNNING: mpd on c4labpc19.csee.usf.edu
>>> debug: info for running mpd: {'ncpus': 1, 'list_port': 37116, 
>>> 'entry_port': '', 'host': 'c4labpc19.csee.usf.edu', 'entry_host': '', 
>>> 'ifhn': ''}
>>> LAUNCHED mpd on c4labpc12.csee.usf.edu  via  c4labpc19.csee.usf.edu
>>> debug: launch cmd= ssh -x -n -q c4labpc12.csee.usf.edu 
>>> '/usr/local/mpich2/bin/mpd.py  -h c4labpc19.csee.usf.edu -p 37116  
>>> --ncpus=1 -e -d'
>>> debug: mpd on c4labpc12.csee.usf.edu  on port no_port
>>> mpdboot_c4labpc19.csee.usf.edu (handle_mpd_output 406): from mpd on 
>>> c4labpc12.csee.usf.edu, invalid port info:
>>> no_port
>>>
>>> I have seen this in the forums, but no resolution was posted.  I 
>>> have gone through the troubleshooting steps in the install guide and 
>>> can complete them up to step 7, where mpdboot is used.  I can start 
>>> mpd on the master, get the port, then connect the slave computers by 
>>> specifying the master name and port number.  Any ideas why pc12 is 
>>> reporting no port?
>>>
>>> Thank you,
>>> Brian
>>
> 

-- 
Pavan Balaji
http://www.mcs.anl.gov/~balaji

