[mpich-discuss] mpdboot fails

Rajeev Thakur thakur at mcs.anl.gov
Wed Aug 27 11:23:16 CDT 2008


Did you run all the mpdcheck tests as described in Appendix A.2 of the
installation guide?

Rajeev 

> -----Original Message-----
> From: owner-mpich-discuss at mcs.anl.gov 
> [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Robert Kubrick
> Sent: Wednesday, August 27, 2008 10:10 AM
> To: mpich-discuss at mcs.anl.gov
> Subject: Re: [mpich-discuss] mpdboot fails
> 
> I tried mpdcheck -s/-c and it works between the 2 machines both ways:
> 
> iMac:$ mpdcheck -s
> server listening at INADDR_ANY on: iMac.local 65357
> server has conn on <socket._socketobject object at 0x8c030> from ('192.168.2.2', 43331)
> server successfully recvd msg from client: hello_from_client_to_server
> 
> wlaptop:$ mpdcheck -c iMac 65357
> client successfully recvd ack from server: ack_from_server_to_client
> 
> I run the mpd ring manually:
> 
> On iMac:
> 
> iMac:$ mpd&
> [1] 13755
> iMac:$ mpdtrace -l
> iMac.local_65501 (192.168.2.1)
> 
> On wlaptop:
> 
> wlaptop:$ mpd -h iMac -p 65501&
> 
> Then I run a test on the local machine:
> 
> iMac:$ mpiexec -n 2 /bin/hostname
> iMac.local
> wlaptop
> 
> So far so good. Now I try to run 4 hostname processes; this is when I
> start having problems:
> 
> iMac:$ mpiexec -n 4 hostname
> iMac.local_mpdman_2: conn error in connect_lhs: Operation timed out
> 
> The command hangs for a while, then it prints the timeout message.
> Let's take a look at the mpd processes:
> 
> $ ps -A|fgrep mpd
> 13755  p1  S      0:00.28 python2.3 /opt/local/mpich2/bin/mpd
> 13805  p1  S      0:00.01 python2.3 /opt/local/mpich2/bin/mpd
> 13806  p1  S      0:00.01 python2.3 /opt/local/mpich2/bin/mpd
> iMac:$ mpdtrace -l
> iMac.local_65501 (192.168.2.1)
> wlaptop_53313 (127.0.1.1)
> 
> I have to manually kill the extra mpd on the local machine.
> 
> On Aug 26, 2008, at 7:11 PM, Rajeev Thakur wrote:
> 
> > --mpd just specifies the path for mpd on remote hosts (the same path for
> > all remote hosts). If it still doesn't work, there may be some problem
> > with the networking configuration on the two machines. To debug the
> > problem, you can use the mpdcheck utility as described in the
> > installation guide (all steps).
> >
> > Rajeev
> >
> >
> >> -----Original Message-----
> >> From: owner-mpich-discuss at mcs.anl.gov
> >> [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Robert Kubrick
> >> Sent: Tuesday, August 26, 2008 6:05 PM
> >> To: mpich-discuss at mcs.anl.gov
> >> Subject: Re: [mpich-discuss] mpdboot fails
> >>
> >> I'm not clear on how --mpd works: I want to specify two
> >> different paths, one for the local machine and one for the
> >> remote host (or possibly a number of different remote hosts).
> >> Here it looks like I can only set one path for all the hosts:
> >>
> >> $ mpdboot --verbose --totalnum=2 --mpd=/usr/local/mpich2/bin/mpd --debug
> >> debug: starting
> >> running mpdallexit on iMac.local
> >> LAUNCHED mpd on iMac.local  via
> >> debug: launch cmd= /usr/local/mpich2/bin/mpd   --ncpus=1 -e -d
> >> debug: mpd on iMac.local  on port no_port
> >> mpdboot_iMac.local (handle_mpd_output 406): from mpd on iMac.local, invalid port info:
> >> no_port
> >>
> >>
> >> $ mpdboot --verbose --totalnum=2 --debug
> >> debug: starting
> >> running mpdallexit on iMac.local
> >> LAUNCHED mpd on iMac.local  via
> >> debug: launch cmd= /opt/local/mpich2/bin/mpd.py   --ncpus=1 -e -d
> >> debug: mpd on iMac.local  on port 62629
> >> RUNNING: mpd on iMac.local
> >> debug: info for running mpd: {'ncpus': 1, 'list_port': 62629, 'entry_port': '', 'host': 'iMac.local', 'entry_host': '', 'ifhn': ''}
> >> LAUNCHED mpd on wlaptop  via  iMac.local
> >> debug: launch cmd= ssh -x -n -q wlaptop '/opt/local/mpich2/bin/mpd.py  -h iMac.local -p 62629  --ncpus=1 -e -d'
> >> debug: mpd on wlaptop  on port no_port
> >> mpdboot_iMac.local (handle_mpd_output 406): from mpd on wlaptop, invalid port info:
> >> no_port
> >>
> >>
> >>
> >>
> >> On Aug 25, 2008, at 2:20 PM, Rajeev Thakur wrote:
> >>
> >>>> The path to mpd on 'wlaptop' is actually a different one than on
> >>>> iMac.local. How can I specify a different remote path for mpd?
> >>>
> >>> With the --mpd option to mpdboot.
> >>>
> >>> Rajeev
> >>>
> >>>> -----Original Message-----
> >>>> From: owner-mpich-discuss at mcs.anl.gov
> >>>> [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Robert Kubrick
> >>>> Sent: Sunday, August 24, 2008 5:19 PM
> >>>> To: mpich-discuss at mcs.anl.gov
> >>>> Subject: [mpich-discuss] mpdboot fails
> >>>>
> >>>> mpdboot fails when I try to run with this command:
> >>>>
> >>>> # mpdboot --totalnum=2 --verbose -d
> >>>> debug: starting
> >>>> running mpdallexit on iMac.local
> >>>> LAUNCHED mpd on iMac.local  via
> >>>> debug: launch cmd= /opt/local/mpich2/bin/mpd.py   --ncpus=1 -e -d
> >>>> debug: mpd on iMac.local  on port 61785
> >>>> RUNNING: mpd on iMac.local
> >>>> debug: info for running mpd: {'ncpus': 1, 'list_port': 61785, 'entry_port': '', 'host': 'iMac.local', 'entry_host': '', 'ifhn': ''}
> >>>> LAUNCHED mpd on wlaptop  via  iMac.local
> >>>> debug: launch cmd= ssh -x -n -q wlaptop '/opt/local/mpich2/bin/mpd.py  -h iMac.local -p 61785  --ncpus=1 -e -d'
> >>>> debug: mpd on wlaptop  on port no_port
> >>>> mpdboot_iMac.local (handle_mpd_output 406): from mpd on wlaptop, invalid port info:
> >>>> no_port
> >>>>
> >>>>
> >>>> The path to mpd on 'wlaptop' is actually a different one than on
> >>>> iMac.local. How can I specify a different remote path for mpd?
> >>>>
> >>>>
> >>>
> >>
> >>
> >
> 
> 
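
In the mpdtrace -l output quoted above, wlaptop is listed with 127.0.1.1 while iMac.local is listed with 192.168.2.1. One possible reading (an assumption, not something confirmed in this thread) is that wlaptop's own hostname resolves to a loopback address, which other hosts in the ring could not connect to. The following is a minimal Python 3 sketch of a check that could be run on each machine alongside the mpdcheck tests. It is not part of MPICH2, and the function names are made up for illustration.

```python
import ipaddress
import socket


def loopback_addresses(addresses):
    """Return the addresses that lie in a loopback range (127.0.0.0/8 or ::1)."""
    return [a for a in addresses if ipaddress.ip_address(a).is_loopback]


def resolve_ipv4(hostname):
    """Resolve hostname to its IPv4 addresses; return [] if resolution fails."""
    try:
        _name, _aliases, addrs = socket.gethostbyname_ex(hostname)
    except OSError:  # covers socket.gaierror and socket.herror
        return []
    return addrs


def report(hostname=None):
    """Describe how a hostname resolves (hypothetical helper, not an mpd tool)."""
    hostname = hostname or socket.gethostname()
    addrs = resolve_ipv4(hostname)
    if not addrs:
        return f"{hostname}: does not resolve to any IPv4 address"
    bad = loopback_addresses(addrs)
    if bad:
        # Assumption: a loopback address here means remote hosts cannot reach us
        # by the address we advertise.
        return f"{hostname}: resolves to loopback {bad}; check /etc/hosts"
    return f"{hostname}: resolves to {addrs}"


if __name__ == "__main__":
    print(report())
```

Running the script on both machines and comparing the reported addresses with the ones shown by mpdtrace -l may help narrow down whether name resolution is involved.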



