[mpich-discuss] mpdboot hangs, but ...

Rajeev Thakur thakur at mcs.anl.gov
Sun Dec 6 08:53:19 CST 2009


Are you running as root? Can you try running it as a regular user first?

Rajeev 

> -----Original Message-----
> From: mpich-discuss-bounces at mcs.anl.gov
> [mailto:mpich-discuss-bounces at mcs.anl.gov] On Behalf Of Benjamin Svetitsky
> Sent: Sunday, December 06, 2009 8:02 AM
> To: mpich-discuss at mcs.anl.gov
> Subject: Re: [mpich-discuss] mpdboot hangs, but ...
> 
> Hi Dave,
> 
> Thanks for the quick work.  But the problem is still there.  
> I downloaded the file and put it where you said; then I did:
>   configure
>   make
>   make install
> and verified that the new copy of mpd.py is in /usr/local/bin.  But
>   mpdboot -n 4 -f /root/mpd.hosts
> still doesn't exit after starting up the daemons.
> 
> 		-Ben
> 
> Dave Goodell wrote:
> > This has been fixed in the trunk.  Anyone who needs a fix in the short
> > term should be able to download the following copy of mpd.py and drop
> > it into src/pm/mpd/ in their MPICH2 source tree (and then re-install
> > MPICH2):
> > 
> > 
> > https://trac.mcs.anl.gov/projects/mpich2/export/5923/mpich2/trunk/src/pm/mpd/mpd.py
> > 
> > 
> > -Dave
> > 
> > On Dec 4, 2009, at 9:28 AM, Dave Goodell wrote:
> > 
> >> Hi Ben,
> >>
> >> This looks very similar to ticket #963: 
> >> https://trac.mcs.anl.gov/projects/mpich2/ticket/963
> >>
> >> Please feel free to add yourself to the CC list if you would like to
> >> receive progress updates.  Thanks for letting us know that you are
> >> having trouble.
> >>
> >> Thinking about it this morning, I just had an idea of what might be
> >> going on. I'll spend some time on it today and see if I can reproduce
> >> it and work up a fix.
> >>
> >> In the meantime, you can either use the hydra process manager (built
> >> by default as "mpiexec.hydra") or copy the mpd.py script from 1.1.1p1
> >> as a workaround.
> >>
> >> -Dave
> >>
> >> On Dec 4, 2009, at 5:38 AM, Benjamin Svetitsky wrote:
> >>
> >>> Hello everybody,
> >>>
> >>> I just upgraded from mpich2-1.0.8 to mpich2-1.2.1.  When I run
> >>>
> >>>     mpdboot -n 4 -f /root/mpd.hosts
> >>>
> >>> as root, the command just hangs until I hit ^C.  Nonetheless, it
> >>> starts the daemons successfully and I can run MPI jobs as usual (so
> >>> far).  A subsequent mpdallexit kills the daemons without complaint.
> >>>
> >>> Details:
> >>>
> >>> I am running four Intel quad-cores under CentOS:
> >>> Linux version 2.6.18-164.6.1.el5.centos.plus
> >>> The file /root/mpd.hosts contains:
> >>> --
> >>> nodeA
> >>> nodeB
> >>> nodeC
> >>> nodeD
> >>> --
> >>> and I executed mpdboot on nodeC.
> >>> I compiled the MPICH2 source without any config options.
> >>> After mpdboot hangs for several minutes and I hit ^C, it responds:
> >>> --
> >>> Traceback (most recent call last):
> >>> File "/usr/local/bin/mpdboot", line 476, in ?
> >>>   mpdboot()
> >>> File "/usr/local/bin/mpdboot", line 347, in mpdboot
> >>>   handle_mpd_output(fd,fd2idx,hostsAndInfo)
> >>> File "/usr/local/bin/mpdboot", line 385, in handle_mpd_output
> >>>   for line in fd.readlines():    # handle output from shells that echo stuff
> >>> KeyboardInterrupt
> >>> --
> >>> which may be irrelevant.
> >>>
> >>> Thanks,
> >>>     Ben
> >>> -- 
> >>> Prof. Benjamin Svetitsky         Phone:            +972-3-640 8870
> >>> School of Physics and Astronomy  Fax:              +972-3-640 7932
> >>> Tel Aviv University              E-mail:      bqs at julian.tau.ac.il
> >>> 69978 Tel Aviv, Israel           WWW: http://julian.tau.ac.il/~bqs
> >>> _______________________________________________
> >>> mpich-discuss mailing list
> >>> mpich-discuss at mcs.anl.gov
> >>> https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss
> >>
> > 
> 
> -- 
> Prof. Benjamin Svetitsky         Phone:            +972-3-640 8870
> School of Physics and Astronomy  Fax:              +972-3-640 7932
> Tel Aviv University              E-mail:      bqs at julian.tau.ac.il
> 69978 Tel Aviv, Israel           WWW: http://julian.tau.ac.il/~bqs
> 
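
A note on the traceback quoted above: it shows mpdboot interrupted inside
fd.readlines(). In Python, readlines() on a pipe returns only at end-of-file,
that is, once every process holding the write end has closed it. The sketch
below is not mpdboot code. It is a hypothetical stand-alone illustration (the
function name time_reads and the child script are invented here, and it uses
Python 3 rather than the Python 2 seen in the traceback) of a reader that gets
its first line promptly yet stays blocked in readlines() for as long as the
child lingers with its stdout open. Whether this is what happens in ticket
#963 is not established in this thread.

```python
import subprocess
import sys
import time


def time_reads(hold_seconds):
    """Spawn a child that prints one line, then lingers with stdout open.

    Returns (first_line, seconds_until_first_line, seconds_until_eof).
    """
    # Hypothetical child: announces itself, then keeps running (and so keeps
    # the write end of the pipe open) for hold_seconds.
    child_code = (
        "import sys, time\n"
        "print('ready')\n"
        "sys.stdout.flush()\n"
        f"time.sleep({hold_seconds})\n"
    )
    start = time.monotonic()
    proc = subprocess.Popen(
        [sys.executable, "-c", child_code],
        stdout=subprocess.PIPE,
        text=True,
    )
    # readline() returns as soon as one complete line is available.
    first = proc.stdout.readline().strip()
    t_first = time.monotonic() - start
    # readlines() keeps reading until EOF, i.e. until the child exits
    # and the pipe's write end is closed.
    proc.stdout.readlines()
    t_eof = time.monotonic() - start
    proc.wait()
    return first, t_first, t_eof


if __name__ == "__main__":
    line, t_first, t_eof = time_reads(2.0)
    print("first line %r after %.2fs, EOF after %.2fs" % (line, t_first, t_eof))
```

If the child were a daemon that never exits, the readlines() call would never
return on its own, which would look like a hang until ^C.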


