[mpich-discuss] mpdboot hangs, but ...
Rajeev Thakur
thakur at mcs.anl.gov
Sun Dec 6 08:53:19 CST 2009
Are you running as root? Can you try running it as a regular user first?
Rajeev
> -----Original Message-----
> From: mpich-discuss-bounces at mcs.anl.gov
> [mailto:mpich-discuss-bounces at mcs.anl.gov] On Behalf Of
> Benjamin Svetitsky
> Sent: Sunday, December 06, 2009 8:02 AM
> To: mpich-discuss at mcs.anl.gov
> Subject: Re: [mpich-discuss] mpdboot hangs, but ...
>
> Hi Dave,
>
> Thanks for the quick work. But the problem is still there.
> I downloaded the file and put it where you said; then I did:
> configure
> make
> make install
> and verified that the new copy of mpd.py is in /usr/local/bin. But
> mpdboot -n 4 -f /root/mpd.hosts
> still doesn't exit after starting up the daemons.
>
> -Ben
>
> Dave Goodell wrote:
> > This has been fixed in the trunk. Anyone who needs a fix in the short
> > term should be able to download the following copy of mpd.py and drop
> > it into src/pm/mpd/ in their MPICH2 source tree (and then re-install
> > MPICH2):
> >
> > https://trac.mcs.anl.gov/projects/mpich2/export/5923/mpich2/trunk/src/pm/mpd/mpd.py
> >
> >
> > -Dave
> >
> > On Dec 4, 2009, at 9:28 AM, Dave Goodell wrote:
> >
> >> Hi Ben,
> >>
> >> This looks very similar to ticket #963:
> >> https://trac.mcs.anl.gov/projects/mpich2/ticket/963
> >>
> >> Please feel free to add yourself to the CC list if you would like to
> >> receive progress updates. Thanks for letting us know that you are
> >> having trouble.
> >>
> >> Thinking about it this morning, I just had an idea of what might be
> >> going on. I'll spend some time on it today and see if I can reproduce
> >> it and work up a fix.
> >>
> >> In the meantime, you can either use the hydra process manager (built
> >> by default as "mpiexec.hydra") or copy the mpd.py script from 1.1.1p1
> >> as a workaround.
> >>
> >> -Dave
> >>
> >> On Dec 4, 2009, at 5:38 AM, Benjamin Svetitsky wrote:
> >>
> >>> Hello everybody,
> >>>
> >>> I just upgraded from mpich2-1.0.8 to mpich2-1.2.1. When I run
> >>>
> >>> mpdboot -n 4 -f /root/mpd.hosts
> >>>
> >>> as root, the command just hangs until I hit ^C. Nonetheless, it
> >>> starts the daemons successfully and I can run MPI jobs as usual (so
> >>> far). A subsequent mpdallexit kills the daemons without complaint.
> >>>
> >>> Details:
> >>>
> >>> I am running four Intel quad-cores under CentOS:
> >>> Linux version 2.6.18-164.6.1.el5.centos.plus
> >>> The file /root/mpd.hosts contains:
> >>> --
> >>> nodeA
> >>> nodeB
> >>> nodeC
> >>> nodeD
> >>> --
> >>> and I executed mpdboot on nodeC.
> >>> I compiled the MPICH2 source without any config options.
> >>> After mpdboot hangs for several minutes and I hit ^C, it responds:
> >>> --
> >>> Traceback (most recent call last):
> >>>   File "/usr/local/bin/mpdboot", line 476, in ?
> >>>     mpdboot()
> >>>   File "/usr/local/bin/mpdboot", line 347, in mpdboot
> >>>     handle_mpd_output(fd,fd2idx,hostsAndInfo)
> >>>   File "/usr/local/bin/mpdboot", line 385, in handle_mpd_output
> >>>     for line in fd.readlines():    # handle output from shells that echo stuff
> >>> KeyboardInterrupt
> >>> --
> >>> which may be irrelevant.
> >>>
> >>> Thanks,
> >>> Ben
> >>> --
> >>> Prof. Benjamin Svetitsky Phone: +972-3-640 8870
> >>> School of Physics and Astronomy Fax: +972-3-640 7932
> >>> Tel Aviv University E-mail: bqs at julian.tau.ac.il
> >>> 69978 Tel Aviv, Israel WWW: http://julian.tau.ac.il/~bqs
> >>> _______________________________________________
> >>> mpich-discuss mailing list
> >>> mpich-discuss at mcs.anl.gov
> >>> https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss
> >>
> >
>
> --
> Prof. Benjamin Svetitsky Phone: +972-3-640 8870
> School of Physics and Astronomy Fax: +972-3-640 7932
> Tel Aviv University E-mail: bqs at julian.tau.ac.il
> 69978 Tel Aviv, Israel WWW: http://julian.tau.ac.il/~bqs
>
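The traceback quoted above shows mpdboot sitting in fd.readlines() when ^C arrived. readlines() returns only at end-of-file, i.e. once every process holding the write end of the pipe has closed it. If a started daemon keeps that end open (an assumption here, not something confirmed in this thread), the call would wait indefinitely even though the daemons are up. The sketch below is not the change made in the trunk mpd.py; it only illustrates the blocking behaviour and one bounded alternative using select(). The helper name read_lines_with_timeout and the sample line "nodeC_12345" are made up for illustration, and select() on pipes is assumed to be available (Unix-like systems).

```python
import os
import select


def read_lines_with_timeout(fd, timeout):
    """Collect lines from a raw file descriptor.

    Stops at EOF, or as soon as `timeout` seconds pass with no new data,
    instead of waiting for every writer to close the pipe.
    """
    buf = b""
    while True:
        ready, _, _ = select.select([fd], [], [], timeout)
        if not ready:
            # Nothing arrived within the timeout: stop waiting.
            break
        chunk = os.read(fd, 4096)
        if not chunk:
            # EOF: all write ends of the pipe have been closed.
            break
        buf += chunk
    return buf.decode().splitlines()


# Simulate a child that prints one line and then keeps its end of the
# pipe open, as a long-running daemon might (hypothetical scenario).
r, w = os.pipe()
os.write(w, b"nodeC_12345\n")

# os.fdopen(r).readlines() would block here, because `w` is still open.
lines = read_lines_with_timeout(r, timeout=0.5)
print(lines)

os.close(w)
os.close(r)
```

With the write end still open, the bounded reader returns the one line that was written once the half-second timeout expires, whereas a plain readlines() on the same pipe would not return until the writer closed it.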