[mpich-discuss] mpdboot hangs, but ...
Dave Goodell
goodell at mcs.anl.gov
Fri Dec 4 14:37:30 CST 2009
This has been fixed in the trunk. Anyone who needs a fix in the short
term should be able to download the following copy of mpd.py and drop
it into src/pm/mpd/ in their MPICH2 source tree (and then re-install
MPICH2):
https://trac.mcs.anl.gov/projects/mpich2/export/5923/mpich2/trunk/src/pm/mpd/mpd.py
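
For anyone who prefers to script the drop-in step, here is a minimal sketch. It assumes the file above has already been downloaded to a local path. The function name install_patched_mpd and the ".orig" backup suffix are illustrative choices only and are not part of MPICH2.

```python
import shutil
from pathlib import Path


def install_patched_mpd(source_tree, patched_mpd):
    """Copy a downloaded mpd.py over src/pm/mpd/mpd.py in an MPICH2 source tree.

    Hypothetical helper: the name and the ".orig" backup are assumptions
    made for this sketch, not anything MPICH2 provides.
    """
    target = Path(source_tree) / "src" / "pm" / "mpd" / "mpd.py"

    # Refuse to guess if the directory layout is not what we expect.
    if not target.parent.is_dir():
        raise FileNotFoundError(
            f"{target.parent} not found; is this an MPICH2 source tree?"
        )

    # Keep the shipped script beside the new one so the change can be undone.
    if target.exists():
        shutil.copy2(target, target.parent / (target.name + ".orig"))

    shutil.copy2(patched_mpd, target)
    return target
```

The helper only replaces the file in the source tree; MPICH2 still has to be re-installed afterwards, as noted above.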
-Dave
On Dec 4, 2009, at 9:28 AM, Dave Goodell wrote:
> Hi Ben,
>
> This looks very similar to ticket #963: https://trac.mcs.anl.gov/projects/mpich2/ticket/963
>
> Please feel free to add yourself to the CC list if you would like to
> receive progress updates. Thanks for letting us know that you are
> having trouble.
>
> Thinking about it this morning, I just had an idea of what might be
> going on. I'll spend some time on it today and see if I can
> reproduce it and work up a fix.
>
> In the meantime, you can either use the hydra process manager
> (built by default as "mpiexec.hydra") or copy the mpd.py script from
> the 1.1.1p1 release as a workaround.
>
> -Dave
>
> On Dec 4, 2009, at 5:38 AM, Benjamin Svetitsky wrote:
>
>> Hello everybody,
>>
>> I just upgraded from mpich2-1.0.8 to mpich2-1.2.1. When I run
>>
>> mpdboot -n 4 -f /root/mpd.hosts
>>
>> as root, the command just hangs until I hit ^C. Nonetheless, it
>> starts the daemons successfully and I can run MPI jobs as usual (so
>> far). A subsequent mpdallexit kills the daemons without complaint.
>>
>> Details:
>>
>> I am running four Intel quad-cores under CentOS:
>> Linux version 2.6.18-164.6.1.el5.centos.plus
>> The file /root/mpd.hosts contains:
>> --
>> nodeA
>> nodeB
>> nodeC
>> nodeD
>> --
>> and I executed mpdboot on nodeC.
>> I compiled the MPICH2 source without any config options.
>> After mpdboot hangs for several minutes and I hit ^C, it responds:
>> --
>> Traceback (most recent call last):
>>   File "/usr/local/bin/mpdboot", line 476, in ?
>>     mpdboot()
>>   File "/usr/local/bin/mpdboot", line 347, in mpdboot
>>     handle_mpd_output(fd,fd2idx,hostsAndInfo)
>>   File "/usr/local/bin/mpdboot", line 385, in handle_mpd_output
>>     for line in fd.readlines(): # handle output from shells that echo stuff
>> KeyboardInterrupt
>> --
>> which may be irrelevant.
>>
>> Thanks,
>> Ben
>> --
>> Prof. Benjamin Svetitsky Phone: +972-3-640 8870
>> School of Physics and Astronomy Fax: +972-3-640 7932
>> Tel Aviv University E-mail: bqs at julian.tau.ac.il
>> 69978 Tel Aviv, Israel WWW: http://julian.tau.ac.il/~bqs
>> _______________________________________________
>> mpich-discuss mailing list
>> mpich-discuss at mcs.anl.gov
>> https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss