[mpich-discuss] mpdboot hangs, but ...

Dave Goodell goodell at mcs.anl.gov
Fri Dec 4 14:37:30 CST 2009


This has been fixed in the trunk.  Anyone who needs a fix in the short  
term should be able to download the following copy of mpd.py and drop  
it into src/pm/mpd/ in their MPICH2 source tree (and then re-install  
MPICH2):

https://trac.mcs.anl.gov/projects/mpich2/export/5923/mpich2/trunk/src/pm/mpd/mpd.py
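
For anyone who wants to script the drop-in step, here is a rough sketch of one way to do it, not an official procedure. The helper names fetch_mpd_py and drop_in_mpd_py and the mpd.py.orig backup name are made up for illustration; only the URL and the src/pm/mpd/ location come from this message.

```python
import shutil
import urllib.request
from pathlib import Path

# URL taken from this message (trunk revision 5923 of mpd.py).
MPD_PY_URL = ("https://trac.mcs.anl.gov/projects/mpich2/export/5923/"
              "mpich2/trunk/src/pm/mpd/mpd.py")


def fetch_mpd_py(dest, url=MPD_PY_URL):
    """Download the patched mpd.py to `dest` and return its path."""
    dest = Path(dest)
    urllib.request.urlretrieve(url, str(dest))
    return dest


def drop_in_mpd_py(patched, source_tree):
    """Replace src/pm/mpd/mpd.py in `source_tree` with `patched`.

    The original file is kept next to it as mpd.py.orig (a hypothetical
    backup name, not part of MPICH2) so the change can be undone.
    """
    target = Path(source_tree) / "src" / "pm" / "mpd" / "mpd.py"
    if not target.is_file():
        raise FileNotFoundError("no mpd.py at %s" % target)
    backup = target.with_name("mpd.py.orig")
    if not backup.exists():
        # Back up only once so a second run does not overwrite the original.
        shutil.copy2(str(target), str(backup))
    shutil.copy2(str(patched), str(target))
    return target
```

Calling drop_in_mpd_py(fetch_mpd_py("mpd.py"), "/path/to/mpich2-1.2.1") would swap the file in place (the source-tree path here is a placeholder). MPICH2 then still has to be re-installed from that tree, typically with "make install" if the tree has already been configured and built.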

-Dave

On Dec 4, 2009, at 9:28 AM, Dave Goodell wrote:

> Hi Ben,
>
> This looks very similar to ticket #963: https://trac.mcs.anl.gov/projects/mpich2/ticket/963
>
> Please feel free to add yourself to the CC list if you would like to  
> receive progress updates.  Thanks for letting us know that you are  
> having trouble.
>
> Thinking about it this morning, I just had an idea of what might be  
> going on. I'll spend some time on it today and see if I can  
> reproduce it and work up a fix.
>
> In the meantime, you can either use the hydra process manager  
> (built by default as "mpiexec.hydra") or copy the mpd.py script from  
> 1.1.1p1 as a workaround.
>
> -Dave
>
> On Dec 4, 2009, at 5:38 AM, Benjamin Svetitsky wrote:
>
>> Hello everybody,
>>
>> I just upgraded from mpich2-1.0.8 to mpich2-1.2.1.  When I run
>>
>> 	mpdboot -n 4 -f /root/mpd.hosts
>>
>> as root, the command just hangs until I hit ^C.  Nonetheless, it  
>> starts the daemons successfully and I can run MPI jobs as usual (so  
>> far).  A subsequent mpdallexit kills the daemons without complaint.
>>
>> Details:
>>
>> I am running four Intel quad-cores under CentOS:
>> Linux version 2.6.18-164.6.1.el5.centos.plus
>> The file /root/mpd.hosts contains:
>> --
>> nodeA
>> nodeB
>> nodeC
>> nodeD
>> --
>> and I executed mpdboot on nodeC.
>> I compiled the MPICH2 source without any config options.
>> After mpdboot hangs for several minutes and I hit ^C, it responds:
>> --
>> Traceback (most recent call last):
>>   File "/usr/local/bin/mpdboot", line 476, in ?
>>     mpdboot()
>>   File "/usr/local/bin/mpdboot", line 347, in mpdboot
>>     handle_mpd_output(fd,fd2idx,hostsAndInfo)
>>   File "/usr/local/bin/mpdboot", line 385, in handle_mpd_output
>>     for line in fd.readlines():    # handle output from shells that echo stuff
>> KeyboardInterrupt
>> --
>> which may be irrelevant.
>>
>> Thanks,
>> 	Ben
>> -- 
>> Prof. Benjamin Svetitsky         Phone:            +972-3-640 8870
>> School of Physics and Astronomy  Fax:              +972-3-640 7932
>> Tel Aviv University              E-mail:      bqs at julian.tau.ac.il
>> 69978 Tel Aviv, Israel           WWW: http://julian.tau.ac.il/~bqs
>> _______________________________________________
>> mpich-discuss mailing list
>> mpich-discuss at mcs.anl.gov
>> https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss
>


