[mpich-discuss] mpdboot hangs, but ...

Benjamin Svetitsky bqs at julian.tau.ac.il
Sun Dec 6 09:19:40 CST 2009


If I run mpdboot as a regular user then it hangs WITHOUT starting up the 
daemons.    -Ben

Rajeev Thakur wrote:
> Are you running as root? Can you try running it as regular user first?
> 
> Rajeev 
> 
>> -----Original Message-----
>> From: mpich-discuss-bounces at mcs.anl.gov 
>> [mailto:mpich-discuss-bounces at mcs.anl.gov] On Behalf Of 
>> Benjamin Svetitsky
>> Sent: Sunday, December 06, 2009 8:02 AM
>> To: mpich-discuss at mcs.anl.gov
>> Subject: Re: [mpich-discuss] mpdboot hangs, but ...
>>
>> Hi Dave,
>>
>> Thanks for the quick work.  But the problem is still there.  
>> I downloaded the file and put it where you said; then I did:
>>   configure
>>   make
>>   make install
>> and verified that the new copy of mpd.py is in /usr/local/bin.  But
>>   mpdboot -n 4 -f /root/mpd.hosts
>> still doesn't exit after starting up the daemons.
>>
>> 		-Ben
>>
>> Dave Goodell wrote:
>>> This has been fixed in the trunk.  Anyone who needs a fix in the short
>>> term should be able to download the following copy of mpd.py and drop
>>> it into src/pm/mpd/ in their MPICH2 source tree (and then re-install
>>> MPICH2):
>>>
>>> https://trac.mcs.anl.gov/projects/mpich2/export/5923/mpich2/trunk/src/pm/mpd/mpd.py
>>>
>>>
>>> -Dave
>>>
>>> On Dec 4, 2009, at 9:28 AM, Dave Goodell wrote:
>>>
>>>> Hi Ben,
>>>>
>>>> This looks very similar to ticket #963: 
>>>> https://trac.mcs.anl.gov/projects/mpich2/ticket/963
>>>>
>>>> Please feel free to add yourself to the CC list if you would like to
>>>> receive progress updates.  Thanks for letting us know that you are
>>>> having trouble.
>>>>
>>>> Thinking about it this morning, I just had an idea of what might be
>>>> going on. I'll spend some time on it today and see if I can reproduce
>>>> it and work up a fix.
>>>>
>>>> In the meantime, you can either use the hydra process manager (built
>>>> by default as "mpiexec.hydra") or copy the mpd.py script from 1.1.1p1
>>>> as a workaround.
>>>>
>>>> -Dave
>>>>
>>>> On Dec 4, 2009, at 5:38 AM, Benjamin Svetitsky wrote:
>>>>
>>>>> Hello everybody,
>>>>>
>>>>> I just upgraded from mpich2-1.0.8 to mpich2-1.2.1.  When I run
>>>>>
>>>>>     mpdboot -n 4 -f /root/mpd.hosts
>>>>>
>>>>> as root, the command just hangs until I hit ^C.  Nonetheless, it
>>>>> starts the daemons successfully and I can run MPI jobs as usual (so
>>>>> far).  A subsequent mpdallexit kills the daemons without complaint.
>>>>>
>>>>> Details:
>>>>>
>>>>> I am running four Intel quad-cores under CentOS:
>>>>> Linux version 2.6.18-164.6.1.el5.centos.plus
>>>>> The file /root/mpd.hosts contains:
>>>>> --
>>>>> nodeA
>>>>> nodeB
>>>>> nodeC
>>>>> nodeD
>>>>> --
>>>>> and I executed mpdboot on nodeC.
>>>>> I compiled the MPICH2 source without any config options.
>>>>> After mpdboot hangs for several minutes and I hit ^C, it responds:
>>>>> --
>>>>> Traceback (most recent call last):
>>>>>   File "/usr/local/bin/mpdboot", line 476, in ?
>>>>>     mpdboot()
>>>>>   File "/usr/local/bin/mpdboot", line 347, in mpdboot
>>>>>     handle_mpd_output(fd,fd2idx,hostsAndInfo)
>>>>>   File "/usr/local/bin/mpdboot", line 385, in handle_mpd_output
>>>>>     for line in fd.readlines():    # handle output from shells that echo stuff
>>>>> KeyboardInterrupt
>>>>> --
>>>>> which may be irrelevant.
>>>>>
>>>>> Thanks,
>>>>>     Ben
>>>>> -- 
>>>>> Prof. Benjamin Svetitsky         Phone:            +972-3-640 8870
>>>>> School of Physics and Astronomy  Fax:              +972-3-640 7932
>>>>> Tel Aviv University              E-mail:      bqs at julian.tau.ac.il
>>>>> 69978 Tel Aviv, Israel           WWW: http://julian.tau.ac.il/~bqs
>>>>> _______________________________________________
>>>>> mpich-discuss mailing list
>>>>> mpich-discuss at mcs.anl.gov
>>>>> https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss

-- 
Prof. Benjamin Svetitsky         Phone:            +972-3-640 8870
School of Physics and Astronomy  Fax:              +972-3-640 7932
Tel Aviv University              E-mail:      bqs at julian.tau.ac.il
69978 Tel Aviv, Israel           WWW: http://julian.tau.ac.il/~bqs
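
Note on the traceback quoted above: the ^C lands inside fd.readlines(),
and readlines() on a pipe returns only at end-of-file, that is, once every
process holding the write end has closed it. Whether that is the actual
cause of ticket #963 is not established in this thread. The Python 3
sketch below is not taken from mpdboot or from the trunk fix; it only
illustrates one general way to read a line of child output with a timeout
instead of waiting for EOF. The helper name read_line_with_timeout, the
demo child process and its "ready" message are made up for illustration,
and select() on a pipe assumes a Unix-like system.

```python
import os
import select
import subprocess
import sys
import time


def read_line_with_timeout(fd, timeout):
    """Return one line read from file descriptor fd, without its newline.

    Waits at most `timeout` seconds in total. Returns None if the timeout
    expires or EOF is reached before a complete line has arrived.
    (Hypothetical helper, not part of MPICH2.)
    """
    deadline = time.monotonic() + timeout
    buf = b""
    while not buf.endswith(b"\n"):
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return None
        # select() returns as soon as data (or EOF) is available on fd,
        # or after `remaining` seconds with an empty list.
        ready, _, _ = select.select([fd], [], [], remaining)
        if not ready:
            return None
        # Read one byte at a time so we never consume past the newline.
        chunk = os.read(fd, 1)
        if not chunk:  # EOF before a full line
            return None
        buf += chunk
    return buf[:-1].decode()


def demo():
    """Start a child that prints one line, then keeps its stdout open."""
    child_code = "import time; print('ready', flush=True); time.sleep(30)"
    child = subprocess.Popen([sys.executable, "-c", child_code],
                             stdout=subprocess.PIPE)
    try:
        start = time.monotonic()
        # child.stdout.readlines() would block here until the child exits.
        line = read_line_with_timeout(child.stdout.fileno(), timeout=10.0)
        return line, time.monotonic() - start
    finally:
        child.kill()
        child.wait()
        child.stdout.close()
```

Calling demo() returns the child's first line together with the time
spent waiting for it, even though the child is still running and has not
closed its stdout; a plain readlines() on the same pipe would not return
until the child exited.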

