[mpich-discuss] Hydra unable to execute jobs that use more than one node(host) under PBS RMK

Mário Costa mario.silva.costa at gmail.com
Tue Jan 26 10:02:34 CST 2010


Hi Pavan,

2010/1/26 Pavan Balaji <balaji at mcs.anl.gov>:
> Mario,
>
> This is good information. Yes, it shouldn't matter which process does
> the ssh, and yes it is possible that closing stdin is the culprit. Would
> you be willing to try out the trunk version of Hydra, which has a bunch
> of fixes in this area?

Sure! I will let you know how it goes...
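
A side note on the lsof output quoted below: in the hanging case, fd 0 of
ssh is the TCP connection to gorgon002 rather than a pipe. That would be
consistent with ssh being started with fd 0 closed, because POSIX gives a
new descriptor the lowest free number. The following is only a minimal
sketch of that allocation rule in Python, not Hydra or OpenSSH code; the
function name is hypothetical, and it assumes a POSIX system.

```python
import os
import socket
import stat


def fd_of_first_socket_with_stdin_closed():
    """Close fd 0, open a TCP socket, and report which descriptor it received."""
    try:
        saved_stdin = os.dup(0)      # keep a copy so fd 0 can be restored
    except OSError:
        saved_stdin = None           # fd 0 was already closed on entry
    try:
        if saved_stdin is not None:
            os.close(0)              # simulate a child started without stdin
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            fd = sock.fileno()
            is_socket = stat.S_ISSOCK(os.fstat(fd).st_mode)
        finally:
            sock.close()             # never connected; only the fd number matters
        return fd, is_socket
    finally:
        if saved_stdin is not None:
            os.dup2(saved_stdin, 0)  # put the original stdin back on fd 0
            os.close(saved_stdin)


if __name__ == "__main__":
    fd, is_socket = fd_of_first_socket_with_stdin_closed()
    print(f"socket landed on fd {fd}, is_socket={is_socket}")
```

If the socket lands on fd 0 here, the same thing could plausibly happen to
the first socket ssh opens when it inherits a closed stdin, which may be
what confuses ssh-keysign in the older OpenSSH version.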

>
> http://www.mcs.anl.gov/research/projects/mpich2/downloads/tarballs/nightly/hydra
>
> Note that the trunk has a few critical bugs that I'm working on right
> now, so these nightly tarballs are only meant for testing, and not for
> production use.
>
>  -- Pavan
>
> On 01/21/2010 09:17 PM, Mário Costa wrote:
>> Hi again,
>>
>> I found out that the problem comes up only with some specific ssh versions
>> (in my case OpenSSH_4.2p1, OpenSSL 0.9.8a 11 Oct 2005), and that it depends
>> on the order of the processes in the command.
>>
>> If I test
>>
>> 1. mpiexec.hydra -bootstrap fork -n 1 /bin/true : ssh gorgon002 hostname
>> This reproduces the problem: it hangs and reports the following on stderr:
>>
>> bad fd
>> ssh_keysign: no reply
>> key_sign failed
>>
>> After some googling I found this
>> (http://l-sourcemotel.gsfc.nasa.gov/pipermail/test-proj-1-commits/2006-March/000350.html),
>> which looks like the problem I have.
>>
>> 2. mpiexec.hydra -bootstrap fork -n 1 ssh gorgon002 hostname : /bin/true
>>
>> Works fine!
>>
>> Shouldn't the behavior be the same regardless of the order? Could you be
>> closing (or changing) the stdin of the second executable too early?
>>
>> I replaced hostname with sleep 5m so that I could list the open files
>> with lsof. Compare the two cases:
>>
>> 1. mpiexec.hydra -bootstrap fork -n 1 /bin/true : ssh gorgon002 sleep 5m
>>
>> $>ps auxf
>> mjscosta 27653  0.0  0.0  10100  2416 pts/1    Ss   01:00   0:00  |   \_ -bash
>> mjscosta 28825  0.0  0.0   6076   704 pts/1    S+   02:59   0:00  |       \_ mpiexec.hydra -bootstrap fork -n 1 /bin/true : ssh gorgon002 sleep 5m
>> mjscosta 28826  0.0  0.0   6220   756 pts/1    S+   02:59   0:00  |           \_ /usr/bin/pmi_proxy --launch-mode 1 --proxy-port gorgon001 49063 --bootstrap fork --proxy-id 0
>> mjscosta 28827  0.0  0.0      0     0 pts/1    Z+   02:59   0:00  |               \_ [true] <defunct>
>> mjscosta 28828  0.0  0.0  24064  2500 pts/1    S+   02:59   0:00  |               \_ ssh gorgon002 sleep 5m
>> mjscosta 28829  0.0  0.0      0     0 pts/1    Z+   02:59   0:00  |                   \_ [ssh-keysign] <defunct>
>>
>> $>lsof 28827
>> COMMAND   PID     USER   FD   TYPE  DEVICE    SIZE      NODE NAME
>> ...
>> ssh     28828 mjscosta    0u  IPv4 1839838               TCP gorgon001.lnec.pt:50989->gorgon002.lnec.pt:ssh (ESTABLISHED) <<
>> ssh     28828 mjscosta    1w  FIFO     0,6           1839832 pipe
>> ssh     28828 mjscosta    2w  FIFO     0,6           1839833 pipe
>> ssh     28828 mjscosta    3u  IPv4 1839820               TCP *:58133 (LISTEN)
>> ssh     28828 mjscosta    4u  IPv4 1839821               TCP *:49063 (LISTEN)
>> ssh     28828 mjscosta    5r  FIFO     0,6           1839822 pipe
>> ssh     28828 mjscosta    6u  IPv4 1839827               TCP gorgon001.lnec.pt:51911->gorgon001.lnec.pt:49063 (ESTABLISHED)
>> ssh     28828 mjscosta    7u  IPv4 1839838               TCP gorgon001.lnec.pt:50989->gorgon002.lnec.pt:ssh (ESTABLISHED)
>> ssh     28828 mjscosta    8w  FIFO     0,6           1839823 pipe
>> ssh     28828 mjscosta    9w  FIFO     0,6           1839829 pipe
>> ssh     28828 mjscosta   10w  FIFO     0,6           1839824 pipe
>> ssh     28828 mjscosta   11r  FIFO     0,6           1839830 pipe
>> ssh     28828 mjscosta   12w  FIFO     0,6           1839832 pipe
>> ssh     28828 mjscosta   13r  FIFO     0,6           1839831 pipe
>> ssh     28828 mjscosta   14w  FIFO     0,6           1839839 pipe
>> ssh     28828 mjscosta   15w  FIFO     0,6           1839833 pipe
>> ssh     28828 mjscosta   16r  FIFO     0,6           1839840 pipe
>> ssh     28828 mjscosta   17w  FIFO     0,6           1839832 pipe
>> ssh     28828 mjscosta   18w  FIFO     0,6           1839833 pipe
>>
>> 2. mpiexec.hydra -bootstrap fork -n 1 ssh gorgon002 sleep 5m : /bin/true
>>
>> $>ps auxf
>> mjscosta 27653  0.0  0.0  10100  2416 pts/1    Ss   01:00   0:00  |   \_ -bash
>> mjscosta 28870  0.0  0.0   6072   704 pts/1    S+   03:03   0:00  |       \_ mpiexec.hydra -bootstrap fork -n 1 ssh gorgon002 sleep 5m : /bin/true
>> mjscosta 28871  0.0  0.0   6216   756 pts/1    S+   03:03   0:00  |           \_ /usr/bin/pmi_proxy --launch-mode 1 --proxy-port gorgon001 44391 --bootstrap fork --proxy-id 0
>> mjscosta 28872  0.4  0.0  24064  2504 pts/1    S+   03:03   0:00  |               \_ ssh gorgon002 sleep 5m
>> mjscosta 28873  0.0  0.0      0     0 pts/1    Z+   03:03   0:00  |               \_ [true] <defunct>
>>
>> $>lsof 28872
>> COMMAND   PID     USER   FD   TYPE  DEVICE    SIZE      NODE NAME
>> ...
>> ssh     28872 mjscosta    0r  FIFO     0,6           1839988 pipe <<
>> ssh     28872 mjscosta    1w  FIFO     0,6           1839989 pipe
>> ssh     28872 mjscosta    2w  FIFO     0,6           1839990 pipe
>> ssh     28872 mjscosta    3u  IPv4 1839979               TCP *:41804 (LISTEN)
>> ssh     28872 mjscosta    4u  IPv4 1839980               TCP *:44391 (LISTEN)
>> ssh     28872 mjscosta    5r  FIFO     0,6           1839981 pipe
>> ssh     28872 mjscosta    6u  IPv4 1839986               TCP gorgon001.lnec.pt:45713->gorgon001.lnec.pt:44391 (ESTABLISHED)
>> ssh     28872 mjscosta    7r  FIFO     0,6           1839988 pipe
>> ssh     28872 mjscosta    8w  FIFO     0,6           1839982 pipe
>> ssh     28872 mjscosta    9u  IPv4 1839997               TCP gorgon001.lnec.pt:58955->gorgon002.lnec.pt:ssh (ESTABLISHED)
>> ssh     28872 mjscosta   10w  FIFO     0,6           1839983 pipe
>> ssh     28872 mjscosta   12w  FIFO     0,6           1839989 pipe
>> ssh     28872 mjscosta   13w  FIFO     0,6           1839989 pipe
>> ssh     28872 mjscosta   14w  FIFO     0,6           1839990 pipe
>> ssh     28872 mjscosta   15w  FIFO     0,6           1839990 pipe
>>
>> Anyway, it can be solved by updating to a more recent ssh version, which
>> is why you can't reproduce it. Nonetheless, something in mpiexec.hydra
>> makes the outcome depend on the order in which the executables are given
>> on the command line.
>>
>> Let me know what you think about this...
>>
>> Thanks and Regards,
>>
>> 2010/1/17 Mário Costa <mario.silva.costa at gmail.com>:
>>> 2010/1/17 Pavan Balaji <balaji at mcs.anl.gov>:
>>>> On 01/16/2010 07:13 PM, Mário Costa wrote:
>>>>> I have one question: does mpiexec.hydra aggregate the output from all
>>>>> launched MPI processes?
>>>> Yes.
>>>>
>>>>> I think it might hang waiting for output from ssh that, for some
>>>>> reason, never arrives. Could this be the case?
>>>> Yes, that's my guess too. This behavior is also possible if the MPI
>>>> processes hang. But an ssh problem seems more likely in this case. In
>>>> the previous email, when you tried a non-MPI program, did it hang as well?
>>> Yes, it hangs in the same way, deterministically ...
>>>> % mpiexec.hydra -rmk pbs hostname
>>>>
>>>>> We use LDAP on the nodes of the cluster, and I've read something
>>>>> about ssh processes becoming defunct because of LDAP ...
>>>> Hmm.. This keeps getting more and more interesting :-).
>>>>
>>>>  -- Pavan
>>>>
>>>> --
>>>> Pavan Balaji
>>>> http://www.mcs.anl.gov/~balaji
>>>>
>>>
>>>
>>> --
>>> Mário Costa
>>>
>>> Laboratório Nacional de Engenharia Civil
>>> LNEC.CTI.NTIEC
>>> Avenida do Brasil 101
>>> 1700-066 Lisboa, Portugal
>>> Tel : ++351 21 844 3911
>>>
>
> --
> Pavan Balaji
> http://www.mcs.anl.gov/~balaji
>

