[mpich-discuss] Hydra unable to execute jobs that use more than one node(host) under PBS RMK
Mário Costa
mario.silva.costa at gmail.com
Tue Jan 26 10:02:34 CST 2010
Hi Pavan,
2010/1/26 Pavan Balaji <balaji at mcs.anl.gov>:
> Mario,
>
> This is good information. Yes, it shouldn't matter which process does
> the ssh, and yes it is possible that closing stdin is the culprit. Would
> you be willing to try out the trunk version of Hydra, which has a bunch
> of fixes in this area?
Sure! I will let you know how it goes...
>
> http://www.mcs.anl.gov/research/projects/mpich2/downloads/tarballs/nightly/hydra
>
> Note that the trunk has a few critical bugs that I'm working on right
> now, so these nightly tarballs are only meant for testing, and not for
> production use.
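>
> (A typical way to try one of those nightly tarballs, assuming the usual
> autoconf-style build of the standalone Hydra package; adjust the tarball
> name and install prefix to whatever you download:
>
> tar xzf hydra-*.tar.gz && cd hydra-*/
> ./configure --prefix=$HOME/hydra-trunk
> make && make install
>
> and then run the freshly installed mpiexec.hydra from that prefix.)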
>
> -- Pavan
>
> On 01/21/2010 09:17 PM, Mário Costa wrote:
>> Hi again,
>>
>> I found out that the problem only comes up with some specific ssh
>> versions (in my case OpenSSH_4.2p1, OpenSSL 0.9.8a 11 Oct 2005), and
>> that it depends on the order of the processes in the command.
>>
>> If I run
>>
>> 1. mpiexec.hydra -bootstrap fork -n 1 /bin/true : ssh gorgon002 hostname
>>
>> I get the problem: it hangs and reports to stderr
>>
>> bad fd
>> ssh_keysign: no reply
>> key_sign failed
>>
>> After some googling I found this
>> (http://l-sourcemotel.gsfc.nasa.gov/pipermail/test-proj-1-commits/2006-March/000350.html),
>> which looks like the problem I have.
>>
>> 2. mpiexec.hydra -bootstrap fork -n 1 ssh gorgon002 hostname : /bin/true
>>
>> Works fine!
>>
>> Shouldn't it behave the same, regardless of the order? Could you be
>> closing (or changing) the stdin of the second exec before its time?
>>
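>> (A minimal check outside Hydra, assuming bash and that sshd is listening
>> on gorgon001 so the /dev/tcp redirection hands ssh a TCP socket as its
>> stdin, similar to what lsof shows for case 1 below:
>>
>> # stdin closed
>> ssh gorgon002 hostname <&-
>> # stdin replaced by a TCP socket (bash-only /dev/tcp redirection)
>> ssh gorgon002 hostname < /dev/tcp/gorgon001.lnec.pt/22
>>
>> If either of these triggers the same "ssh_keysign: no reply" failure, it
>> would confirm that the state of stdin alone is what upsets ssh-keysign.)
>>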
>> I replaced hostname with sleep 5m so I could get the open files via
>> lsof; note the difference:
>>
>> 1. mpiexec.hydra -bootstrap fork -n 1 /bin/true : ssh gorgon002 sleep 5m
>>
>> $>ps auxf
>> mjscosta 27653 0.0 0.0 10100 2416 pts/1 Ss 01:00 0:00 |
>> \_ -bash
>> mjscosta 28825 0.0 0.0 6076 704 pts/1 S+ 02:59 0:00 |
>> \_ mpiexec.hydra -bootstrap fork -n 1 /bin/true : ssh gorgon002
>> sleep 5m
>> mjscosta 28826 0.0 0.0 6220 756 pts/1 S+ 02:59 0:00 |
>> \_ /usr/bin/pmi_proxy --launch-mode 1 --proxy-port
>> gorgon001 49063 --bootstrap fork --proxy-id 0
>> mjscosta 28827 0.0 0.0 0 0 pts/1 Z+ 02:59 0:00 |
>> \_ [true] <defunct>
>> mjscosta 28828 0.0 0.0 24064 2500 pts/1 S+ 02:59 0:00 |
>> \_ ssh gorgon002 sleep 5m
>> mjscosta 28829 0.0 0.0 0 0 pts/1 Z+ 02:59 0:00 |
>> \_ [ssh-keysign] <defunct>
>>
>> $>lsof -p 28828
>> COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
>> ...
>> ssh 28828 mjscosta 0u IPv4 1839838 TCP
>> gorgon001.lnec.pt:50989->gorgon002.lnec.pt:ssh (ESTABLISHED) <<
>> ssh 28828 mjscosta 1w FIFO 0,6 1839832 pipe
>> ssh 28828 mjscosta 2w FIFO 0,6 1839833 pipe
>> ssh 28828 mjscosta 3u IPv4 1839820 TCP *:58133 (LISTEN)
>> ssh 28828 mjscosta 4u IPv4 1839821 TCP *:49063 (LISTEN)
>> ssh 28828 mjscosta 5r FIFO 0,6 1839822 pipe
>> ssh 28828 mjscosta 6u IPv4 1839827 TCP
>> gorgon001.lnec.pt:51911->gorgon001.lnec.pt:49063 (ESTABLISHED)
>> ssh 28828 mjscosta 7u IPv4 1839838 TCP
>> gorgon001.lnec.pt:50989->gorgon002.lnec.pt:ssh (ESTABLISHED)
>> ssh 28828 mjscosta 8w FIFO 0,6 1839823 pipe
>> ssh 28828 mjscosta 9w FIFO 0,6 1839829 pipe
>> ssh 28828 mjscosta 10w FIFO 0,6 1839824 pipe
>> ssh 28828 mjscosta 11r FIFO 0,6 1839830 pipe
>> ssh 28828 mjscosta 12w FIFO 0,6 1839832 pipe
>> ssh 28828 mjscosta 13r FIFO 0,6 1839831 pipe
>> ssh 28828 mjscosta 14w FIFO 0,6 1839839 pipe
>> ssh 28828 mjscosta 15w FIFO 0,6 1839833 pipe
>> ssh 28828 mjscosta 16r FIFO 0,6 1839840 pipe
>> ssh 28828 mjscosta 17w FIFO 0,6 1839832 pipe
>> ssh 28828 mjscosta 18w FIFO 0,6 1839833 pipe
>>
>> 2. mpiexec.hydra -bootstrap fork -n 1 ssh gorgon002 sleep 5m : /bin/true
>>
>> $>ps auxf
>> mjscosta 27653 0.0 0.0 10100 2416 pts/1 Ss 01:00 0:00 |
>> \_ -bash
>> mjscosta 28870 0.0 0.0 6072 704 pts/1 S+ 03:03 0:00 |
>> \_ mpiexec.hydra -bootstrap fork -n 1 ssh gorgon002 sleep 5m :
>> /bin/true
>> mjscosta 28871 0.0 0.0 6216 756 pts/1 S+ 03:03 0:00 |
>> \_ /usr/bin/pmi_proxy --launch-mode 1 --proxy-port
>> gorgon001 44391 --bootstrap fork --proxy-id 0
>> mjscosta 28872 0.4 0.0 24064 2504 pts/1 S+ 03:03 0:00 |
>> \_ ssh gorgon002 sleep 5m
>> mjscosta 28873 0.0 0.0 0 0 pts/1 Z+ 03:03 0:00 |
>> \_ [true] <defunct>
>>
>> $>lsof -p 28872
>> COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
>> ...
>> ssh 28872 mjscosta 0r FIFO 0,6 1839988 pipe <<
>> ssh 28872 mjscosta 1w FIFO 0,6 1839989 pipe
>> ssh 28872 mjscosta 2w FIFO 0,6 1839990 pipe
>> ssh 28872 mjscosta 3u IPv4 1839979 TCP *:41804 (LISTEN)
>> ssh 28872 mjscosta 4u IPv4 1839980 TCP *:44391 (LISTEN)
>> ssh 28872 mjscosta 5r FIFO 0,6 1839981 pipe
>> ssh 28872 mjscosta 6u IPv4 1839986 TCP
>> gorgon001.lnec.pt:45713->gorgon001.lnec.pt:44391 (ESTABLISHED)
>> ssh 28872 mjscosta 7r FIFO 0,6 1839988 pipe
>> ssh 28872 mjscosta 8w FIFO 0,6 1839982 pipe
>> ssh 28872 mjscosta 9u IPv4 1839997 TCP
>> gorgon001.lnec.pt:58955->gorgon002.lnec.pt:ssh (ESTABLISHED)
>> ssh 28872 mjscosta 10w FIFO 0,6 1839983 pipe
>> ssh 28872 mjscosta 12w FIFO 0,6 1839989 pipe
>> ssh 28872 mjscosta 13w FIFO 0,6 1839989 pipe
>> ssh 28872 mjscosta 14w FIFO 0,6 1839990 pipe
>> ssh 28872 mjscosta 15w FIFO 0,6 1839990 pipe
>>
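>> Note the lines marked "<<" above: in case 1 ssh's fd 0 is the established
>> TCP socket, while in case 2 it is a pipe. A quick way to inspect fd 0
>> directly (assuming Linux /proc; the PID is the ssh process from the ps
>> output, e.g. 28828 in case 1):
>>
>> ls -l /proc/28828/fd/0
>>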
>> Anyway, it can be solved by updating to a more recent ssh version,
>> which is why you can't reproduce it. Nonetheless, there is something in
>> mpiexec.hydra that makes it work or fail depending on the order in
>> which the commands are given.
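>>
>> (For reference, the installed version can be checked with
>>
>> ssh -V
>>
>> which prints the OpenSSH and OpenSSL version strings to stderr.)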
>>
>> Let me know what you think about this...
>>
>> Thanks and Regards,
>>
>> 2010/1/17 Mário Costa <mario.silva.costa at gmail.com>:
>>> 2010/1/17 Pavan Balaji <balaji at mcs.anl.gov>:
>>>> On 01/16/2010 07:13 PM, Mário Costa wrote:
>>>>> I have one question: does mpiexec.hydra aggregate the outputs from
>>>>> all the launched MPI processes?
>>>> Yes.
>>>>
>>>>> I think it might hang waiting for the output of ssh, which for some
>>>>> reason doesn't come out. Could this be the case?
>>>> Yes, that's my guess too. This behavior is also possible if the MPI
>>>> processes hang. But an ssh problem seems more likely in this case. In
>>>> the previous email, when you tried a non-MPI program, did it hang as well?
>>> Yes, the same, in a deterministic way ...
>>>> % mpiexec.hydra -rmk pbs hostname
>>>>
>>>>> Here we use LDAP on the nodes of the cluster, and I've read something
>>>>> about ssh processes becoming defunct due to LDAP ...
>>>> Hmm.. This keeps getting more and more interesting :-).
>>>>
>>>> -- Pavan
>>>>
>>>> --
>>>> Pavan Balaji
>>>> http://www.mcs.anl.gov/~balaji
>>>>
>>>
>>>
>>> --
>>> Mário Costa
>>>
>>> Laboratório Nacional de Engenharia Civil
>>> LNEC.CTI.NTIEC
>>> Avenida do Brasil 101
>>> 1700-066 Lisboa, Portugal
>>> Tel : ++351 21 844 3911
>>>
>
> --
> Pavan Balaji
> http://www.mcs.anl.gov/~balaji
>