[mpich-discuss] Hydra unable to execute jobs that use more than one node(host) under PBS RMK

Reuti reuti at staff.uni-marburg.de
Sun Jan 17 07:46:30 CST 2010


On 17.01.2010 at 13:47, Mário Costa wrote:

> 2010/1/17 Pavan Balaji <balaji at mcs.anl.gov>:
>>
>> On 01/16/2010 07:13 PM, Mário Costa wrote:
>>> I have one question: does mpiexec.hydra aggregate the output from
>>> all launched MPI processes?
>>
>> Yes.
>>
>>> I think it might be hanging while waiting for ssh output that, for
>>> some reason, never arrives. Could this be the case?
>>
>> Yes, that's my guess too. This behavior is also possible if the MPI
>> processes hang. But an ssh problem seems more likely in this case. In
>> the previous email, when you tried a non-MPI program, did it hang as
>> well?

Does this mean that Hydra under PBS is not using the TM interface but
still needs ssh?

-- Reuti


> Yes, it hangs in the same way, deterministically ...
>>
>> % mpiexec.hydra -rmk pbs hostname
>>
>>> We use LDAP on the cluster nodes here; I've read something
>>> about ssh processes becoming defunct because of LDAP ...
>>
>> Hmm.. This keeps getting more and more interesting :-).
>>
>>  -- Pavan
>>
>> --
>> Pavan Balaji
>> http://www.mcs.anl.gov/~balaji
>>
>
>
>
> -- 
> Mário Costa
>
> Laboratório Nacional de Engenharia Civil
> LNEC.CTI.NTIEC
> Avenida do Brasil 101
> 1700-066 Lisboa, Portugal
> Tel : ++351 21 844 3911
> _______________________________________________
> mpich-discuss mailing list
> mpich-discuss at mcs.anl.gov
> https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss
>
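
A hedged sketch for testing the ssh hypothesis raised in the thread above: it probes each node listed in $PBS_NODEFILE over ssh with a hard timeout, so a hung ssh shows up as "timeout" instead of blocking forever. The helper names (unique_hosts, ssh_command, probe) and the timeout values are illustrative assumptions, not part of Hydra or PBS. It also assumes the nodes are reached with plain ssh, which this thread has not confirmed.

```python
import os
import subprocess
import sys


def unique_hosts(lines):
    """Return host names in first-seen order, dropping blanks and duplicates.

    A PBS node file typically repeats a host once per allocated slot.
    """
    seen = []
    for line in lines:
        host = line.strip()
        if host and host not in seen:
            seen.append(host)
    return seen


def ssh_command(host):
    # BatchMode=yes makes ssh fail instead of prompting for a password.
    # ConnectTimeout bounds only the connection phase, not the remote command.
    return ["ssh", "-o", "BatchMode=yes", "-o", "ConnectTimeout=5",
            host, "hostname"]


def probe(cmd, timeout=10.0):
    """Run cmd; return (status, output) with status ok, failed, or timeout."""
    try:
        done = subprocess.run(
            cmd,
            stdin=subprocess.DEVNULL,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
            timeout=timeout,  # the child is killed once this expires
        )
    except subprocess.TimeoutExpired:
        return "timeout", ""
    except OSError as exc:  # e.g. the ssh binary is not installed
        return "failed", str(exc)
    status = "ok" if done.returncode == 0 else "failed"
    return status, done.stdout.strip()


def main():
    nodefile = os.environ.get("PBS_NODEFILE")
    if not nodefile:
        print("PBS_NODEFILE is not set; run this inside a PBS job")
        return
    with open(nodefile) as fh:
        hosts = unique_hosts(fh)
    for host in hosts:
        status, output = probe(ssh_command(host))
        print(f"{host}: {status} {output}")


if __name__ == "__main__":
    main()
```

Run inside a PBS job, a node that reports "timeout" here while others report "ok" would point at ssh (or something it depends on, such as the LDAP lookups mentioned above) rather than at the MPI processes.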


