[mpich-discuss] Fwd: SGE & Hydra Problem

Pavan Balaji balaji at mcs.anl.gov
Tue Sep 14 13:02:12 CDT 2010


Thanks Sayantan. I'm cc'ing mpich-discuss for everyone's information.

  -- Pavan

On 09/14/2010 08:14 AM, Sayantan Sur wrote:
> Hi Pavan,
>
> Slight correction. MVAPICH-1.2 will support it, as Adam has
> contributed patches for it, but it doesn't support it yet. We are yet
> to incorporate all the patches and make the final release.
>
> Thanks.
>
>
> ---------- Forwarded message ----------
> From: Pavan Balaji <balaji at mcs.anl.gov>
> Date: Tue, Sep 14, 2010 at 4:46 AM
> Subject: Re: [mpich-discuss] SGE & Hydra Problem
> To: mpich-discuss at mcs.anl.gov
>
>
>
> On 09/14/2010 02:37 AM, Ursula Winkler wrote:
>>>>
>>>>      error: getting configuration: unable to contact qmaster using port 536 on host "b00"
>>>>      error:
>>>>      Cannot get configuration from qmaster.
>>>>
>>>
>>> This looks more like a network problem, unrelated to SGE or
>>> MPICH2. Do you have a firewall on the machines? Do other applications
>>> run across the nodes? AFAICS below, SGE is using rsh, and not the
>>> default -builtin- of the newer versions of SGE (with which there would
>>> be no rsh/rshd any longer). Nevertheless, your setup should work.
>>>
>> Within the cluster there is no firewall. There are no other applications
>> running across the nodes. The setup works for MPICH1 and for MPICH2 with
>> smpd; the problems occur only with Hydra. I also cannot see any network
>> problems. More mysterious still, Hydra works fine on another cluster
>> (with the same OS and SGE).
>> Hmm.
>
> Can you run this by passing the -verbose option to mpiexec? It'll give
> some more output to help us debug it.
>
>>> In principle this looks nice, as all the processes are bound to the
>>> sge_execd. This is what I tried to achieve in:
>>>
>>> http://lists.mcs.anl.gov/pipermail/mpich-discuss/2010-August/007678.html
>>>
>> yes, the tight integration works fine! (I'd be happy if it were the same
>> with mvapich.)
>
> Hydra will work out-of-the-box with MVAPICH2 (or any other derivative
> of MPICH2). I believe the latest version of MVAPICH-1 also supports
> the PMI interface, and hence Hydra and all other MPICH2 process
> managers.
>
>   -- Pavan
>
> --
> Pavan Balaji
> http://www.mcs.anl.gov/~balaji
> _______________________________________________
> mpich-discuss mailing list
> mpich-discuss at mcs.anl.gov
> https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss
>
>
>
>
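For anyone hitting the same "unable to contact qmaster using port 536 on host "b00"" error quoted above, one quick way to separate a plain network problem from an SGE or Hydra problem is to try a bare TCP connection from the compute node to the qmaster port. The following is only a minimal sketch: the helper name can_connect and the 3-second timeout are my own assumptions, and the host and port are simply the ones from the error message. A successful connect shows only that the port is reachable, not that qmaster answers correctly.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to (host, port) can be opened.

    Hypothetical helper for illustration; it is not part of SGE or MPICH2.
    Name resolution failures, refused connections, and timeouts all raise
    OSError subclasses, so they are all reported as False.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Run on the node where the job fails, a call such as can_connect("b00", 536) returning False would point to name resolution, routing, or filtering between the nodes rather than to Hydra itself.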

-- 
Pavan Balaji
http://www.mcs.anl.gov/~balaji

