[mpich-discuss] SGE & Hydra Problem
    Pavan Balaji 
    balaji at mcs.anl.gov
       
    Wed Sep 22 06:24:08 CDT 2010
    
    
  
----- "Ursula Winkler" <ursula.winkler at uni-graz.at> wrote:
> No, when mpiexec is placed within the SGE job script, it works fine on
> the second cluster. I meant just the command
> "qrsh -inherit -V ... hydra_pmi_proxy ..." placed within the SGE script
> that results in the mentioned error message (on both clusters).
Ok, just to confirm, if nodes X and Y are both in the $TMPDIR/machines file, you are running the qrsh command from node X to node Y, correct?
I'm surprised that this is not working on the second cluster, as this is exactly what Hydra does internally.
Can you run mpiexec (from within an SGE script) on both clusters with the -verbose option and send me the output?
% mpiexec -verbose /bin/hostname
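
For reference, the node X to node Y qrsh check asked about above could be scripted roughly as follows. This is only a sketch: pick_remote_host is a hypothetical helper (not part of Hydra or SGE), and it assumes $TMPDIR/machines lists one hostname per line, written the same way that `hostname` prints it on the local node.

```shell
#!/bin/sh
# Hypothetical helper: print the first host in a machines file that is
# not the local host. Assumes one hostname per line (repeats allowed).
pick_remote_host() {
    machines_file=$1
    local_host=$2
    # Print the first entry that differs from the local host, then stop.
    awk -v me="$local_host" '$1 != me { print $1; exit }' "$machines_file"
}

# Intended use inside an SGE job script (left commented out so that the
# helper can be sourced and tried without a running SGE job):
#   remote=$(pick_remote_host "$TMPDIR/machines" "$(hostname)")
#   qrsh -inherit -V "$remote" /bin/hostname
```

If the machines file uses fully qualified names while `hostname` prints short names (or the reverse), the comparison will not match and would need adjusting for the site.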
 -- Pavan
    
    