[mpich-discuss] SGE & Hydra Problem
Pavan Balaji
balaji at mcs.anl.gov
Wed Sep 22 06:24:08 CDT 2010
----- "Ursula Winkler" <ursula.winkler at uni-graz.at> wrote:
> No, when mpiexec is placed within the SGE job script, it works fine on
> the second cluster. I meant just the command "qrsh -inherit -V ...
> hydra_pmi_proxy ..." placed within the SGE script that results in the
> mentioned error message (on both clusters).
Ok, just to confirm, if nodes X and Y are both in the $TMPDIR/machines file, you are running the qrsh command from node X to node Y, correct?
I'm surprised that this is not working on the second cluster, as this is exactly what Hydra does internally.
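The node X to node Y check described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the helper names pick_remote_host and build_qrsh_check are hypothetical, the machines file is assumed to hold one hostname per line, and /bin/hostname stands in for the real remote command, whose arguments are elided in the quoted message. The sketch prints the qrsh line instead of running it, so it can be inspected first.

```shell
#!/bin/sh
# Hypothetical helper: print the first host in the machines file that is
# not the local node. Assumes one hostname per line (an assumption about
# the file format, not something stated in the thread).
pick_remote_host() {
    machines_file=$1
    local_host=$2
    while IFS= read -r host || [ -n "$host" ]; do
        [ -n "$host" ] || continue            # skip blank lines
        if [ "$host" != "$local_host" ]; then
            printf '%s\n' "$host"
            return 0
        fi
    done < "$machines_file"
    return 1                                  # no other node was listed
}

# Hypothetical helper: print (do not run) a qrsh command from the local
# node to another node. /bin/hostname is a stand-in for the real command.
build_qrsh_check() {
    remote=$(pick_remote_host "$1" "$2") || return 1
    printf 'qrsh -inherit -V %s /bin/hostname\n' "$remote"
}
```

Inside the SGE job script this could be called as build_qrsh_check "$TMPDIR/machines" "$(hostname)". If the machines file lists fully qualified names while hostname prints a short name, the comparison would need adjusting.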
Can you run mpiexec (from within an SGE script) on both clusters with the -verbose option and send me the output from each?
% mpiexec -verbose /bin/hostname
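One possible way to collect that output on each cluster is a small job script along these lines. It is a sketch, not a tested recipe: the job name, the parallel environment name "mpich", and the slot count are site-specific assumptions, and capture_verbose is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical SGE directives. The parallel environment name "mpich" and
# the slot count 2 are assumptions; use whatever the site defines.
#$ -N hydra_verbose
#$ -cwd
#$ -pe mpich 2

# Hypothetical helper: run the requested command and save stdout and
# stderr together in one file, so the full verbose trace can be mailed.
capture_verbose() {
    out_file=$1
    mpiexec -verbose /bin/hostname > "$out_file" 2>&1
}
```

Adding a final line such as capture_verbose "hydra-verbose-$(hostname).log" to the script would write one log per submission, named after the node the script ran on.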
-- Pavan