[mpich-discuss] SGE & Hydra Problem
Pavan Balaji
balaji at mcs.anl.gov
Wed Sep 15 03:11:38 CDT 2010
----- "Ursula Winkler" <ursula.winkler at uni-graz.at> wrote:
> [mpiexec@b79] Launch arguments:
> /installadmin/mpich2/test/intel/bin/hydra_pmi_proxy --control-port
> b79:45593 --debug --demux poll --pgid 0 --enable-stdin 1 --proxy-id 0
> [mpiexec@b79] Launch arguments: /installadmin/sge/bin/lx24-amd64/qrsh
> -inherit -V b51 /installadmin/mpich2/test/intel/bin/hydra_pmi_proxy
> --control-port b79:45593 --debug --demux poll --pgid 0 --enable-stdin 1
> --proxy-id 1
I'm assuming the application still hung at this point and you had to kill it?
> > % /installadmin/sge/bin/lx24-amd64/qrsh -inherit -V b56
> > /installadmin/mpich2/test/intel/bin/hydra_pmi_proxy --control-port
> > b73:52298 --debug --demux poll --pgid 0 --enable-stdin 1 --proxy-id 1
> >
> error: "qrsh" called with option "-inherit", but "JOB_ID" not set in
> environment
>
> export JOB_ID=158269
> [root@b00 ~]# /installadmin/sge/bin/lx24-amd64/qrsh -inherit -V b56
> /installadmin/mpich2/test/intel/bin/hydra_pmi_proxy --control-port
> b73:52298 --debug --demux poll --pgid 0 --enable-stdin 1 --proxy-id 1
> error: executing task of job 158269 failed: missing "SGE_TASK_ID" in
> environment
>
> I do not know what value I should set SGE_TASK_ID to, so I always get
> an error like "error: executing task of job 158275 failed".
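Are you not running this command within an SGE job script? "qrsh -inherit" only works inside a running SGE job, where SGE itself sets JOB_ID and SGE_TASK_ID, so the qrsh command should be run from b79, not from b00.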
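As a rough sketch of how one might check this before trying qrsh by hand: the helper below only tests whether the two variables named in the qrsh errors above (JOB_ID and SGE_TASK_ID) are present in the current shell. The function name in_sge_job is made up for illustration and is not part of SGE or MPICH2, and a set variable does not by itself prove the shell is a real SGE job.

```shell
#!/bin/sh
# Hypothetical helper (not part of SGE or Hydra): report whether the
# current shell looks like it is running inside an SGE job, judged only
# by the two variables that the qrsh errors complain about.
in_sge_job() {
    missing=""
    for var in JOB_ID SGE_TASK_ID; do
        # eval gives portable indirect expansion of the variable named in $var
        eval "val=\${$var:-}"
        if [ -z "$val" ]; then
            missing="$missing $var"
        fi
    done
    if [ -n "$missing" ]; then
        echo "not inside an SGE job; missing:$missing"
        return 1
    fi
    echo "inside SGE job $JOB_ID (task $SGE_TASK_ID)"
    return 0
}
```

Called from a login shell on b00, this would be expected to report both variables as missing. Inside a job script on the node where mpiexec runs, SGE should already have set them, so there is no need to export them by hand.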
-- Pavan