[Swift-devel] swift jobs hanging on ranger
skenny at uchicago.edu
Wed Jan 21 12:48:22 CST 2009
so, ranger requires that a project/account be specified for
any job going into the scheduler. i can submit with
globus-job-run and also with cog-job-submit using my current
project id, and both work. but when i submit with swift
(specifying that same project id in my sites file), the job
hangs and never makes it into the queue:
[skenny@gwynn check_env]$ cog-job-submit -p gt2 -jm SGE -a project=TG-DBS090006 -e /bin/hostname -s gatekeeper.ranger.tacc.teragrid.org
Forcing redirection because the SGE JM is broken.
Job completed
[skenny@gwynn check_env]$ globus-job-run gatekeeper.ranger.tacc.teragrid.org/jobmanager-sge -p TG-DBS090006 /bin/hostname
...
Job is running.
Job 452453 has completed.
[skenny@gwynn check_env]$ swift -tc.file /disks/ci-gpfs/fmri/cnari/swift/config/tc.data -sites.file ./sites_ranger.xml env.swift -user="skenny"
Swift svn swift-r2386 cog-r2261
RunID: 20090121-1228-s2aha776
Progress:
env started
env started
env started
Progress: Selecting site:2 Stage in:1
Progress: Submitted:3
...
**********the sites file entry is:
<!-- RANGER @ tg-login.ranger.tacc.teragrid.org -->
<pool handle="RANGER">
  <profile namespace="karajan" key="initialScore">1</profile>
  <profile namespace="karajan" key="jobThrottle">8</profile>
  <profile namespace="globus" key="project">TG-DBS090006</profile>
  <filesystem provider="coaster" url="gt2://gatekeeper.ranger.tacc.teragrid.org"/>
  <profile namespace="globus" key="coastersPerNode">16</profile>
  <execution provider="coaster" url="gatekeeper.ranger.tacc.teragrid.org" jobManager="gt2:gt2:SGE"/>
  <workdirectory>/work/00043/tg457040/sidgrid_out/{username}</workdirectory>
</pool>
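as a sanity check on the entry above, here is a rough python sketch (not part of swift or cog) that parses the pool element and lists the profiles it carries, to confirm the project id sits in the globus namespace. the helper name pool_profiles is made up for illustration, and the xml is just the entry pasted inline:

```python
import xml.etree.ElementTree as ET

# The sites file entry from this message, pasted inline for the check.
SITES_ENTRY = """
<pool handle="RANGER">
  <profile namespace="karajan" key="initialScore">1</profile>
  <profile namespace="karajan" key="jobThrottle">8</profile>
  <profile namespace="globus" key="project">TG-DBS090006</profile>
  <filesystem provider="coaster" url="gt2://gatekeeper.ranger.tacc.teragrid.org"/>
  <profile namespace="globus" key="coastersPerNode">16</profile>
  <execution provider="coaster" url="gatekeeper.ranger.tacc.teragrid.org" jobManager="gt2:gt2:SGE"/>
  <workdirectory>/work/00043/tg457040/sidgrid_out/{username}</workdirectory>
</pool>
"""


def pool_profiles(pool_xml):
    """Hypothetical helper: map (namespace, key) -> value for each <profile>."""
    pool = ET.fromstring(pool_xml)
    profiles = {}
    for prof in pool.iter("profile"):
        profiles[(prof.get("namespace"), prof.get("key"))] = (prof.text or "").strip()
    return profiles


profiles = pool_profiles(SITES_ENTRY)
for (namespace, key), value in sorted(profiles.items()):
    print(namespace, key, value)
```

this only shows what the file says, not what the coaster provider actually passes on to the gt2 job manager, which is the part i can't see from here.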
******gram log on remote site contains this:
Wed Jan 21 12:36:33 2009 JM_SCRIPT: Checking project details
Wed Jan 21 12:36:33 2009 JM_SCRIPT: SGE Regular Edition: NO project support
Wed Jan 21 12:36:33 2009 JM_SCRIPT: WARNING: Project set to TG-DBS090006
....
Wed Jan 21 12:36:33 2009 JM_SCRIPT: Submitting a job
Wed Jan 21 12:36:33 2009 JM_SCRIPT: ERROR: /opt/sge/bin/lx24-amd64/qsub /share/home/00043/tg457040/.globus/.gass_cache/local/md5/89/ff0dc3a3eeffb3ca94dabdf57b8473/md5/4a/a95989876bc53f3f82515f4507c280/data retcode = 256
Wed Jan 21 12:36:33 2009 JM_SCRIPT: ERROR: job submission failed
Wed Jan 21 12:36:33 2009 JM_SCRIPT: check if the project specified does exist
1/21 12:36:33 JMI: while return_buf = GRAM_SCRIPT_ERROR = 24
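to pick the relevant lines out of a longer gram log, a small python sketch like the one below may help. the function names are made up, and the retcode decoding assumes the job manager script reports a raw wait status (as perl's system() returns), which i have not confirmed:

```python
import re

# A few lines from the gram log above, with wrapped lines rejoined.
GRAM_LOG = """\
Wed Jan 21 12:36:33 2009 JM_SCRIPT: Checking project details
Wed Jan 21 12:36:33 2009 JM_SCRIPT: SGE Regular Edition: NO project support
Wed Jan 21 12:36:33 2009 JM_SCRIPT: WARNING: Project set to TG-DBS090006
Wed Jan 21 12:36:33 2009 JM_SCRIPT: Submitting a job
Wed Jan 21 12:36:33 2009 JM_SCRIPT: ERROR: job submission failed
"""

PROBLEM = re.compile(r"JM_SCRIPT: (ERROR|WARNING): (.*)")


def jm_script_problems(log_text):
    """Hypothetical helper: return (level, message) for each WARNING/ERROR line."""
    found = []
    for line in log_text.splitlines():
        match = PROBLEM.search(line)
        if match:
            found.append((match.group(1), match.group(2).strip()))
    return found


def exit_status(retcode):
    """Assumption: retcode is a raw wait status, so the exit code is the high byte."""
    return retcode >> 8


problems = jm_script_problems(GRAM_LOG)
for level, message in problems:
    print(level, message)
print("qsub exit status:", exit_status(256))
```

if that assumption holds, retcode = 256 means qsub itself exited with status 1, so it rejected the submission rather than crashing.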
hopefully i'm not missing something obvious here, but can
anyone think of a reason why the project id is producing an
error when i submit with swift?
thanks!!
sarah