[Swift-devel] swift jobs hanging on ranger

skenny at uchicago.edu skenny at uchicago.edu
Wed Jan 21 12:48:22 CST 2009


so, ranger requires a project/account to be specified for
any job going into the scheduler. i can submit a
globus-job-run and also a cog-job-submit with my current
project id and both work. but when i submit with swift
(specifying that same project id in my sites file) the job
hangs and never makes it into the queue:

[skenny@gwynn check_env]$ cog-job-submit -p gt2 -jm SGE -a project=TG-DBS090006 -e /bin/hostname -s gatekeeper.ranger.tacc.teragrid.org
Forcing redirection because the SGE JM is broken.
Job completed

[skenny@gwynn check_env]$ globus-job-run gatekeeper.ranger.tacc.teragrid.org/jobmanager-sge -p TG-DBS090006 /bin/hostname
...
Job is running.
Job 452453 has completed.

[skenny@gwynn check_env]$ swift -tc.file /disks/ci-gpfs/fmri/cnari/swift/config/tc.data -sites.file ./sites_ranger.xml env.swift -user="skenny"
Swift svn swift-r2386 cog-r2261

RunID: 20090121-1228-s2aha776
Progress:
env started
env started
env started
Progress:  Selecting site:2 Stage in:1
Progress:  Submitted:3
...

**********the sites file entry is:

<!-- RANGER @ tg-login.ranger.tacc.teragrid.org -->
 <pool handle="RANGER">
   <profile namespace="karajan" key="initialScore">1</profile>
   <profile namespace="karajan" key="jobThrottle">8</profile>
   <profile namespace="globus" key="project">TG-DBS090006</profile>
   <filesystem provider="coaster" url="gt2://gatekeeper.ranger.tacc.teragrid.org"/>
   <profile namespace="globus" key="coastersPerNode">16</profile>
   <execution provider="coaster" url="gatekeeper.ranger.tacc.teragrid.org" jobManager="gt2:gt2:SGE"/>
   <workdirectory>/work/00043/tg457040/sidgrid_out/{username}</workdirectory>
 </pool>
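
as a sanity check on the entry itself, here is a minimal python sketch (not part of swift or cog) that parses a copy of the pool entry and lists its profile keys. the function name pool_profiles is made up for illustration. it only shows that the xml is well-formed and that the globus "project" profile carries the expected value; it says nothing about what swift actually hands to GRAM.

```python
import xml.etree.ElementTree as ET

# Copy of the RANGER pool entry from the sites file (wrapped lines rejoined).
SITES_ENTRY = """
<pool handle="RANGER">
  <profile namespace="karajan" key="initialScore">1</profile>
  <profile namespace="karajan" key="jobThrottle">8</profile>
  <profile namespace="globus" key="project">TG-DBS090006</profile>
  <filesystem provider="coaster" url="gt2://gatekeeper.ranger.tacc.teragrid.org"/>
  <profile namespace="globus" key="coastersPerNode">16</profile>
  <execution provider="coaster" url="gatekeeper.ranger.tacc.teragrid.org" jobManager="gt2:gt2:SGE"/>
  <workdirectory>/work/00043/tg457040/sidgrid_out/{username}</workdirectory>
</pool>
"""


def pool_profiles(xml_text):
    """Return {(namespace, key): value} for every <profile> in a pool entry.

    Hypothetical helper for illustration only.
    """
    pool = ET.fromstring(xml_text)
    return {
        (p.get("namespace"), p.get("key")): (p.text or "").strip()
        for p in pool.iter("profile")
    }


profiles = pool_profiles(SITES_ENTRY)
for (namespace, key), value in sorted(profiles.items()):
    print(f"{namespace}.{key} = {value}")
```

if the globus.project line prints TG-DBS090006, the sites file itself is probably not where the project id gets lost.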

******gram log on remote site contains this:

Wed Jan 21 12:36:33 2009 JM_SCRIPT: Checking project details
Wed Jan 21 12:36:33 2009 JM_SCRIPT:   SGE Regular Edition: NO project support
Wed Jan 21 12:36:33 2009 JM_SCRIPT:   WARNING: Project set to TG-DBS090006

....

Wed Jan 21 12:36:33 2009 JM_SCRIPT: Submitting a job
Wed Jan 21 12:36:33 2009 JM_SCRIPT:   ERROR: /opt/sge/bin/lx24-amd64/qsub /share/home/00043/tg457040/.globus/.gass_cache/local/md5/89/ff0dc3a3eeffb3ca94dabdf57b8473/md5/4a/a95989876bc53f3f82515f4507c280/data
retcode = 256


Wed Jan 21 12:36:33 2009 JM_SCRIPT:   ERROR: job submission failed
Wed Jan 21 12:36:33 2009 JM_SCRIPT:   check if the project specified does exist
1/21 12:36:33 JMI: while return_buf = GRAM_SCRIPT_ERROR = 24
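
to pick the interesting lines out of a longer gram log, a rough python sketch like the one below can help. the log text is an abridged copy of the excerpt (the long qsub line is left out), and the names flagged and exit_code_from_wait_status are hypothetical. the retcode conversion assumes the job manager script reports a raw wait status, the way perl's $? does; that is a guess, not something the log confirms.

```python
import re

# Abridged copy of the GRAM log excerpt (wrapped lines rejoined).
GRAM_LOG = """\
Wed Jan 21 12:36:33 2009 JM_SCRIPT: Checking project details
Wed Jan 21 12:36:33 2009 JM_SCRIPT:   SGE Regular Edition: NO project support
Wed Jan 21 12:36:33 2009 JM_SCRIPT:   WARNING: Project set to TG-DBS090006
Wed Jan 21 12:36:33 2009 JM_SCRIPT: Submitting a job
Wed Jan 21 12:36:33 2009 JM_SCRIPT:   ERROR: job submission failed
Wed Jan 21 12:36:33 2009 JM_SCRIPT:   check if the project specified does exist
"""

# Matches JM_SCRIPT lines that carry a WARNING or ERROR tag.
FLAGGED = re.compile(r"JM_SCRIPT:\s+(WARNING|ERROR):\s*(.*)")


def flagged(log_text):
    """Return (level, message) pairs for WARNING/ERROR lines, in log order."""
    found = []
    for line in log_text.splitlines():
        match = FLAGGED.search(line)
        if match:
            found.append((match.group(1), match.group(2)))
    return found


def exit_code_from_wait_status(status):
    """Assumption: status is a raw wait status, so the exit code is the high byte."""
    return status >> 8


events = flagged(GRAM_LOG)
for level, message in events:
    print(f"{level}: {message}")

# Under the wait-status assumption, retcode = 256 means qsub exited with 1.
qsub_exit = exit_code_from_wait_status(256)
print("qsub exit code (assumed):", qsub_exit)
```

under that assumption qsub itself exited non-zero, which would point at the generated job script rather than at the gatekeeper.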


hopefully i'm not missing something obvious here, but can
anyone think of a reason why the project id is producing an
error when i submit with swift?

thanks!!
sarah


