[Swift-devel] estranged on ranger

skenny at uchicago.edu
Thu Mar 19 21:53:04 CDT 2009


hey there, i'm having some trouble figuring out why my
gigantic workflow is failing :) all the details are below. i
should also mention that i ran 10k jobs with the same configs
and that run completed without error in about 28 min.

so i'm trying to run the 65k-job workflow with the latest
build from svn. the workflow completes 244 of the jobs and
then begins failing. it never returns an error but seems to
hang for quite some time (though all jobs have left the queue).

from the properties file:

lazy.errors=false
caching.algorithm=LRU
pgraph=false
pgraph.graph.options=splines="compound", rankdir="TB"
pgraph.node.options=color="seagreen", style="filled"
clustering.enabled=false
clustering.queue.delay=4
clustering.min.time=60

kickstart.enabled=maybe
kickstart.always.transfer=false
wrapperlog.always.transfer=false

throttle.submit=6
throttle.host.submit=3

throttle.score.job.factor=8
throttle.transfers=16

throttle.file.operations=16
sitedir.keep=true
execution.retries=2

replication.enabled=false
replication.min.queue.time=60
replication.limit=3
foreach.max.threads=1024
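
since the 10k run is described as using the same configs, one quick sanity check is to diff the two properties files before tweaking anything. below is a minimal sketch in python, assuming the files are plain key=value text like the excerpt above; the helper names (parse_properties, diff_properties) are hypothetical and not part of swift:

```python
# Hypothetical helpers (not part of Swift): parse key=value properties text
# and report which settings differ between two runs.

def parse_properties(text):
    """Parse simple Java-style properties text into a dict of strings."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith(("#", "!")):
            continue
        # Split on the first '=' only, so values such as
        # splines="compound", rankdir="TB" are kept intact.
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def diff_properties(a, b):
    """Return {key: (value_in_a, value_in_b)} for keys whose values differ."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


# Sample text copied from the properties excerpt in this message.
SAMPLE = """\
throttle.submit=6
throttle.host.submit=3
throttle.score.job.factor=8
throttle.transfers=16
throttle.file.operations=16
execution.retries=2
foreach.max.threads=1024
"""

if __name__ == "__main__":
    props = parse_properties(SAMPLE)
    # List only the throttle-related settings for a quick look.
    for key in sorted(props):
        if key.startswith("throttle."):
            print(key, "=", props[key])
```

to compare two runs, read each properties file into a string, pass both through parse_properties, and hand the results to diff_properties; an empty result means the settings really are identical.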

from sites:
 <!-- RANGER @ tg-login.ranger.tacc.teragrid.org -->
  <pool handle="RANGER">
    <profile namespace="karajan" key="initialScore">1</profile>
    <profile namespace="karajan" key="jobThrottle">8</profile>
    <profile namespace="globus" key="project">TG-DBS090006</profile>
    <filesystem provider="coaster" url="gt2://gatekeeper.ranger.tacc.teragrid.org"/>
    <profile namespace="globus" key="coastersPerNode">16</profile>
    <execution provider="coaster" url="gatekeeper.ranger.tacc.teragrid.org" jobManager="gt2:gt2:SGE"/>
    <workdirectory>/scratch/projects/tg/SIDGrid/sidgrid_out/{username}</workdirectory>
  </pool>

i also ran the log plot:

http://www.ci.uchicago.edu/~skenny/sem/report-modgenproc-20090319-2002-b0nthqyg/index.html

the log itself is here on ranger:
/scratch/projects/tg/SIDGrid/swift-logs/skenny/modgenproc-20090319-1513-m5tlihce.log

thoughts? ideas of what i might try to tweak?

thanks!

~skenny



