[Swift-devel] spread changing or lots of big jobs left behind

Allan Espinosa aespinosa at cs.uchicago.edu
Tue Jun 29 11:10:20 CDT 2010


Based on the attached screenshot, Swift has now requested 1036 nodes,
which is far more than the maxNodes setting of 198. The Swift session log
reports 1450 jobs submitted. I would have expected ~10360 jobs
submitted, since I use the default overallocation factor of 10.
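
A quick sanity check of the numbers above. This is only a sketch of the
arithmetic, assuming the overallocation factor acts as a plain multiplier
on the node count (my reading, not confirmed behavior); expected_jobs is a
hypothetical helper name.

```python
# Figures quoted in this message.
requested_nodes = 1036   # nodes requested, per the screenshot
max_nodes = 198          # maxNodes in sites.xml
submitted_jobs = 1450    # "Submitted" count in the session log
overallocation = 10      # assumed to be a simple multiplier


def expected_jobs(nodes, factor):
    """Jobs needed to fill `nodes` if each node is overallocated by `factor`."""
    return nodes * factor


expected = expected_jobs(requested_nodes, overallocation)
print("expected jobs:", expected)
print("requested / maxNodes:", requested_nodes / max_nodes)
print("submitted / requested nodes:", submitted_jobs / requested_nodes)
```

The ratios make the mismatch easy to see: the request is more than five
times maxNodes, while the submitted jobs come to fewer than two per
requested node.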

During the first few minutes to hours of the workflow, Swift was still
submitting small jobs (10 to 20 workers per slot) along with the large
request. Based on this observation, does Swift request the entire
spread per batch?

Also, how does jobThrottle now factor into the number of submitted jobs
and the corresponding slots?

Latest session status:
Progress:  Initializing:7675  Submitted:1450  Failed:89  Finished successfully:7811

My foreach.max.threads is 73, which translates to 5,329 concurrent jobs at a time.
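
The 5,329 figure is 73 squared. A minimal sketch of that arithmetic,
assuming the workflow has two nested foreach loops that each fan out to
foreach.max.threads (the nesting depth of two is an assumption, and
max_concurrent is a hypothetical helper):

```python
foreach_max_threads = 73  # foreach.max.threads value quoted above


def max_concurrent(threads_per_loop, nesting_depth):
    """Upper bound on concurrent iterations for nested foreach loops."""
    return threads_per_loop ** nesting_depth


concurrent = max_concurrent(foreach_max_threads, 2)  # depth 2 is assumed
print(concurrent)
```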

Here's my sites.xml entry:
  <pool handle="TERAPORT">
    <execution provider="coaster" url="tp-grid1.ci.uchicago.edu"
      jobmanager="gt2:gt2:pbs"/>
    <!--<execution provider="coaster" url="none" jobmanager="local:pbs"/>-->

    <profile namespace="globus" key="maxTime">14400</profile>
    <profile namespace="globus" key="maxNodes">198</profile>
    <profile namespace="globus" key="spread">0.8</profile>
    <profile namespace="globus" key="slots">10</profile>
    <profile namespace="globus" key="remoteMonitorEnabled">true</profile>

    <profile namespace="globus" key="queue">short</profile>

    <profile namespace="karajan" key="initialScore">1500.0</profile>
    <profile namespace="karajan" key="jobThrottle">1.98</profile>

    <gridftp  url="gsiftp://tp-grid1.ci.uchicago.edu"/>
    <!--<gridftp  url="local://localhost"/>-->
    <workdirectory>/gpfs/teraport/OSG/data/aespinosa/swift_scratch</workdirectory>
  </pool>
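
For reference, a small sketch that reads the throttle-related profile keys
from an entry like the one above and applies the jobThrottle * 100 + 1 rule
of thumb from the Swift user guide. Whether that rule still governs how jobs
map to coaster slots is exactly what I am asking; read_profiles and
throttle_limit are hypothetical helper names, and the XML is trimmed to the
relevant keys.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the pool entry, keeping only the throttle-related keys.
SITE = """
<pool handle="TERAPORT">
  <profile namespace="globus" key="maxNodes">198</profile>
  <profile namespace="globus" key="spread">0.8</profile>
  <profile namespace="globus" key="slots">10</profile>
  <profile namespace="karajan" key="jobThrottle">1.98</profile>
</pool>
"""


def read_profiles(xml_text):
    """Map (namespace, key) -> text value for every <profile> element."""
    root = ET.fromstring(xml_text)
    return {
        (p.get("namespace"), p.get("key")): p.text.strip()
        for p in root.iter("profile")
    }


def throttle_limit(job_throttle):
    """Concurrent-job limit under the jobThrottle * 100 + 1 rule of thumb."""
    return int(round(job_throttle * 100)) + 1


profiles = read_profiles(SITE)
limit = throttle_limit(float(profiles[("karajan", "jobThrottle")]))
print("jobThrottle limit:", limit)
print("maxNodes:", profiles[("globus", "maxNodes")])
```

Under that rule, jobThrottle=1.98 gives a limit of 199 concurrent jobs, one
more than maxNodes, which does not obviously explain either the 1450
submitted jobs or the 1036 requested nodes.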

thanks,
-Allan


-- 
Allan M. Espinosa <http://amespinosa.wordpress.com>
PhD student, Computer Science
University of Chicago <http://people.cs.uchicago.edu/~aespinosa>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: spread.png
Type: image/png
Size: 8234 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/swift-devel/attachments/20100629/194197a6/attachment.png>

