[Swift-user] Tuning parameters of coaster execution
Michael Wilde
wilde at mcs.anl.gov
Mon Oct 19 23:26:02 CDT 2009
Hi Andriy,
We'll need to wait for Mihael to advise you on this, but there are a few
messages and threads in swift-devel that may be useful:
Ranger block scheduling:
http://mail.ci.uchicago.edu/pipermail/swift-devel/2009-September/005985.html
http://mail.ci.uchicago.edu/pipermail/swift-devel/2009-September/005986.html
Using Ranger with the latest coasters:
http://mail.ci.uchicago.edu/pipermail/swift-devel/2009-October/005994.html
Also, the following may be helpful to force a specific number of coasters
to start and/or jobs to run on them, but I don't know how these settings
interact with the coaster "block" settings:
---
To adjust the throttle, you can use this in your sites.xml <pool> element:
<profile namespace="karajan" key="jobThrottle">2.55</profile>
<profile namespace="karajan" key="initialScore">10000</profile>
The number of jobs per site is then throttled to (jobThrottle * 100) + 1,
which is 256 here, when initialScore is large enough (and 10000 is).
E.g., if you have 50 cores, set jobThrottle to 0.49.
For 200 cores, use 1.99,
etc.
If you know how many cores you have available, always set initialScore
to 10000, which bypasses the Swift "slow start".
---
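The throttle arithmetic above can be sketched in a few lines of Python.
This is only an illustration of the formula jobs = (jobThrottle * 100) + 1
and its inverse; the function names (job_throttle_for, max_jobs,
profile_lines) are made up for this example and are not part of Swift.
It assumes initialScore is left at 10000 as described above.

```python
def job_throttle_for(cores: int) -> float:
    """Invert jobs = jobThrottle * 100 + 1 to get the throttle for a job count."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return (cores - 1) / 100


def max_jobs(job_throttle: float) -> int:
    """Apply jobs = jobThrottle * 100 + 1 (valid when initialScore is large)."""
    # round() guards against floating-point error: 2.55 * 100 is not exactly 255.
    return round(job_throttle * 100) + 1


def profile_lines(cores: int) -> str:
    """Build the two <profile> lines for a sites.xml <pool> element."""
    throttle = job_throttle_for(cores)
    return (
        f'<profile namespace="karajan" key="jobThrottle">{throttle:.2f}</profile>\n'
        '<profile namespace="karajan" key="initialScore">10000</profile>'
    )


if __name__ == "__main__":
    for cores in (50, 200, 256):
        print(f"# {cores} cores -> at most {max_jobs(job_throttle_for(cores))} jobs")
        print(profile_lines(cores))
```

Running it prints the profile pair for each core count, which can be
pasted into the <pool> element. Whether these values interact well with
the coaster block settings is still an open question, as noted above.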
Mihael, can you create a few examples of consistent parameter settings
that work well together for a few illustrative configurations?
- Mike
On 10/19/09 10:35 PM, Andriy Fedorov wrote:
> Hi,
>
> I am trying to understand how to set correctly the coaster-related
> parameters to optimize execution of my workflow. A single task I have
> takes around 1-2 minutes. I set maxWalltime to 2 minutes, and there are 40
> of these tasks in my toy workflow. Coasters are configured as
> gt2:gt2:pbs. When I run it with the default parameters, the workflow
> completes (this is great!).
>
> Now I am trying to understand what's going on and how to improve the
> performance. Looking at the scheduler queue, I see that two jobs are
> submitted in the beginning of the execution for 18 min each, one with
> 1 node, and one with 2 nodes. All of the execution is happening in
> these two jobs (the number of jobs submitted is just two, for 40 tasks,
> so it looks like things work). First question: why does it happen this
> way? (two jobs, 18 minutes each, specific node allocation) I assume
> only one of them (2-node) is executing worker tasks, but in this case
> why is the allocation time 18 minutes, not 20 (each worker walltime is 2
> min)?
>
> Second question: how do I make coasters request more nodes? I tried
> to increase nodeGranularity to 10. This resulted in only one (!) job
> with 10 nodes and 20 min walltime showing up on the scheduler. But it
> looks like the jobs are still executed 2 at a time!
>
> Progress: Selecting site:38 Active:2
>
> According to the documentation, the default is workersPerNode=1, so I would
> expect at least 10 to be active. Again, I don't understand what's
> going on under the hood...
>
> Can coaster experts give me some guidance on what is going on, and how to
> set the parameters intelligently?
>
> Thanks!
>
> --
> Andriy Fedorov, Ph.D.
>
> Research Fellow
> Brigham and Women's Hospital
> Harvard Medical School
> 75 Francis Street
> Boston, MA 02115 USA
> fedorov at bwh.harvard.edu