[Swift-devel] Re: Is there a way to make swift send 256 tasks at a time
Michael Wilde
wilde at mcs.anl.gov
Tue Jul 22 22:40:35 CDT 2008
Such a mechanism or setting would be nice.
I'm not sure about the fancy formula below with logs, atans and pi's,
though. Ben or Mihael will need to comment.
What I see in the Users Guide is simpler:
http://www.ci.uchicago.edu/swift/guides/userguide.php#property.throttle.score.job.factor
which says:
" The Swift scheduler has the ability to limit the number of concurrent
jobs allowed on a site based on the performance history of that site.
Each site is assigned a score (initially 1), which can increase or
decrease based on whether the site yields successful or faulty job runs.
The score for a site can take values in the (0.1, 100) interval. The
number of allowed jobs is calculated using the following formula:
2 + score*throttle.score.job.factor
This means a site will always be allowed at least two concurrent jobs
and at most 2 + 100*throttle.score.job.factor. With a default of 4 this
means at least 2 jobs and at most 402.
This parameter can also be set per site using the jobThrottle profile
key in a site catalog entry."
Also sec 13, Profiles, says:
"jobThrottle - allows the job throttle factor (see Swift property
throttle.score.job.factor) to be set per site.
initialScore - allows the initial score for rate limiting and site
selection to be set to a value other than 0."
So while it's not as precise a setting as is desired for this case, you
can, I think, set initialScore to 100, thereby eliminating the ramp-up
period.
But I suspect you want *exactly* 256 to keep all the cores in each pset
busy without overcommitting jobs. So I think the best you can do is
start with the throttle at 3 and the score at 85, which will give you
3 * 85 + 2 = 257 jobs. If the score creeps up for good behavior, then
you'll wind up with a little overcommitting, which isn't bad. It won't go
over 302 (2 + 100 * 3) with a throttle of 3.
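
As a rough check of the arithmetic above, here is a minimal Python sketch
of the Users Guide formula. The function name allowed_jobs and the
clamping of the score to the documented range are assumptions for
illustration, not Swift's actual scheduler code.

```python
def allowed_jobs(score: float, job_factor: float = 4.0) -> float:
    """Concurrent-job limit per the Users Guide: 2 + score * factor."""
    # Assumption: the score is held within the documented (0.1, 100) range.
    clamped = min(max(score, 0.1), 100.0)
    return 2 + clamped * job_factor


if __name__ == "__main__":
    # Throttle 3, score 85: the starting point suggested in this message.
    print(allowed_jobs(85, 3))
    # Throttle 3, score at its ceiling of 100: the most the site can get.
    print(allowed_jobs(100, 3))
    # Default factor of 4 at the ceiling, as quoted from the Users Guide.
    print(allowed_jobs(100))
```

With a throttle of 3 this gives 257 at a score of 85 and tops out at 302
once the score reaches 100.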
I suspect we could try a patch that prevents the score from changing.
Can you tell from the Falkon logs, as we ramp up testing of this, whether
overcommitting is a problem? I suspect at worst it will exacerbate
tail effects as the workflow is winding down and some resources sit idle
while others are overcommitted.
Lastly, note that the fact that the score can drop might eventually
prove useful in handling psets that go bad in the middle of long runs.
We don't have enough experience yet to see how frequent that will be at
640 psets, or how such errors will be manifested.
- Mike
On 7/22/08 9:04 PM, Zhao Zhang wrote:
> Hi,
>
> For now, I understand how swift decide how many tasks send to one site
> at a time:
> let T = 100
> let B = 2.0 * log(T) / pi
> let C = 0.2
> let tscore = e^(B * atan(C * score))
> let number-of-jobs = 1 + (jobThrottle * tscore)
> T is initial score.
>
> I am wondering that , is there a way for swift to set a constant number
> of jobs, say 256 at a time to 1 site?
> The reason I am asking this is that we could avoid the slow start
> period, thus improve the efficiency. Thanks
>
> best wishes
> zhangzhao
>
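
For comparison, the formula in the quoted message can be sketched in
Python as below. This assumes that log means the natural logarithm, in
which case tscore is 1 at a score of 0 and approaches T as the score
grows. The function names are hypothetical, and this has not been checked
against the Swift source.

```python
import math


def tscore(score: float, T: float = 100.0, C: float = 0.2) -> float:
    """Transformed score from the quoted formula: e^(B * atan(C * score))."""
    # Assumption: log in the quoted formula is the natural logarithm.
    B = 2.0 * math.log(T) / math.pi
    return math.exp(B * math.atan(C * score))


def number_of_jobs(score: float, job_throttle: float) -> float:
    """Job count from the quoted formula: 1 + jobThrottle * tscore."""
    return 1 + job_throttle * tscore(score)


if __name__ == "__main__":
    # Show how the job count grows with the raw score for jobThrottle = 4.
    for s in (0, 1, 5, 20, 100):
        print(s, round(number_of_jobs(s, 4), 1))
```

Under that reading, the job count starts at 1 + jobThrottle and levels
off just below 1 + jobThrottle * T.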