[Swift-devel] Re: Is there a way to make swift send 256 tasks at a time

Michael Wilde wilde at mcs.anl.gov
Tue Jul 22 22:40:35 CDT 2008


Such a mechanism or setting would be nice.

I'm not sure about the fancy formula below with logs, atans and pi's, 
though. Ben or Mihael will need to comment.

What I see in the Users Guide is simpler:

http://www.ci.uchicago.edu/swift/guides/userguide.php#property.throttle.score.job.factor

which says:

" The Swift scheduler has the ability to limit the number of concurrent 
jobs allowed on a site based on the performance history of that site. 
Each site is assigned a score (initially 1), which can increase or 
decrease based on whether the site yields successful or faulty job runs. 
The score for a site can take values in the (0.1, 100) interval. The 
number of allowed jobs is calculated using the following formula:

2 + score*throttle.score.job.factor

This means a site will always be allowed at least two concurrent jobs 
and at most 2 + 100*throttle.score.job.factor. With a default of 4 this 
means at least 2 jobs and at most 402.

This parameter can also be set per site using the jobThrottle profile 
key in a site catalog entry."
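
As a rough check of the guide's arithmetic, here is a minimal Python
sketch of that formula. The function name, the clamping of the score
to the stated interval, and the truncation to a whole number of jobs
are my assumptions, not something the guide spells out.

```python
def allowed_jobs(score, job_factor=4.0):
    """Concurrent-job limit per the User Guide: 2 + score * factor."""
    # Assumption: scores outside the documented (0.1, 100) range are clamped.
    score = min(max(score, 0.1), 100.0)
    # Assumption: a fractional result is truncated to whole jobs.
    return int(2 + score * job_factor)


if __name__ == "__main__":
    for s in (0.1, 1, 50, 100):
        print(s, allowed_jobs(s))
```

With the default factor of 4 this reproduces the guide's bounds of 2
and 402 jobs.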

Also, section 13, Profiles, says:

"jobThrottle - allows the job throttle factor (see Swift property 
throttle.score.job.factor) to be set per site.

initialScore - allows the initial score for rate limiting and site 
selection to be set to a value other than 0."

So while it's not as precise a setting as is desired for this case, you 
can, I think, set initialScore to 100, thereby eliminating the ramp-up 
period.

But I suspect you want *exactly* 256 to keep all the cores in each pset 
busy without over-committing jobs. So I think the best you can do is 
start with the throttle at 3 and the score at 85, which will give you 
3 * 85 + 2 = 257 jobs. If the score creeps up for good behavior, then 
you'll wind up with a little overcommitting, which isn't bad. It won't go 
over 302 (2 + 100 * 3) with a throttle of 3.
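
To make that arithmetic easy to redo for other throttle values, here
is a small Python sketch based on the Users Guide formula
(2 + score * throttle). The helper names and the 256-job target
constant are just illustrative; nothing here is Swift's actual code.

```python
TARGET = 256      # desired concurrent jobs per site (from this thread)
MAX_SCORE = 100   # upper end of the score interval in the Users Guide


def jobs(score, throttle):
    """Allowed jobs per the Users Guide formula."""
    return 2 + score * throttle


def score_for(target, throttle):
    """Score that would yield exactly `target` jobs (may be fractional)."""
    return (target - 2) / throttle


def max_overcommit(throttle, target=TARGET):
    """Worst-case jobs beyond the target once the score reaches its cap."""
    return max(0, jobs(MAX_SCORE, throttle) - target)


if __name__ == "__main__":
    for throttle in (3, 4):
        print(throttle, score_for(TARGET, throttle), max_overcommit(throttle))
```

For a throttle of 3 the exact score would be about 84.67, so 85 is
the nearest whole score, and the worst case is 302 - 256 = 46 extra
jobs.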

I suspect we could try a patch that prevents the score from changing.

Can you tell from the Falkon logs, as we ramp up testing of this, whether 
overcommitting is a problem? I suspect at worst it will exacerbate 
tail effects as the workflow is winding down and some resources sit idle 
while others are overcommitted.

Lastly, note that the fact that the score can drop might eventually 
prove useful in handling psets that go bad in the middle of long runs. 
We don't have enough experience yet to see how frequent that will be at 
640 psets, or how such errors will be manifested.

- Mike




On 7/22/08 9:04 PM, Zhao Zhang wrote:
> Hi,
> 
> For now, I understand how Swift decides how many tasks to send to one 
> site at a time:
> let T = 100
> let B = 2.0 * log(T) / pi
> let C = 0.2
> let tscore = e^(B * atan(C * score))
> let number-of-jobs = 1 + (jobThrottle * tscore)
> T is initial score.
> 
> I am wondering: is there a way for Swift to send a constant number 
> of jobs, say 256 at a time, to one site?
> The reason I am asking is that we could avoid the slow-start 
> period and thus improve efficiency. Thanks
> 
> best wishes
> zhangzhao
> 
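
For anyone who wants to experiment with the formula quoted above,
here is a direct Python transcription. It only restates the quoted
pseudocode; whether it matches what the scheduler really does is
exactly what Ben or Mihael would need to confirm. The function names
are made up for illustration.

```python
import math


def tscore(score, T=100.0, C=0.2):
    """tscore = e^(B * atan(C * score)), with B = 2 * log(T) / pi."""
    B = 2.0 * math.log(T) / math.pi
    return math.exp(B * math.atan(C * score))


def number_of_jobs(score, job_throttle):
    """number-of-jobs = 1 + jobThrottle * tscore, as quoted."""
    return 1 + job_throttle * tscore(score)


if __name__ == "__main__":
    for s in (0, 1, 10, 100):
        print(s, round(number_of_jobs(s, 4), 1))
```

Since atan approaches pi/2 for large scores, tscore in this form
approaches e^(log T) = T, so T acts as an upper bound on tscore, and
a score of 0 gives tscore = 1.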


