[Swift-devel] execution.retries

Mihael Hategan hategan at mcs.anl.gov
Tue Jun 10 12:39:56 CDT 2008


On Tue, 2008-06-10 at 17:09 +0000, Ben Clifford wrote:
> On Tue, 10 Jun 2008, lixi at uchicago.edu wrote:
> 
> > However, during execution, a site's score could decrease to a 
> > negative value. Would that remove the limit of at least 2 
> > jobs?
> 
> There are two ways of expressing the site score. One can range from 
> -infinity to +infinity (approximately). Call this 'score'.
> 
> This is then scaled using a complex formula to a value ranging between 2 
> and the maximum allowed onto that site - this number is then used to 
> determine how many jobs can run at once on a site.
> 
> As the first score goes to -infinity, the second score goes down to 2, but 
> no lower.
> 
> Look in 
> cog/modules/karajan//src/org/globus/cog/karajan/scheduler/WeightedHost.java
> 
> let T = 100
> let B = 2.0 * log(T) / pi
> let C = 0.2
> let tscore = e^(B * atan(C * score))
> let number-of-jobs = 2 + (jobThrottle * tscore)
> 
> I think that by editing the definition of isOverloaded() in that file, you 
> can vary the behaviour (Mihael might comment if that is actually the 
> method used to determine whether we can submit to a site or not)

That's the one. However, I think that tscores < 1 should be translated
into timed rate limitations. So if tscore = 10 means I can submit at
most jobThrottle*10 jobs, tscore = 0.1 should mean that I can submit
jobs no faster than one every some_number/tscore seconds. Like an
exponential back-off.
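
A rough Python sketch of the quoted formula and of the rate-limit idea, for illustration only. It is not the code in WeightedHost.java. It assumes that log is the natural logarithm (so tscore runs from about 1/T to T) and that tscores of 1 or more need no delay. The function names and the some_number argument are hypothetical.

```python
import math

T = 100.0
B = 2.0 * math.log(T) / math.pi  # assumption: natural log
C = 0.2


def tscore(score):
    """Map an unbounded score into roughly (1/T, T); tscore(0) is 1."""
    return math.exp(B * math.atan(C * score))


def job_limit(score, job_throttle):
    """Concurrent-job limit from the quoted formula; never below 2."""
    return 2 + job_throttle * tscore(score)


def submit_interval(score, some_number):
    """Hypothetical: minimum seconds between submissions.

    Applies only when tscore < 1, where it grows as the score drops.
    """
    t = tscore(score)
    return some_number / t if t < 1.0 else 0.0
```

With these definitions, a score of 0 gives tscore = 1 and a limit of 2 + jobThrottle jobs, while negative scores leave the job limit just above 2 and stretch the interval between submissions.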

We'd probably figure out some_number by looking at how the score would
evolve should all attempts fail, and set the minimum waiting time that
we want in the worst-case scenario (which will probably be the -10 score
limit anyway).
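
One way to pick some_number, sketched in Python under stated assumptions: the worst case is a score of -10, log in the quoted formula is the natural logarithm, and the 60-second target wait is an arbitrary illustrative value. The helper name pick_some_number is hypothetical.

```python
import math

T, C = 100.0, 0.2
B = 2.0 * math.log(T) / math.pi  # assumption: natural log


def tscore(score):
    return math.exp(B * math.atan(C * score))


def pick_some_number(worst_score, desired_wait_s):
    """Choose some_number so the wait at worst_score equals desired_wait_s."""
    return desired_wait_s * tscore(worst_score)


WORST_SCORE = -10.0  # assumed lower score limit
some_number = pick_some_number(WORST_SCORE, 60.0)  # 60 s is illustrative

# The interval some_number / tscore shrinks as the score recovers toward 0.
for s in (-10, -5, -1):
    print(s, round(some_number / tscore(s), 2))
```

Working backwards from the worst case fixes the longest wait directly, and every better score then gets a proportionally shorter one.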
