[Swift-devel] execution.retries
Mihael Hategan
hategan at mcs.anl.gov
Tue Jun 10 12:39:56 CDT 2008
On Tue, 2008-06-10 at 17:09 +0000, Ben Clifford wrote:
> On Tue, 10 Jun 2008, lixi at uchicago.edu wrote:
>
> > However during the execution, the site's score could be
> > decreased into a negative one, which would erase the limit
> > of at least 2 jobs?
>
> There are two ways of expressing the site score. One can range from
> -infinity to +infinity (approximately). Call this 'score'.
>
> This is then scaled using a complex formula to a value ranging between 2
> and the maximum allowed onto that site - this number is then used to
> determine how many jobs can run at once on a site.
>
> As the first score goes to -infinity, the second score goes down to 2, but
> no lower.
>
> Look in
> cog/modules/karajan//src/org/globus/cog/karajan/scheduler/WeightedHost.java
>
> let T = 100
> let B = 2.0 * log(T) / pi
> let C = 0.2
> let tscore = e^(B * atan(C * score))
> let number-of-jobs = 2 + (jobThrottle * tscore)
>
> I think that by editing the definition of isOverloaded() in that file, you
> can vary the behaviour (Mihael might comment if that is actually the
> method used to determine whether we can submit to a site or not)
That's the one. However, I think that tscores < 1 should be translated
into timed rate limitations. So if tscore = 10 means I can submit at
most jobThrottle*10 jobs, tscore = 0.1 should mean that I can submit
no faster than one job every some_number/tscore seconds. Like an
exponential back-off.
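
A rough sketch of that idea, not the actual WeightedHost.java code. It assumes `log` in the quoted formula is the natural logarithm (so tscore runs from 1/T to T), and the function names and the `some_number` parameter are made up for illustration:

```python
import math

# Constants from the quoted formula.
T = 100.0
B = 2.0 * math.log(T) / math.pi  # assumption: natural log
C = 0.2


def tscore(score: float) -> float:
    """Scaled score: 1.0 at score 0, approaching T and 1/T at the extremes."""
    return math.exp(B * math.atan(C * score))


def max_concurrent_jobs(score: float, job_throttle: float) -> float:
    """Concurrency limit from the quoted formula: 2 + jobThrottle * tscore."""
    return 2 + job_throttle * tscore(score)


def min_submit_interval(score: float, some_number: float) -> float:
    """Hypothetical rate limit: seconds to wait between submissions.

    No delay while tscore >= 1; below that the delay grows as
    some_number / tscore.
    """
    t = tscore(score)
    if t >= 1.0:
        return 0.0
    return some_number / t
```

With this shape the delay is zero for a site in good standing and grows as the score drops, while the concurrency limit keeps its floor of 2.
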
We'd probably figure out some_number by looking at how the score would
evolve should all attempts fail, and by setting the minimum waiting time
that we want in the worst-case scenario (which will probably be the -10
score limit anyway).
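
One way to back some_number out of a chosen worst-case wait, as a sketch only. It does not model how the score evolves under repeated failures; it simply takes the -10 score limit as the worst case, assumes `log` in the quoted formula is the natural logarithm, and uses hypothetical names throughout:

```python
import math

# Constants from the quoted formula.
T = 100.0
B = 2.0 * math.log(T) / math.pi  # assumption: natural log
C = 0.2

SCORE_FLOOR = -10.0  # the "-10 score limit" mentioned above


def tscore(score: float) -> float:
    """Scaled score from the quoted formula."""
    return math.exp(B * math.atan(C * score))


def some_number_for(worst_case_wait_s: float,
                    worst_score: float = SCORE_FLOOR) -> float:
    """Pick some_number so that some_number / tscore(worst_score)
    equals the waiting time wanted in the worst case."""
    return worst_case_wait_s * tscore(worst_score)
```

Since tscore(-10) is a little under 0.04, some_number comes out at roughly 4% of whatever worst-case wait is chosen.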