[Swift-devel] Re: scheduler changes to deal with fast-failing sites

Mihael Hategan hategan at mcs.anl.gov
Wed Jun 25 11:05:14 CDT 2008


> In addition to such improvements, as well as filtering out some
> sites and giving initial scores, which I've done, I have been
> thinking of other methods these days. Right now in Swift, we only
> rely on "scores" to measure the performance of sites, and those
> scores are in turn the only metric for site selection. Can
> we define different states for sites? For example,
> candidate, frozen, etc. "Candidate" just means that we could
> select a site from that set based on scores/Tscores. If a
> site fails, we could designate it as "frozen", at least for
> the current job, so that no more retries get eaten up. A
> frozen site could be unfrozen once certain conditions are
> met, such as after some amount of time, for other new
> jobs. Of course, these are just some simple ideas that I'm
> thinking about now. I am going to work out a more detailed and
> feasible process. Any suggestions are warmly welcome.

The current system pretty much does that, though in a slower way, which
is desired because occasional errors don't necessarily mean the site is
bad.
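
To make the comparison concrete, here is a minimal sketch of the
candidate/frozen idea from the quoted message. It is not Swift's actual
scheduler code: the names Site, SiteSelector, report_failure,
freeze_seconds and the 60-second default are all hypothetical, and the
hard freeze on a single failure is the proposal's behaviour, not what
the current score-based system does.

```python
import time

CANDIDATE = "candidate"
FROZEN = "frozen"


class Site:
    """Hypothetical site record: a score plus an explicit state."""

    def __init__(self, name, score=1.0):
        self.name = name
        self.score = score
        self.state = CANDIDATE
        self.frozen_until = 0.0


class SiteSelector:
    """Pick the best-scoring candidate; freeze sites that fail."""

    def __init__(self, sites, freeze_seconds=60.0, clock=time.monotonic):
        self.sites = {s.name: s for s in sites}
        self.freeze_seconds = freeze_seconds  # assumed thaw delay
        self.clock = clock  # injectable so the logic can be tested

    def report_failure(self, name):
        # One failure freezes the site for a fixed period.
        site = self.sites[name]
        site.state = FROZEN
        site.frozen_until = self.clock() + self.freeze_seconds

    def _thaw_expired(self):
        # Frozen sites become candidates again once their time is up.
        now = self.clock()
        for site in self.sites.values():
            if site.state == FROZEN and now >= site.frozen_until:
                site.state = CANDIDATE

    def select(self):
        # Return the highest-scoring candidate, or None if all are frozen.
        self._thaw_expired()
        candidates = [s for s in self.sites.values() if s.state == CANDIDATE]
        if not candidates:
            return None
        return max(candidates, key=lambda s: s.score)
```

Freezing on the first failure reacts quickly, but it also penalizes a
site for an occasional error, which is the trade-off mentioned above.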

> 
> Thanks,
> 
> Xi 



