[Swift-devel] feature request

Mihael Hategan hategan at mcs.anl.gov
Wed Apr 15 14:22:05 CDT 2009


On Wed, 2009-04-15 at 14:05 -0500, Glen Hocky wrote:
> The problem with the first method was that the number of jobs, i.e. the
> score, increased too slowly. In that configuration, I believe that the
> behavior was 1 or 2 coasters were submitted to a single site and none to
> the others, and then it just stayed that way for a long time.

The behavior you mention is contrary to what should be happening:
every site should have had 2 swift jobs submitted to it.

It is possible that you've made the observation while coasters were not
working properly on certain sites. The solution to that is not to
fundamentally re-engineer the way swift submission works, but to make
coasters run properly on those sites.

> 
> Another problem w/ the default configuration was on sites w/ Coasters 
> per node > 1. My experience on ranger with the default parameters is 
> that 2 coasters would start in the queue and only ~6 jobs would run on 
> them, rather than 32.  Since our jobs take ~1 hour, this means that for 
> 32 hours of CPU time, I was getting about 6 CPU hours of work. And even
> after jobs started finishing in that config, the ramp-up was too slow.

That is indeed a scenario which cannot be addressed by the scheme I
mentioned. Telling swift that a site has a certain granularity, and that
2 jobs eat the same resources as 16 jobs, is not something we can
currently do. Though I think the scoring could easily be adapted for that.
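
To make the granularity idea concrete, here is a rough sketch of one way it
could work. This is not swift's actual scoring code. The function names and
the "slots per allocation" parameter are made up for illustration, and
rounding the job limit up to a whole allocation is only one possible
adaptation.

```python
import math


def granular_job_limit(score_based_limit: int, slots_per_allocation: int = 1) -> int:
    """Round a score-derived job limit up to a whole number of allocations.

    score_based_limit    -- hypothetical job limit the scheduler derived from
                            the site score
    slots_per_allocation -- hypothetical site granularity: how many jobs fit
                            in the resources that a single job already occupies
    """
    if slots_per_allocation < 1:
        raise ValueError("slots_per_allocation must be at least 1")
    if score_based_limit <= 0:
        return 0
    allocations = math.ceil(score_based_limit / slots_per_allocation)
    return allocations * slots_per_allocation


def utilization(running_jobs: int, allocations: int, slots_per_allocation: int) -> float:
    """Fraction of the allocated slots that are actually doing work."""
    return running_jobs / (allocations * slots_per_allocation)


if __name__ == "__main__":
    # Numbers from the ranger report: ~6 jobs on 2 coasters with 32 slots total.
    print(utilization(6, 2, 16))       # 0.1875
    # With a granularity of 16, a limit of 6 would be rounded up to 16.
    print(granular_job_limit(6, 16))   # 16
```

With a granularity of 1 the limit is unchanged, so sites without this
property would behave as they do now.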




