[Swift-devel] feature request

Glen Hocky hockyg at uchicago.edu
Wed Apr 15 14:05:36 CDT 2009


The problem with the first method was that the number of jobs, i.e. the 
score, increased too slowly. In that configuration, I believe 1 or 2 
coasters were submitted to a single site and none to the others, and 
then it just stayed that way for a long time.

Another problem with the default configuration was on sites with coasters 
per node > 1. My experience on Ranger with the default parameters is 
that 2 coasters would start in the queue and only ~6 jobs would run on 
them, rather than 32. Since our jobs take ~1 hour, this means that for 
32 hours of CPU time, I was getting about 6 CPU hours of work. Even 
after jobs started finishing in that configuration, the ramp-up was too 
slow.
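
To put a number on the Ranger case, here is a small sketch of the arithmetic. 
The function name is made up for illustration; the figures (32 slots, ~6 
running jobs, ~1 hour per job) are the approximate ones quoted above.

```python
def utilization(running_jobs, total_slots):
    """Fraction of allocated coaster slots that are actually running a job."""
    if total_slots <= 0:
        raise ValueError("total_slots must be positive")
    return running_jobs / total_slots


slots = 32        # slots available across the 2 coasters (from the example above)
running = 6       # approximate number of jobs observed running
job_hours = 1.0   # approximate job length in hours

allocated_cpu_hours = slots * job_hours   # CPU time held in the queue
useful_cpu_hours = running * job_hours    # CPU time doing real work

print(f"allocated: {allocated_cpu_hours:.0f} CPU hours")
print(f"useful:    {useful_cpu_hours:.0f} CPU hours")
print(f"utilization: {utilization(running, slots):.1%}")
```

That works out to a bit under 19% of the allocated CPU time being used.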

Mihael Hategan wrote:
> On Wed, 2009-04-15 at 13:49 -0500, Glen Hocky wrote:
>   
>> Hi Everyone,
>> While I'm thinking of it, one problem I had using coasters on multiple 
>> TG sites is that jobs would commit themselves to a site at the beginning 
>> of the run.
>>     
>
> If the site scoring parameters are left at the default, only a couple of
> jobs should commit to sites at start, and the number would progressively
> increase as sites complete jobs.
>
> Combined with replication, which is probably disabled by default, even
> jobs committed to sites that don't do much should be re-submitted to
> different sites eventually.
>
>   
>>  This was a problem because all of the jobs would finish on 
>> one machine while jobs for the other machines were sitting in a queue. I 
>> know you may have considered this before, but an option to select a 
>> site/coaster for a single job only when one is available would be very 
>> useful for us (note: we aren't too worried about overhead in not 
>> pre-setting up files and directories because our jobs run > 10 minutes, 
>> usually closer to an hour).
>>     
>
> That is one thing that is planned with coasters, but hasn't materialized
> yet, in part due to the fact that the above solution should provide a
> similar experience.
>
>
>   
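
The difference between committing jobs to sites up front and picking a site 
only when a slot is free can be illustrated with a toy model. This is only a 
sketch under simplifying assumptions, not how Swift or coasters schedule 
anything: each site is a hypothetical (queue wait in hours, number of slots) 
pair, all jobs take the same time, and the function names are invented for 
this example.

```python
import heapq
import math


def early_binding_makespan(sites, n_jobs, job_hours):
    """Jobs are split evenly across sites before anything runs."""
    per_site = [n_jobs // len(sites)] * len(sites)
    for i in range(n_jobs % len(sites)):
        per_site[i] += 1
    finish_times = []
    for (wait, slots), n in zip(sites, per_site):
        if n > 0:
            waves = math.ceil(n / slots)  # rounds of jobs needed on this site
            finish_times.append(wait + waves * job_hours)
    return max(finish_times, default=0)


def late_binding_makespan(sites, n_jobs, job_hours):
    """Each job goes to whichever slot becomes free first."""
    # One heap entry per slot, holding the time at which that slot is free.
    free_at = [wait for wait, slots in sites for _ in range(slots)]
    heapq.heapify(free_at)
    makespan = 0
    for _ in range(n_jobs):
        start = heapq.heappop(free_at)
        end = start + job_hours
        makespan = max(makespan, end)
        heapq.heappush(free_at, end)
    return makespan


# Hypothetical sites: one starts immediately, one sits in a queue for 10 hours.
sites = [(0, 4), (10, 4)]
print("early binding:", early_binding_makespan(sites, 16, 1), "hours")
print("late binding: ", late_binding_makespan(sites, 16, 1), "hours")
```

In this made-up case the up-front split waits on the queued site (12 hours), 
while the pull-style assignment finishes everything on the responsive site 
(4 hours). It ignores the per-job setup overhead mentioned above, which 
matters less when jobs run close to an hour.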



