[Swift-devel] Expected behavior for scheduler slow-start with coasters?

Michael Wilde wilde at mcs.anl.gov
Mon Apr 6 17:53:17 CDT 2009


OK, sounds reasonable.

For what it's worth, Glen provided another example of coasters going idle 
while there are jobs ready to run.

Nothing more to say on this, except to point out that it affects more 
than just startup.

Is there a simpler, alternate scheduler algorithm that you could plug in 
as a global, settable alternative to the current one when all sites are 
using coasters?

(No need to answer that now; we'll see how far we can get with things as 
they are, in various combinations of sites and settings).

We're digging into the imbalance problem at the moment; that one may be 
more worth your time, as is the larger-node-per-job allocation 
enhancement.

--- from Glen:

again, not using their coasters effectively
5:42
Michael Wilde
?
5:42
Glen Hocky
e.g.
qb now has qb2:
 
                                                                   Req'd  Req'd   Elap
Job ID               Username Queue    Jobname    SessID NDS   TSK Memory Time  S Time
-------------------- -------- -------- ---------- ------ ----- --- ------ ----- - -----
94741.qb2            hockyg   workq    scheduler_  30786     1   1    --  01:41 R 00:53
94742.qb2            hockyg   workq    scheduler_  31391     1   1    --  01:41 R 00:53
94808.qb2            hockyg   workq    scheduler_   2274     1   1    --  01:41 R 00:22
94809.qb2            hockyg   workq    scheduler_  27186     1   1    --  01:41 R 00:21
94811.qb2            hockyg   workq    scheduler_  31647     1   1    --  01:41 R 00:21
94812.qb2            hockyg   workq    scheduler_   4773     1   1    --  01:41 R 00:18
but only 4 active jobs
4 submitted
*7 submitted
all the rest done
so what is it doing with all those extra cpus
5:43
...

Glen Hocky
for my run on only qb
Progress:  Submitted:7 Active:4 Finished successfully:93
5:43

Glen Hocky
again, the problem may be that these jobs are taking 15 minutes or more
so they don't end very often


On 4/6/09 5:43 PM, Mihael Hategan wrote:
> On Mon, 2009-04-06 at 17:32 -0500, Michael Wilde wrote:
>> OK, this one seems to be more of a nuisance/anomaly that we can set 
>> aside for now I think.
>>
>> Opening up the throttle a bit should make this a minor issue. 
>> Eventually, you'd hope it would fill available coasters when there is 
>> demand, or at least base the rampup on the fact that jobs started, and 
>> not wait for them to finish. Then it would sense faster that there were 
>> more ready workers.
> 
> Yes. I mentioned this a while ago, that with coasters, throttling
> guesses become unnecessary. You simply throttle to the number of
> available workers.
> 
> This, however, falls out of the model we started with, so there are some
> possibly non-trivial changes to swift needed in order to support this
> with coasters, while still keeping the old behaviour without coasters.
> 
> 
> 
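To make the difference between the two throttling ideas concrete, here is a 
toy simulation. It is only a sketch under stated assumptions, not Swift's 
actual scheduler code: the function name simulate, the policy names, the 
initial limit of 2, and the rule "raise the limit by one per finished job" 
are all made up for illustration. It compares a slow-start throttle that 
only opens up when jobs finish against one that simply throttles to the 
number of available workers.

```python
def simulate(policy, workers=8, job_len=15, n_jobs=40, initial_limit=2):
    """Toy model; returns (makespan in ticks, idle worker-ticks).

    policy "slow_start":   hypothetical rule, the limit grows by one for
                           each job that finishes.
    policy "worker_count": the limit is fixed at the number of workers.
    """
    limit = workers if policy == "worker_count" else initial_limit
    running = []      # remaining ticks for each active job
    pending = n_jobs  # jobs ready to run but not yet started
    tick = 0
    idle = 0
    while pending or running:
        # Start as many jobs as the throttle and the worker pool allow.
        while pending and len(running) < min(limit, workers):
            running.append(job_len)
            pending -= 1
        # Workers without a job during this tick are wasted capacity.
        idle += workers - len(running)
        tick += 1
        # Advance one tick and count the jobs that just finished.
        running = [r - 1 for r in running]
        finished = sum(1 for r in running if r == 0)
        running = [r for r in running if r > 0]
        if policy == "slow_start":
            limit += finished  # ramp up only on completions
    return tick, idle


if __name__ == "__main__":
    for policy in ("slow_start", "worker_count"):
        makespan, idle = simulate(policy)
        print(policy, "makespan:", makespan, "idle worker-ticks:", idle)
```

With long jobs (15 ticks here, standing in for the 15-minute jobs above), 
the slow-start limit rarely gets a completion to react to, so workers sit 
idle while jobs are ready; throttling to the worker count keeps them busy 
from the first tick.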


