[Swift-devel] coasters about half the jobs
Mihael Hategan
hategan at mcs.anl.gov
Sat Feb 19 16:33:36 CST 2011
On Sat, 2011-02-19 at 10:15 -0600, Michael Wilde wrote:
> Mihael, I need to correct one point I made below:
>
> > But the scheduling behavior still shows a similar problem, in that
> > only about half the cores are utilized.
>
> As I was pasting the output I realized that the workers *were* getting
> filled to 100%, but then later the utilization dropped off and did not
> seem to recover.
>
> In later experiments I tested with a single large coaster block of
> compute nodes instead of many small one-node blocks. This showed some
> interesting behavior (but I think much better utilization). There I
> got an oscillating pattern, where I would have all 2400 nodes utilized,
> then what seemed like sinusoidal dips to about 2100 cores, then back
> up to 2400, etc. (I can't tell without plotting if it's really in fact
> a sinusoid).
It's oscillating, but overall (assuming some non-trivial distribution of
job durations) it should be close to a decaying sine that tends to a
value somewhat less than the maximum number of workers. This is a known
"problem" when there is a delay between the completion of jobs and the
submission of new ones. It causes transients around job waves for as
long as there are such waves (i.e. lots of jobs being run at once).
We've seen this on the BGP, and Sarah saw it earlier on Ranger. The
wider the job time distribution, the quicker the oscillatory pattern
decays.

The solution is to make sure that the coaster service always has enough
jobs queued. "Enough" here is, I would suggest, about twice the maximum
number of workers. So try this: set the maximum number of workers
(wpn*slots) to half the site throttle (or, equivalently, set the site
throttle and the foreach max threads to twice wpn*slots). This way, even
if a whole wave of jobs completes at once, the coaster service will
already have enough jobs queued to start immediately on the freed
workers.
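
To make the effect concrete, here is a minimal toy simulation in Python.
It is only a sketch and not the coaster scheduler: the simulate()
function, its tick-based model, the uniform 5-15 tick job durations and
the 3-tick resubmission delay are all assumptions chosen for
illustration. It compares a throttle equal to the number of workers with
a throttle of twice that number.

```python
import random
from collections import deque

def simulate(workers, throttle, delay, ticks=2000, seed=0):
    """Return mean worker utilization for a toy coaster-like service.

    throttle: jobs the client keeps in flight (queued + running + in transit).
    delay:    ticks between a job finishing and its replacement arriving.
    All names and the model itself are hypothetical, for illustration only.
    """
    rng = random.Random(seed)
    running = []        # remaining ticks of each running job
    queued = throttle   # client fills its throttle at t = 0
    pending = deque()   # arrival ticks of replacement jobs still in transit
    busy_total = 0
    for t in range(ticks):
        # Replacements whose submission delay has elapsed reach the queue.
        while pending and pending[0] <= t:
            pending.popleft()
            queued += 1
        # The service starts queued jobs on idle workers right away.
        while queued > 0 and len(running) < workers:
            running.append(rng.randint(5, 15))
            queued -= 1
        busy_total += len(running)
        # Advance one tick and collect the jobs that just finished.
        running = [r - 1 for r in running]
        finished = sum(1 for r in running if r == 0)
        running = [r for r in running if r > 0]
        # Each finished job triggers one replacement, arriving after `delay`.
        for _ in range(finished):
            pending.append(t + 1 + delay)
    return busy_total / (workers * ticks)

if __name__ == "__main__":
    w = 100
    print("throttle =   workers:", simulate(w, w, delay=3))
    print("throttle = 2*workers:", simulate(w, 2 * w, delay=3))
```

In this toy model, with the throttle equal to the number of workers,
every completion leaves a worker idle for the length of the delay. With
twice as many jobs in flight, the queue held at the service absorbs the
delay and freed workers are refilled at once.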
Mihael