[Swift-devel] examining the plots of a 65535 job CNARI run
Ben Clifford
benc at hawaga.org.uk
Thu Sep 25 09:11:02 CDT 2008
On Wednesday, skenny ran a 65535-job run which mostly finished.
The plots are here:
http://www.ci.uchicago.edu/~benc/tmp/report-modelproc-20080924-1226-pkzripi7/
The rest of this email is rambling commentary on some of the things I see
there.
The run mostly finishes, with some number of activities outstanding: 985
according to the totals of unfinished procedure calls, 8 according to the
execute2 chart, and 11 according to the karajan statuses.
Looking at this chart, which shows karajan job submission tasks,
http://www.ci.uchicago.edu/~benc/tmp/report-modelproc-20080924-1226-pkzripi7/karatasks.JOB_SUBMISSION.sorted-start.png
there are strange things with karajan job duration. The majority of tasks
run very quickly (a few pixels wide, which is a few seconds). That's
expected.
A large number, though, take what looks to be about 2000 seconds to end
(and seemingly all are about the same duration, which may mean it's a
timeout on the task itself); and a few (about 9?) never finish (those are
the lines that extend from their respective start times all the way to
the far right of the graph).
The tasks that take about 2000 seconds look like they're going into Queued
state - looking at the plot of karajan job submission tasks in queued
state, they appear there too:
http://www.ci.uchicago.edu/~benc/tmp/report-modelproc-20080924-1226-pkzripi7/karatasks.JOB_SUBMISSION.Queue.sorted-start.png
There are a couple of interesting things here that I haven't seen before:
1. stagein/stageout oscillation
Coasters are providing plenty of cores for running tasks, with very low
scheduling latency.
In this run, the execution rate is limited by the rate at which files can
be staged in.
There is a fixed amount of file staging capacity, which is shared between
stageins and stageouts.
Once a file has been staged in, the corresponding task will be executed
almost instantly, and two seconds later a stageout task will go on the
queue.
This seems to be causing a pretty-looking oscillation in the stageout and
stagein graphs. Maybe that's a bad thing, maybe it doesn't matter.
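To illustrate the feedback that could produce that oscillation, here is a
minimal toy sketch (not Swift's actual scheduler): a single pool of
transfer slots shared by stageins and stageouts, near-instant task
execution, and a short lag before each stageout is queued. All of the
parameter values below are made up for illustration.

# Toy simulation of the stagein/stageout oscillation described above.
# Assumptions (not taken from the run): a shared pool of TRANSFER_SLOTS
# file-transfer slots, each transfer takes TRANSFER_TIME seconds, tasks
# execute almost instantly, and a stageout is queued STAGEOUT_LAG seconds
# after its stagein completes.

TRANSFER_SLOTS = 8     # shared stagein/stageout capacity (hypothetical)
TRANSFER_TIME = 10     # seconds per file transfer (hypothetical)
STAGEOUT_LAG = 2       # seconds between task start and stageout submission
TOTAL_JOBS = 200       # jobs in the toy workload (hypothetical)

def simulate(steps=600):
    stagein_queue = TOTAL_JOBS   # jobs still waiting to stage in
    stageout_queue = 0           # stageouts waiting for a transfer slot
    in_flight = []               # (finish_time, kind) for active transfers
    pending_stageouts = []       # stageouts not yet queued (ready times)
    history = []
    for t in range(steps):
        # finish transfers that are due
        done = [x for x in in_flight if x[0] <= t]
        in_flight = [x for x in in_flight if x[0] > t]
        for _, kind in done:
            if kind == 'in':
                # the task runs "almost instantly"; its stageout is
                # submitted a couple of seconds later
                pending_stageouts.append(t + STAGEOUT_LAG)
        # stageouts join the queue once their lag has passed
        stageout_queue += sum(1 for r in pending_stageouts if r <= t)
        pending_stageouts = [r for r in pending_stageouts if r > t]
        # fill free slots; stageins and stageouts compete for the same pool
        while len(in_flight) < TRANSFER_SLOTS and (stagein_queue or stageout_queue):
            if stageout_queue:           # arbitrary tie-break: stageouts first
                stageout_queue -= 1
                in_flight.append((t + TRANSFER_TIME, 'out'))
            else:
                stagein_queue -= 1
                in_flight.append((t + TRANSFER_TIME, 'in'))
        history.append((t,
                        sum(1 for x in in_flight if x[1] == 'in'),
                        sum(1 for x in in_flight if x[1] == 'out')))
    return history

for t, ins, outs in simulate()[::20]:
    print(f"t={t:4d}  active stageins={ins:2d}  active stageouts={outs:2d}")

With these toy numbers the simulation alternates between windows that are
all stageins and windows that are all stageouts, which is the same shape
as the oscillation in the plots.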
2. Execution peaks at coaster restart time.
When no coaster workers are running, stageins still happen. So when
coaster workers start up after a period with none running, there are
plenty of tasks ready to run. The coaster workers die every 1h45m (6300
seconds, due to the wall time specification) and are restarted, and that
restart is subject to gram+sge scheduling delay.
So every 6300s in the run there is a section of the active tasks graph
where the number of active tasks drops to 0 for a while and then shoots up
to around 400 tasks active at once for a very short period of time.
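As a back-of-the-envelope sketch of why the spike appears: stageins keep
completing while the workers are gone, so the backlog when new workers
appear is roughly the stagein rate times the restart gap. Only the 6300s
walltime below comes from the run; the rate and the delay are made-up
numbers for illustration.

WALLTIME = 6300        # coaster worker walltime (1h45m), from the run
STAGEIN_RATE = 0.5     # tasks finishing stagein per second (hypothetical)
RESTART_DELAY = 600    # gram+sge delay before new workers run (hypothetical)

backlog = STAGEIN_RATE * RESTART_DELAY
print(f"restart period: every {WALLTIME}s")
print(f"tasks staged in and waiting when workers return: ~{backlog:.0f}")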
In the present run, I don't think this is causing any actual delay in the
total runtime of the workflow, because coasters are not the rate limit. In
other runs with other applications, that might have a significant effect.
Coasters are able to run 400 tasks at once because of what I regard as a
bug in the way that multiple cores are supported in coasters - far too
many cores (16x too many in this case) are allocated, which means that
when there is a sudden peak in job submissions there are lots of cores
available. This shouldn't happen.
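For concreteness, here is the arithmetic implied by those two numbers (the
400-task peak from the plot and the 16x factor); the intended slot count
is inferred from them, not read from the site configuration.

observed_peak = 400          # concurrent tasks at a restart spike (from the plot)
overallocation_factor = 16   # "16x too many" cores, from the bug described above

intended_slots = observed_peak // overallocation_factor
print(f"intended worker slots: {intended_slots}")
print(f"slots available due to the bug: {intended_slots * overallocation_factor}")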
However, even if that were fixed so that it allocated the right number of
cores rather than the wrong number of nodes, I think that when there is a
sudden peak in jobs (as happens when the coaster workers all die around
the same time due to walltime), the worker manager will still end up
trying to allocate enough workers to cover that peak, even though the peak
is very unusual. So this will result in essentially wasted coaster worker
runs.