[Swift-devel] Re: coasters problem - identified?
Michael Wilde
wilde at mcs.anl.gov
Tue Mar 31 07:27:06 CDT 2009
It's possible that with only 5 jobs and the throttle settings you have
(with the sites.xml that you used from Sarah), the 5 jobs get 5
coasters started before the provider or Swift realizes that there are
already running coasters there that can be used.
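
To make the numbers concrete, here is a rough sketch of the arithmetic in
Python. It is purely illustrative: the function and its `shared` flag are
hypothetical and are not part of Swift or the coaster provider. It assumes
16-core nodes, as on Ranger, and that every allocated node is charged for
all 16 cores.

```python
import math

CORES_PER_NODE = 16  # assumption: 16-core nodes, as on Ranger


def cores_requested(jobs, coasters_per_node, shared):
    """Cores requested under two hypothetical allocation behaviours.

    shared=False: every job triggers its own coaster (one node each).
    shared=True:  jobs are packed onto already-running coasters,
                  up to coasters_per_node jobs per node.
    """
    if shared:
        nodes = math.ceil(jobs / coasters_per_node)
    else:
        nodes = jobs
    return nodes * CORES_PER_NODE


print(cores_requested(5, 16, shared=False))  # 80, i.e. the 5x16 Glen saw
print(cores_requested(5, 16, shared=True))   # 16, i.e. one node shared by all 5
```

If the first case is what happens, the 5x16 request is consistent with the
coasters being started before any of them is seen as reusable.
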
I suspect we need to test at a larger scale to see whether this is really
happening. I further suspect that in Sarah's tests, the system
was indeed using the 16 coasters per node. So it's most likely that when
all is well, that feature is working.
In your earlier tests, when you observed far more coaster jobs being
started than you had jobs in your workflow, I think the cause there
was that the coasters were failing quickly. We saw this once when you
had the wrong project number specified, but I think you saw it
again after that was corrected: a 1-job workflow ran OK (confirming
that the project ID was correct), but a slightly larger workflow seemed
to be spawning quickly-dying coasters.
On 3/31/09 2:03 AM, Glen Hocky wrote:
> it seems the problem with coasters is that it is not properly using the
> 16 cores on ranger
> I'm running a swift script which should (and does) run 5 invocations of
> runoops and swift asked for 5x16 cores even with coastersPerNode at 16