[Swift-devel] Re: coasters problem - identified?

Michael Wilde wilde at mcs.anl.gov
Tue Mar 31 07:27:06 CDT 2009


It's possible that with only 5 jobs and the throttle settings you have 
(from the sites.xml that you got from Sarah), the 5 jobs get 5 
coasters started before the provider or Swift realizes that there are 
already running coasters there that can be used.
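
To put rough numbers on that, here is a small sketch of the two cases. 
It is plain Python arithmetic, not Swift or coaster code; the function 
names are made up for illustration, and the only inputs are the 5 jobs 
and the coastersPerNode value of 16 from Glen's report.

```python
import math

# Assumption: 16 coasters (cores) per node, as in Glen's coastersPerNode setting.
CORES_PER_NODE = 16


def cores_if_shared(jobs, per_node=CORES_PER_NODE):
    """Cores requested if jobs are packed onto nodes, per_node jobs each."""
    nodes = math.ceil(jobs / per_node)
    return nodes * per_node


def cores_if_one_node_per_job(jobs, per_node=CORES_PER_NODE):
    """Cores requested if every job triggers its own full-node coaster."""
    return jobs * per_node


if __name__ == "__main__":
    print(cores_if_shared(5))            # 16: one node covers all 5 jobs
    print(cores_if_one_node_per_job(5))  # 80: the 5x16 that Glen observed
```

If the 5x16 request is only a startup race, the gap between the two 
numbers should shrink as the job count grows well past 16.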

I suspect we need to test at a larger scale to see if this is really 
happening or not. I further suspect that in Sarah's tests, the system 
was indeed using the 16 coasters per node. So it's most likely that when 
all is well, that feature is working.

In your earlier tests, when you observed coasters starting far more 
coaster jobs than you had jobs in your workflow, I think the cause there 
was that the coasters were failing quickly. We saw this once when you 
had the wrong project number specified, but then I think you saw it 
again after that was corrected, when a 1-job workflow ran OK (confirming 
that the project number was correct) but a slightly larger workflow 
seemed to be spawning quickly-dying coasters.

On 3/31/09 2:03 AM, Glen Hocky wrote:
> it seems the problem with coasters is that it is not properly using the 
> 16 cores on ranger
> I'm running a swift script which should (and does) run 5 invocations of 
> runoops and swift asked for 5x16 cores even with coastersPerNode at 16


