[Swift-devel] Standard Swift coaster behavior doesn't work well for sporadic jobs

Michael Wilde wilde at mcs.anl.gov
Mon Oct 11 20:10:36 CDT 2010


----- "Jonathan Monette" <jon.monette at gmail.com> wrote:

> Wouldn't coasters just re-submit jobs if there are no workers 
> available to process them? 

That's certainly the desired behavior for the default "automatic" mode, but it doesn't appear to be working that way, unless I've broken it with a local mod.
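
To make the expected behavior concrete, here is a toy model of it in Python. This is only a sketch of what I would expect "automatic" mode to do, not the actual coaster scheduler: the names ToyScheduler, Block, and idle_timeout are invented for illustration and do not correspond to anything in the Swift source.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    # Time at which the block's worker gives up waiting for work (hypothetical).
    expires_at: float

    def alive(self, now: float) -> bool:
        return now < self.expires_at


@dataclass
class ToyScheduler:
    # Hypothetical idle timeout, in the same units as `now`.
    idle_timeout: float
    blocks: list = field(default_factory=list)
    blocks_requested: int = 0

    def submit(self, now: float) -> Block:
        """Assign a job arriving at time `now` to a block, allocating one if needed."""
        # Forget blocks whose workers have already timed out.
        self.blocks = [b for b in self.blocks if b.alive(now)]
        if not self.blocks:
            # No live worker: request a fresh block instead of leaving
            # the job sitting in the submitted state.
            self.blocks.append(Block(expires_at=now + self.idle_timeout))
            self.blocks_requested += 1
        block = self.blocks[0]
        # Running a job resets the worker's idle timer.
        block.expires_at = now + self.idle_timeout
        return block


sched = ToyScheduler(idle_timeout=60)
first = sched.submit(now=0)     # first job: a block is requested
second = sched.submit(now=30)   # worker still alive: same block is reused
late = sched.submit(now=500)    # worker timed out long ago: a new block is requested
```

In this model the late job gets a new block; what I am seeing instead looks as if that "no live worker" branch never fires.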

- Mike


> My Montage stuff is under the assumption that coasters will submit
> more workers if they all time out.  This may be why my stuff was
> hanging before.  I'm not entirely sure, since I am working on another
> problem.


> On 10/11/2010 03:12 PM, wilde at mcs.anl.gov wrote:
> > Mihael, Justin, does the following sound like a likely coaster
> > issue:
> >
> > When using the standard Swift coaster code (not passive or
> > persistent), if I have a job that runs at the start, and then there
> > is a long delay before the next job, such that the coaster worker
> > times out, then the coaster scheduler doesn't think that there is a
> > valid block into which the job can fit, and Swift just hangs, with
> > the job in submitted state but never getting assigned to a block.
> >
> > The behavior seems similar to what you see when you try to run a job
> > that doesn't fit into any block that you have defined using the
> > coasters sites.xml parameters: in that case, too, Swift just hangs.
> >
> > Both of these situations (whether or not they are indeed due to the
> > same algorithmic issue) seem to be problems that we need to address.
> > In the first case (which is my immediate and more important problem)
> > you can see a log for the problem in ~wilde/swift-rserver-hangs,
> > along with the swift script, tc, sites, and properties file (cf).
> >
> > Thanks,
> >
> > Mike
> >
> 
> -- 
> Jon
> 
> Computers are incredibly fast, accurate, and stupid. Human beings are
> incredibly slow, inaccurate, and brilliant. Together they are powerful
> beyond imagination.
> - Albert Einstein

-- 
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory



