[Swift-devel] Standard Swift coaster behavior doesn't work well for sporadic jobs

Jonathan Monette jon.monette at gmail.com
Mon Oct 11 19:44:43 CDT 2010


  Wouldn't coasters just re-submit jobs if there are no workers 
available to process them?  My Montage stuff relies on the assumption 
that coasters will submit more workers if they all time out.  This may be 
why my stuff was hanging before.  I'm not entirely sure, though, since I am 
working on another problem.

On 10/11/2010 03:12 PM, wilde at mcs.anl.gov wrote:
> Mihael, Justin, does the following sound like a likely coaster issue:
>
> When using the standard Swift coaster code (not passive or persistent), if a job runs at the start and there is then a long delay before the next job, long enough that the coaster worker times out, the coaster scheduler doesn't think there is a valid block into which the job can fit. Swift then just hangs, with the job in the submitted state but never assigned to a block.
>
> The behavior seems similar to what you see when you try to run a job that doesn't fit into any block that you have defined using the coasters sites.xml parameters: in that case, too, Swift just hangs.
>
> Both of these situations (whether or not they are indeed due to the same algorithmic issue) seem to be problems that we need to address. In the first case (which is my immediate and more important problem) you can see a log for the problem in ~wilde/swift-rserver-hangs, along with the Swift script, tc, sites, and properties file (cf).
>
> Thanks,
>
> Mike
>

-- 
Jon

Computers are incredibly fast, accurate, and stupid. Human beings are incredibly slow, inaccurate, and brilliant. Together they are powerful beyond imagination.
- Albert Einstein
