[Swift-devel] Standard Swift coaster behavior doesn't work well for sporadic jobs

Jonathan Monette jon.monette at gmail.com
Wed Oct 20 10:39:19 CDT 2010


  Has this problem been fixed?  My scripts are still hanging: the jobs
seem to be submitted but never executed.  In the log file I see the
stage-out finish, the jobs get submitted, and then the coaster
heartbeat.

On 10/11/10 8:10 PM, Michael Wilde wrote:
> ----- "Jonathan Monette"<jon.monette at gmail.com>  wrote:
>
>> Wouldn't coasters just re-submit jobs if there are no workers
>> available to process them?
> That's certainly the desired behavior for the default "automatic" mode, but it doesn't appear to be working that way - unless I've broken it with a local mod.
>
> - Mike
>
>
>> My Montage stuff is under the assumption that coasters will submit
>> more workers if they all time out.  This may be why my stuff was
>> hanging before.  I am not entirely sure, since I am working on
>> another problem.
>
>> On 10/11/2010 03:12 PM, wilde at mcs.anl.gov wrote:
>>> Mihael, Justin, does the following sound like a likely coaster
>>> issue:
>>>
>>> When using the standard Swift coaster code (not passive or
>>> persistent), if I have a job that runs at the start, and then there
>>> is a long delay before the next job, such that the coaster worker
>>> times out, then the coaster scheduler doesn't think that there is a
>>> valid block into which the job can fit, and Swift just hangs, with
>>> the job in submitted state but never getting assigned to a block.
>>>
>>> The behavior seems similar to what you see when you try to run a
>>> job that doesn't fit into any block that you have defined using the
>>> coasters sites.xml parameters: in that case, too, Swift just hangs.
>>>
>>> Both of these situations (whether or not they are indeed due to the
>>> same algorithmic issue) seem to be problems that we need to address.
>>> In the first case (which is my immediate and more important problem)
>>> you can see a log for the problem in ~wilde/swift-rserver-hangs,
>>> along with the swift script, tc, sites, and properties file (cf).
>>>
>>> Thanks,
>>>
>>> Mike
>>>
>> -- 
>> Jon
>>
>> Computers are incredibly fast, accurate, and stupid. Human beings are
>> incredibly slow, inaccurate, and brilliant. Together they are powerful
>> beyond imagination.
>> - Albert Einstein

-- 
Jon

Computers are incredibly fast, accurate, and stupid. Human beings are incredibly slow, inaccurate, and brilliant. Together they are powerful beyond imagination.
- Albert Einstein



