[Swift-devel] 0.9 release notes draft
Ben Clifford
benc at hawaga.org.uk
Fri Apr 24 08:59:53 CDT 2009
On Fri, 24 Apr 2009, Michael Wilde wrote:
> > *** when replication is enabled, swift will locally kill jobs that have
> > run for twice their specified walltime
>
> what does "locally kill" mean?
Traditionally, the maxwalltime setting was passed to the execution layer
and would end up somewhere like PBS for enforcement deep in the stack. If
something was broken or not implemented somewhere between Swift and that
enforcer, then maxwalltimes would have no effect.
Local enforcement means that the Swift client will attempt to kill a job
that has run for 2 * its specified maxwalltime, without relying on
communication with anything outside of the Swift client.
This is tied in with replication in that it's behaviour that happens when
the Swift client believes a job has been in a particular state for too
long.
For queued state, the time is based on a simple analysis of existing queue
times, and enforcement behaviour is to launch a replica.
For active (running) state, the time is based on maxwalltime, and
enforcement behaviour is to kill the job, which will then cause a retry
(or a too-many-retries failure).
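As a rough illustration of the two cases above (this is not the actual
Swift client code; the names enforcement_action, JobState, Action and the
queue_threshold_s parameter are made up for this sketch, and the strict
"greater than" comparison is an assumption), the decision looks something
like:

```python
from enum import Enum


class JobState(Enum):
    QUEUED = "queued"
    ACTIVE = "active"


class Action(Enum):
    NONE = "none"
    LAUNCH_REPLICA = "launch_replica"
    KILL = "kill"


def enforcement_action(state, seconds_in_state, maxwalltime_s, queue_threshold_s):
    """Decide what the client does when a job sits in one state too long.

    queue_threshold_s stands in for whatever the queue-time analysis
    produces; how that value is derived is not modelled here.
    """
    if state is JobState.QUEUED:
        # Queued too long: launch a replica rather than kill anything.
        if seconds_in_state > queue_threshold_s:
            return Action.LAUNCH_REPLICA
        return Action.NONE
    if state is JobState.ACTIVE:
        # Running for more than twice the specified maxwalltime: kill the
        # job locally; the normal retry logic then takes over.
        if seconds_in_state > 2 * maxwalltime_s:
            return Action.KILL
        return Action.NONE
    return Action.NONE
```

So with a maxwalltime of 600 seconds, a job that has been active for 1201
seconds would be killed by the client, while one that has been active for
900 seconds would be left alone.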
> > *** Recompilation will happen if a .kml file was compiled with a different
>
> This was an occasional problem: does this close the last known glitch in
> avoiding recompilation, or are there any other cases where a user could get
> tripped up on this? I.e., ideally should the user not need to know that Swift
> applies this heuristic?
I think this catches everything in this area that has been catching people
in the past.