[Swift-devel] Functionality request: best effort execution
Mihael Hategan
hategan at mcs.anl.gov
Mon Jul 13 20:52:22 CDT 2009
On Mon, 2009-07-13 at 20:22 -0500, Tiberiu Stef-Praun wrote:
> I am trying to control for runaway tasks, not just to simulate them.
> The scenario is for tasks which are waiting in the queue, in which
> case the wrapper script will not be able to implement the timeout
> functionality (because the tasks are not executed yet).
> For this reason, I wanted Swift to be aware of time-limited jobs, and
> give up on them without an error message (by defaulting to a "dummy"
> output).
It would make your workflow nondeterministic, depending on the resources
you run on, up to and including giving you only dummy results without so
much as a single complaint. Are you sure this is what you want?
In a sense, with Swift, I think we're trying to eliminate the kind of
nondeterministic behavior that is common with concurrency in strict
languages, but that also means we need to restrict certain things.
I can see applications for this, since some problems are time-sensitive
(some results may only be useful if produced before a certain
deadline).
So I'm unsure about the following:
- whether this is a language issue, or something for the runtime
- whether Swift should support this kind of process control
- what the consequences of this would be to the system in general
(including but not limited to the possibility of implementing a "virtual
data" thing with it and the ability to have reproducible experiments).
- whether there is a middle ground, such as isolating side effects like
this (Ben would mention Haskell and monads about here).
>
> I am wondering if I can use globus::maxwalltime as a timeout mechanism?
maxwalltime applies to the running job (not to time spent in the queue),
so it's worse than a wrapper script: with a wrapper script you can
gracefully supply a dummy result, whereas exceeding maxwalltime results
in an error.
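To make the wrapper-script idea concrete, here is a minimal sketch in
Python of a wrapper that substitutes a dummy result when the wrapped
command runs too long. It is not taken from Swift's own wrapper; the
function name, the dummy string and the output path are all hypothetical.
Like any wrapper, it only starts counting once the job is actually
running, so it does nothing about time spent in the queue.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s, output_path, dummy="DUMMY\n"):
    """Run cmd and write its stdout to output_path.

    If cmd runs longer than timeout_s seconds, write `dummy` instead of
    failing. Returns True if the dummy result was used. (Hypothetical
    helper; exit codes are deliberately not checked in this sketch.)
    """
    try:
        proc = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s
        )
        result, timed_out = proc.stdout, False
    except subprocess.TimeoutExpired:
        # subprocess.run has already killed the child at this point.
        result, timed_out = dummy, True
    with open(output_path, "w") as f:
        f.write(result)
    return timed_out


if __name__ == "__main__":
    # Example: a command that sleeps past a 1-second limit.
    slow = [sys.executable, "-c", "import time; time.sleep(5)"]
    used_dummy = run_with_timeout(slow, 1, "out.txt")
    print("dummy used:", used_dummy)
```

Note that the caller downstream cannot tell a dummy file from a real one
unless the dummy value is chosen to be recognizable.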
> My current solution is to have a task run locally and the other one
> remotely, and to use the local task's timeout as a barrier to
> generating the dummy output or to validating the remote result as the
> proper output.
>
> I know I am pushing the limits here, that's what I pretty much do all
> the time with Swift.
I don't think this is a discussion about mechanisms, since for that
there is already a solution in Karajan called "race" (a discriminator in
"workflow" terms), which (theoretically) takes care of the cleanup,
including canceling the branches that lost and any jobs they might
have launched.
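For readers unfamiliar with the pattern, the following is a rough sketch
of a "race" in Python's asyncio. It is not Karajan code and makes no
claim about how Karajan implements it; `race`, `remote_job` and
`deadline` are hypothetical names, and the sleeps stand in for a real
remote job and a local timer. The first branch to finish wins and the
losing branches are canceled.

```python
import asyncio


async def race(*coros):
    """Return the result of the first coroutine to finish; cancel the rest."""
    tasks = [asyncio.ensure_future(c) for c in coros]
    done, pending = await asyncio.wait(
        tasks, return_when=asyncio.FIRST_COMPLETED
    )
    for t in pending:
        t.cancel()
    # Wait for the canceled branches to finish unwinding (the cleanup step).
    await asyncio.gather(*pending, return_exceptions=True)
    # If several finished together, prefer the earliest in argument order.
    winner = next(t for t in tasks if t in done)
    return winner.result()


async def remote_job(delay_s):
    # Stand-in for queue wait plus execution of a remote job.
    await asyncio.sleep(delay_s)
    return "real result"


async def deadline(timeout_s):
    # Stand-in for the local timer that supplies the dummy output.
    await asyncio.sleep(timeout_s)
    return "dummy result"


if __name__ == "__main__":
    print(asyncio.run(race(remote_job(0.05), deadline(1.0))))
    print(asyncio.run(race(remote_job(1.0), deadline(0.05))))
```

Which branch wins depends purely on timing, which is exactly the
nondeterminism discussed above: the same workflow can yield the real
result on one run and the dummy on another.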