[Swift-devel] concurrent mapper and restart
Ben Clifford
benc at hawaga.org.uk
Fri May 23 17:05:37 CDT 2008
In simple tests, files mapped through the concurrent mapper appear to be
handled correctly by the present filename-based restart mechanism.
This contradicts bug 107 comment 5:
> This fixes the latest problem, but will not recognize as done variables
> mapped by the concurrent mapper.
which I interpret to mean that concurrently mapped values will be
recomputed unnecessarily after a restart [thus leading to inefficiency
(perhaps to the extent that the workflow can never finish in a real
failure-prone environment)].
However, I'm more worried that different restarts of a workflow will have
files mapped differently, such that sometimes a file will be mapped to a
filename that was previously used for a different file in an earlier
restart (or initial run).
That would lead to a situation where workflows might appear to complete,
but would actually be jumbling up intermediate datafiles and delivering
incorrect output results, which is extremely bad.
I haven't tried this to see if I can make it happen; nor am I sure I can
(I think it's probably very sensitive to the way in which restarts interact
with foreach loops to create threads - if a restart causes threads to be
created in a different order in a foreach loop, then I think this problem
exists).
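
To make the worry concrete, here is a toy Python sketch. It is not Swift
or Karajan code: the mapper function, the "_concurrent" prefix and the
item names are all made up for illustration, and it simply assumes that
the filename suffix follows thread creation order.

```python
def map_by_creation_order(items_in_creation_order, prefix="_concurrent"):
    """Hypothetical mapper: the suffix is the thread's creation position."""
    return {item: f"{prefix}.{i}"
            for i, item in enumerate(items_in_creation_order)}

# Initial run creates foreach threads in the order a, b, c.
first_run = map_by_creation_order(["a", "b", "c"])
# After a restart the threads happen to be created in a different order.
restart = map_by_creation_order(["b", "a", "c"])

# Suppose the initial run only finished item "a" before failing.
files_on_disk = {first_run["a"]: "data for a"}

# A filename-based restart treats an item as done if its file exists.
wrongly_done = [item for item, fname in restart.items()
                if fname in files_on_disk
                and files_on_disk[fname] != f"data for {item}"]
```

In this toy model "b" is treated as already computed, even though the
file it is now mapped to holds the data for "a".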
If this really is a problem, there are two approaches to avoiding this
more serious problem that spring to mind:
i) make concurrent filenames different on each restart (with a per-restart
rather than per-kml-compilation unique identifier); this would turn the
problem into the efficiency-reducing one - unpleasant but not producing
incorrect results.
ii) the same lexical/runtime scope ID stuff that I talked about yesterday
for identifying variables might apply here. Instead of putting a karajan
thread identifier, which is potentially random, on the end of a concurrent
variable, use a SwiftScript level equivalent - scope identifiers of that
kind, which I think are recreatable no matter the order in which Karajan
evaluates things. That would give between-run repeatable mappings, which
in turn would mean filename-based restarts are perhaps not so bad anymore
in general.
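
The two approaches can be sketched roughly as follows, again in toy
Python rather than anything that exists in Swift or Karajan. The function
names, the filename layout and the representation of a scope as a tuple
of foreach iteration indices are all assumptions made for illustration.

```python
import uuid

def per_restart_name(restart_id, thread_id):
    """Approach i: a per-restart identifier is part of every filename."""
    return f"_concurrent-{restart_id}-{thread_id}"

def scope_based_name(scope_path):
    """Approach ii: the name depends only on the SwiftScript-level scope
    (here a tuple of foreach iteration indices), not on thread order."""
    return "_concurrent-" + "-".join(str(i) for i in scope_path)

# Approach i: names from different restarts never coincide, so a stale
# file cannot be mistaken for a result (but nothing is reused either).
first_run_id = uuid.uuid4().hex
restart_id = uuid.uuid4().hex
first_names = {per_restart_name(first_run_id, t) for t in range(3)}
restart_names = {per_restart_name(restart_id, t) for t in range(3)}
no_reuse = first_names.isdisjoint(restart_names)

# Approach ii: the mapping is the same whatever order the iterations
# are evaluated in.
iterations = [(0,), (1,), (2,)]
first_mapping = {p: scope_based_name(p) for p in iterations}
restart_mapping = {p: scope_based_name(p) for p in reversed(iterations)}
repeatable = first_mapping == restart_mapping
```

The first variant gives up reuse of completed work in exchange for
safety; the second keeps both, provided the scope identifiers really are
recreatable across runs.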