[Swift-devel] data dependency guts
Ben Clifford
benc at hawaga.org.uk
Wed Oct 10 17:06:54 CDT 2007
At present, procedures and the main program are compiled to karajan code
in two sections:
* a declaration section
* a statements section
When a variable is declared, the kml code to make a karajan-level variable
of the same name goes into the declaration section.
Code other than variable declarations, such as procedure calls or foreach
loops, goes into the statements section.
Statements which set the value of variables appear in either the
declaration section or the statements section, depending on their
particular nature (for example, initialisation with expressions goes into
the declaration section; initialisation with the return value of a
procedure goes into the statements section).
When this code is executed in karajan, first all the declarations are
executed in sequence; then when all of those have finished, all of the
statements are executed in parallel.
If there are data dependencies, those will cause an ordering of the
parallel statements in the statements block such that they don't actually
execute in parallel, but are instead ordered by their data dependencies.
That data dependency management in the statements section happens through
the variables which are declared in the declarations section; when a
declaration happens for a variable that doesn't have an initial value,
that variable instead stores (amongst other things) an indication that it
doesn't have a value (yet).
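A rough Python sketch of this execution model may help. It is not karajan
code: DataVar, run, producer and consumer are names made up for
illustration, and threads stand in for karajan's parallel execution. The
point it shows is that the declarations run in sequence, the statements are
all started together, and a statement that reads a variable with no value
(yet) simply waits until another statement supplies one.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class DataVar:
    """Hypothetical stand-in for a karajan-level variable."""

    def __init__(self, name):
        self.name = name
        self._value = None
        self._ready = threading.Event()  # unset means "no value (yet)"

    def set(self, value):
        self._value = value
        self._ready.set()

    def get(self, timeout=5.0):
        # Block until some other statement has supplied the value.
        if not self._ready.wait(timeout):
            raise TimeoutError(f"{self.name} never received a value")
        return self._value


def run(declarations, statements):
    env = {}
    # Declaration section: executed strictly in sequence.
    for declare in declarations:
        declare(env)
    # Statements section: all started together; blocking reads impose the order.
    with ThreadPoolExecutor(max_workers=max(1, len(statements))) as pool:
        for future in [pool.submit(stmt, env) for stmt in statements]:
            future.result()  # re-raise any failure from a statement
    return env


log = []


def consumer(env):
    value = env["a"].get()  # waits here until producer has run
    log.append("consumer")
    env["b"].set(value + 1)


def producer(env):
    log.append("producer")
    env["a"].set(41)


declarations = [
    lambda env: env.update(a=DataVar("a")),
    lambda env: env.update(b=DataVar("b")),
]
env = run(declarations, [consumer, producer])  # consumer is listed first
```

Even though consumer is listed first, log ends up as producer then
consumer, because consumer cannot pass its read of a until producer has
set it.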
Data dependency ordering of execution will only happen for statements in
the statements section, not for anything in the declaration section.
Mapper declarations also go in the declaration section. This means that
their parameters do not participate in data dependency ordering.
For example, you can't say this in the present (r1339) implementation:
  type file;
  string s;
  file f <single_file_mapper;file=s>;
  s="foo";
This will fail (with an exception) because some initialisation will happen
(or rather will attempt to happen) for f strictly before s is given a
value (by strictly before, I mean always before, rather than a race to
perhaps be before, perhaps after).
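The ordering problem can be mimicked in a few lines of Python. This is only
an illustration under an assumption: that mapper initialisation needs its
parameter immediately, at declaration time. Var, NoValueYet, read_now and
run_program are invented names, and the single_file_mapper function here is
a toy, not the real mapper.

```python
class NoValueYet(Exception):
    """Raised when a variable is read before anything has assigned it."""


class Var:
    def __init__(self, name):
        self.name = name
        self.has_value = False
        self.value = None

    def assign(self, value):
        self.value = value
        self.has_value = True

    def read_now(self):
        # No waiting here: the declaration section has no dataflow ordering.
        if not self.has_value:
            raise NoValueYet(f"{self.name} has no value yet")
        return self.value


def single_file_mapper(file):
    # Assumption: the mapper needs its parameter at initialisation time.
    return {"file": file.read_now()}


def run_program():
    # Declaration section (sequential).
    s = Var("s")
    f = single_file_mapper(file=s)  # always fails: s is still unset
    # Statements section (never reached in this sketch).
    s.assign("foo")
    return f


try:
    run_program()
    outcome = "ok"
except NoValueYet as exc:
    outcome = f"failed: {exc}"
```

The failure is deterministic: the assignment to s sits in the statements
section, which cannot start until the declaration of f has finished.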
There does not seem to be an easy solution to this that is obviously correct.
One idea I have toyed with a little is doing more compile-time analysis of
dataflow to generalise the 'declaration block -> statement block'
serialisation, so that there is more serialised/parallel specification in
the kml.
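As a rough illustration of what such an analysis might produce (an
assumption about the shape of the idea, not a design), the sketch below
takes a hand-written dependency table for the example program and groups
the steps into stages: steps within a stage could run in parallel, and
stages run one after another. The dependency table and the stages function
are hypothetical; graphlib is in the Python standard library from 3.9.

```python
from graphlib import TopologicalSorter


def stages(deps):
    """Group steps into serial stages whose members are mutually independent."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    result = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything runnable right now
        result.append(ready)
        sorter.done(*ready)
    return result


# Each step maps to the steps it depends on (written by hand for this example).
deps = {
    "declare s": [],
    "declare t": [],
    "assign s": ["declare s"],
    "declare f (mapper reads s)": ["assign s"],
}

plan = stages(deps)
```

Here the declaration of f is pushed into a third stage, after the
assignment to s, instead of being forced into a single declaration block
that always runs first.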
Another is to add a new concept of 'not mapped yet' to datasets, and
allow the same dataflow-ordered execution model that populates values to
also be used to populate mapping configuration.
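A minimal sketch of how 'not mapped yet' might behave, again in Python and
again with invented names (Dataset, declare_mapped, is_mapped). A
concurrent.futures.Future stands in for both the string variable and the
dataset's mapping configuration; declaring f no longer reads s, it only
registers interest in it.

```python
from concurrent.futures import Future


class Dataset:
    """Hypothetical dataset whose mapping may not be known yet."""

    def __init__(self, name):
        self.name = name
        self.mapping = Future()  # unresolved means 'not mapped yet'

    def is_mapped(self):
        return self.mapping.done()


def declare_mapped(name, file_param):
    dataset = Dataset(name)
    # Populate the mapping only once the parameter has received its value.
    file_param.add_done_callback(
        lambda param: dataset.mapping.set_result({"file": param.result()})
    )
    return dataset


s = Future()                 # string s;
f = declare_mapped("f", s)   # file f <single_file_mapper;file=s>;
before = f.is_mapped()       # declaration succeeded, but f is not mapped yet
s.set_result("foo")          # s="foo";
after = f.is_mapped()        # the mapping was filled in by the assignment
```

Anything that needs the actual mapping of f would wait on f.mapping in the
same way that statements already wait on values.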