[Swift-devel] on the semantics of 'array closing'

Ben Clifford benc at hawaga.org.uk
Fri Jun 15 15:55:54 CDT 2007


There's a different approach, which is to say that 'a' is a variable and 
can be assigned to once. Thus assignment syntax like a[0]=something 
becomes illegal and we need more functional language constructs. So 
instead of writing:

for e,i in input_array {
  output_array[i] = p(e);
}

we would write:

output_array = foreach i in input_array {
  return p(i);
}

(it's a Haskell map in different syntax!)

That means that, at the language level, output_array is now properly 
single assignment.
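
As a rough sketch of that idea outside Swift, here is the same shape in 
Python. This is only an illustration under assumptions: SingleAssignment, 
foreach and the doubling body of p are hypothetical names made up for the 
example, and a thread pool stands in for grid job submission. The point it 
shows is that the whole array is produced by one expression and bound once, 
so there is no window in which it is partly assigned.

```python
# Hypothetical illustration in Python, not SwiftScript.
from concurrent.futures import ThreadPoolExecutor


class SingleAssignment:
    """A variable that may be bound exactly once."""

    _UNSET = object()

    def __init__(self, name):
        self.name = name
        self._value = self._UNSET

    def assign(self, value):
        # A second binding is an error, as for any single-assignment variable.
        if self._value is not self._UNSET:
            raise RuntimeError(f"{self.name} is already assigned")
        self._value = value

    def get(self):
        if self._value is self._UNSET:
            raise RuntimeError(f"{self.name} is not assigned yet")
        return self._value


def p(e):
    # Stand-in for a grid job; here it just doubles its input.
    return e * 2


def foreach(input_array, body):
    """Apply body to every element and return the complete result at once."""
    with ThreadPoolExecutor() as pool:
        # pool.map keeps input order even though the calls may run concurrently.
        return tuple(pool.map(body, input_array))


input_array = [1, 2, 3]
output_array = SingleAssignment("output_array")
output_array.assign(foreach(input_array, p))  # the one and only assignment
print(output_array.get())
```

Because the result is an immutable tuple bound in a single step, no 
separate 'closing' step is needed in this sketch: the array is complete 
as soon as it is assigned.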


On Fri, 15 Jun 2007, Ian Foster wrote:

> Hi,
> 
> For:
> 
>  a[0] = p()
>  a[1] = q()
>  b = s(a)
> 
> I think there are two distinct issues.
> 
> a) Determining the size of the array. This could presumably be done by
> declaring it, e.g.:
> 
>  a[2] or some similar notion
>  a[0] = p()
>  a[1] = q()
>  b = s(a)
> 
> or by some "closing" concept.
> 
> b) Whether or not each element of an array is a separate single-assignment
> variable. If they are, then the code above should work just fine. If they are
> not, then we have a couple of behaviors we could define. One would be that
> b=s(a) blocks until all elements in "a" are defined. The other is that we have
> a way of "closing" (once again). In that case, we have to define what happens
> if b=s(a) accesses an element that is not defined.
> 
> Ian.
> 
> Ben Clifford wrote:
> > There is a problem that has been called the 'array closing problem'.
> > 
> > It manifests itself in the tutorial in that certain bits of code that
> > intuitively can go either in a procedure or at the top level can, in
> > practice, only go into a procedure.
> > 
> > In that context, I tried to think about better ways to explain/document the
> > behaviour than "mumble mumble move that code into a procedure".
> > 
> > In Swift we claim to have 'single assignment variables'.
> > 
> > From single assignment variables we get our grid job ordering:
> > 
> >   a = p()
> >   b = s(a)
> > 
> > causes first grid job p to run, and when that has completed, then grid job s
> > will run.
> > 
> > This is the same as if we had written:
> > 
> >   b = s(a)
> >   a = p()
> > 
> > The ordering comes from the use of a as an 'output' for p and an 'input' for
> > s, not from source text ordering.
> > 
> > In that model, it's meaningless to assign two different things to a, like
> > this:
> > 
> >   a = p()
> >   b = s(a)
> >   a = t()
> > 
> > 
> > Note that I've omitted the data types from the above. This works in the
> > implementation for simple types such as a datafile marker type.
> > 
> > What is important is that each variable is either unassigned or has its
> > single value - whenever we refer to that variable, we can either use the
> > value it has, or defer evaluation of that expression until the variable has
> > its value.
> > 
> > Now consider arrays. In the present syntax, arrays can be passed as single
> > (complex) values to/from procedures, like before:
> > 
> >   a = p()
> >   b = s(a)
> > 
> > Here a and b are array types.
> > 
> > That's fine. a is assigned to by the first statement, and b is assigned to
> > by the second statement.
> > 
> > But we also support a different assignment syntax for arrays, that looks
> > like this:
> > 
> >   a[0] = p()
> >   a[1] = q()
> >   b = s(a)
> > 
> > This fails at the moment (specifically, I think the execution engine will
> > hang).
> > 
> > Why? Because there is no one point at which we assign a value to 'a' - the
> > assignment is split over multiple statements, which can be in various places
> > (and inside loops etc).
> > 
> > There is nothing in the implementation that detects that a has been assigned
> > its value.
> > 
> > So there is this notion in the Karajan intermediate code of 'closing an
> > array'.  This is an assertion made in the object code that all assignments
> > to pieces of an array have been made - that, in effect, the array has its
> > value.
> > 
> > The suggested hack/workaround for this is to move the array element
> > assignments into a procedure:
> > 
> >  (file f[]) z() {
> >    f[0] = p();
> >    f[1] = q();
> >  }
> > 
> >  a = z()
> >  b = s(a)
> > 
> > This works (which is sort of a violation of referential transparency).
> > 
> > It works because Swift implicitly marks arrays returned from compound
> > procedures as closed (which may or may not be correct).
> > 
> > So in most variable scopes, arrays behave like single-assignment variables,
> > but each array can have one specific scope in which members can be assigned
> > to. In that scope, the array cannot be treated as a whole variable.
> > 
> > In the z() example above, that special scope is the body of z(). In the
> > previous example, that scope is the global scope, and the program is invalid
> > by the rule above that the array cannot be referred to as a whole in the
> > same place that its members are individually assigned to.
> > 
> > That's my explanation of what's going on now. I think it matches reality. I
> > don't like that this is reality, but it is what we have.
> > 
> > Comments appreciated.
> > 
> >   
> 
> 


