[Swift-devel] on the semantics of 'array closing'

Sat Jun 16 04:12:36 CDT 2007

On Fri, 2007-06-15 at 15:26 -0500, Ian Foster wrote:
> Hi,
> 
> For:
> 
>   a[0] = p()
>   a[1] = q()
>   b = s(a)
> 
> I think there are two distinct issues.
> 
> a) Determining the size of the array. This could presumably be done by 
> declaring it, e.g.:
> 
>   a[2] or some similar notion
>   a[0] = p()
>   a[1] = q()
>   b = s(a)
> 
> or by some "closing" concept.

Right!

> 
> b) Whether or not each element of an array is a separate 
> single-assignment variable.

They are. And the code above should work, provided that the a[2]
declaration marks the array as "closed".

>  If they are, then the code above should work 
> just fine. If they are not, then we have a couple of behaviors we could 
> define. One would be that b=s(a) blocks until all elements in "a" are 
> defined. The other is that we have a way of "closing" (once again). In 
> that case, we have to define what happens if b=s(a) accesses an element 
> that is not defined.

IndexOutOfBoundsException.
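
To make that concrete, here is a rough Python model of the behaviour being
discussed. It is not Swift or Karajan code, only a sketch under assumptions:
the class name ClosableArray and its set/close/get methods are made up for
illustration. Each element is single-assignment, a reader of the array blocks
until the array is closed, and reading an element that was never assigned
fails with an index error.

```python
import threading


class ClosableArray:
    """Hypothetical model of a 'closable' array (not the Swift implementation).

    Elements are single-assignment. Readers block until close() is called.
    """

    def __init__(self):
        self._items = {}
        self._closed = threading.Event()
        self._lock = threading.Lock()

    def set(self, index, value):
        # Each element may be assigned at most once, and only before closing.
        with self._lock:
            if self._closed.is_set():
                raise RuntimeError("array is closed")
            if index in self._items:
                raise RuntimeError(f"element {index} already assigned")
            self._items[index] = value

    def close(self):
        # Assert that no further element assignments will happen.
        with self._lock:
            self._closed.set()

    def get(self, index):
        # A consumer such as s(a) waits here until the array is closed.
        self._closed.wait()
        with self._lock:
            if index not in self._items:
                raise IndexError(f"element {index} was never assigned")
            return self._items[index]


a = ClosableArray()
a.set(0, "output of p")
a.set(1, "output of q")
a.close()
first = a.get(0)  # fine: element 0 was assigned before the close
```

With this model, a.get(2) after the close raises IndexError, which is the
Python analogue of the IndexOutOfBoundsException above.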

Another thing we explored mentally was the possibility of doing a simple
analysis and grouping all assignments to an array. I'll use an example:

a[0] = 1;
b = c;
a[1] = 9;
d = f(5);
a[2] = 7;

This normally gets translated into (some initializations omitted and
function names changed for clarity):
parallel(
  setarray(a, 0, 1)
  alias(b, c)
  setarray(a, 1, 9)
  set(d, f(5))
  setarray(a, 2, 7)
)

The "proposed" solution would be to translate into:
parallel(
  alias(b, c)
  set(d, f(5))
  sequential(
    parallel(
      setarray(a, 0, 1)
      setarray(a, 1, 9)
      setarray(a, 2, 7)
    )
    closearray(a)
  )
)
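
A minimal Python sketch of that regrouping, to show the shape of the
analysis. The tuple encoding of statements and the function name
group_array_writes are assumptions made for this example, not the compiler's
actual data structures. It also assumes every assignment to the array is
visible in one flat statement list, so it says nothing about assignments
inside loops or in other scopes.

```python
def group_array_writes(statements):
    """Regroup a flat list of statements (hypothetical tuple encoding).

    Array element writes are collected per array, run in parallel with each
    other, and followed by a closearray for that array. Everything else stays
    in the outer parallel block.
    """
    others = []
    writes = {}  # array name -> its setarray statements, in source order
    for stmt in statements:
        if stmt[0] == "setarray":
            writes.setdefault(stmt[1], []).append(stmt)
        else:
            others.append(stmt)
    groups = [
        ("sequential", ("parallel", *stmts), ("closearray", name))
        for name, stmts in writes.items()
    ]
    return ("parallel", *others, *groups)


program = [
    ("setarray", "a", 0, 1),
    ("alias", "b", "c"),
    ("setarray", "a", 1, 9),
    ("set", "d", "f(5)"),
    ("setarray", "a", 2, 7),
]
regrouped = group_array_writes(program)
```

For the example program, regrouped has the same nesting as the proposed
translation: alias and set at the top level, then one sequential block
holding the three setarray calls followed by closearray(a).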

Mihael

> 
> Ian.
> 
> Ben Clifford wrote:
> > There is a problem that has been called the 'array closing problem'.
> >
> > It manifests itself in the tutorial in that certain bits of code that 
> > intuitively could go either in a procedure or at the top level can, in 
> > practice, only go into a procedure.
> >
> > In that context, I tried to think about better ways to explain/document 
> > the behaviour than "mumble mumble move that code into a procedure".
> >
> > In Swift we claim to have 'single assignment variables'.
> >
> > From single assignment variables we get our grid job ordering:
> >
> >   a = p()
> >   b = s(a)
> >
> > causes first grid job p to run, and when that has completed, then grid job 
> > s will run.
> >
> > This is the same as if we had written:
> >
> >   b = s(a)
> >   a = p()
> >
> > The ordering comes from the use of a as an 'output' for p and an 'input' 
> > for s, not from source text ordering.
> >
> > In that model, it's meaningless to assign two different things to a, like 
> > this:
> >
> >   a = p()
> >   b = s(a)
> >   a = t()
> >
> >
> > Note that I've omitted the data types from the above. This works in the 
> > implementation for simple types such as a datafile marker type.
> >
> > What is important is that each variable is either unassigned or has its 
> > single value - whenever we refer to that variable, we can either use the 
> > value it has, or defer evaluation of that expression until the variable 
> > has its value.
> >
> > Now consider arrays. In the present syntax, arrays can be passed as 
> > single (complex) values to/from procedures, like before:
> >
> >   a = p()
> >   b = s(a)
> >
> > Here a and b are array types.
> >
> > That's fine. a is assigned to by the first statement, and b is assigned to 
> > by the second statement.
> >
> > But we also support a different assignment syntax for arrays, that looks 
> > like this:
> >
> >   a[0] = p()
> >   a[1] = q()
> >   b = s(a)
> >
> > This fails at the moment (specifically, I think the execution engine will 
> > hang).
> >
> > Why? Because there is no one point at which we assign a value to 'a' - the 
> > assignment is split over multiple statements, which can be in various 
> > places (and inside loops etc).
> >
> > There is nothing in the implementation that detects that a has been 
> > assigned its value.
> >
> > So there is this notion in the karajan intermediate code of 'closing an 
> > array'.  This is an assertion made in the object code that all assignments 
> > to pieces of an array have been made - that, in effect, the array has its 
> > value.
> >
> > The suggested hack/workaround for this is to move the array element 
> > assignments into a procedure:
> >
> >  (file f[]) z() {
> >    f[0] = p();
> >    f[1] = q();
> >  }
> >
> >  a = z()
> >  b = s(a)
> >
> > This works. (which is sort-of a violation of referential transparency)
> >
> > It works because Swift implicitly marks arrays returned from compound 
> > procedures as closed (which may or may not be correct).
> >
> > So in most variable scopes, arrays behave like single-assignment 
> > variables, but each array can have one specific scope in which members can 
> > be assigned to. In that scope, the array cannot be treated as a whole 
> > variable.
> >
> > In the z() example above, that special scope is the body of z(). In the 
> > previous example, that scope is the global scope, and the program is 
> > invalid by the rule above that the array cannot be referred to as a whole 
> > in the same place that its members are individually assigned to.
> >
> > That's my explanation of what's going on now. I think it matches reality. 
> > I don't like that this is reality, but it is what we have.
> >
> > Comments appreciated.
> >
> >   
>
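
As a footnote to the quoted message: the point that ordering comes from data
dependencies rather than source order can be modelled in a few lines of
Python. This is only an illustration under assumptions. The function run and
the string labels are hypothetical, and each single-assignment variable is
stood in for by a concurrent.futures.Future, which is not how Swift
implements it.

```python
import threading
from concurrent.futures import Future


def run(order):
    """Start 'a = p()' and 'b = s(a)' in the given source order."""
    a, b = Future(), Future()  # stand-ins for single-assignment variables
    log = []

    def p():
        log.append("p")
        a.set_result("p-output")

    def s():
        value = a.result()  # defers until a has its value
        log.append("s")
        b.set_result(f"s({value})")

    statements = {"a = p()": p, "b = s(a)": s}
    threads = [threading.Thread(target=statements[name]) for name in order]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log, b.result()


forward = run(["a = p()", "b = s(a)"])
reverse = run(["b = s(a)", "a = p()"])
```

Both calls record p before s, because s cannot proceed until a has been
assigned, whichever statement is started first.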