[Swift-devel] on the semantics of 'array closing'

Fri Jun 15 15:13:23 CDT 2007

There is a problem that has been called the 'array closing problem'.

It manifests itself in the tutorial in that certain bits of code that 
intuitively could go either in a procedure or at the top level can, in 
practice, only go in a procedure.

In that context, I tried to think about better ways to explain/document 
the behaviour than "mumble mumble move that code into a procedure".

In Swift we claim to have 'single assignment variables'.

From single assignment variables we get our grid job ordering:

  a = p()
  b = s(a)

causes first grid job p to run, and when that has completed, then grid job 
s will run.

This is the same as if we had written:

  b = s(a)
  a = p()

The ordering comes from the use of a as an 'output' for p and an 'input' 
for s, not from source text ordering.
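As a rough illustration of that point (in Python rather than Swift, and 
not a description of how the Swift engine is actually implemented), the 
sketch below treats a and b as futures and runs each statement in its own 
thread. The names run, p_stmt and s_stmt are made up for this example.

```python
from concurrent.futures import Future, ThreadPoolExecutor


def run(statements):
    """Run the given statements concurrently; return (job order, value of b)."""
    log = []                    # order in which the two jobs actually ran
    a, b = Future(), Future()   # stand-ins for single-assignment variables

    def p_stmt():               # models: a = p()
        log.append("p")
        a.set_result("output of p")

    def s_stmt():               # models: b = s(a)
        value = a.result(timeout=5)   # waits until a has its value
        log.append("s")
        b.set_result("s(" + value + ")")

    table = {"a = p()": p_stmt, "b = s(a)": s_stmt}
    # One worker per statement, so a waiting statement cannot starve the other.
    with ThreadPoolExecutor(max_workers=len(statements)) as pool:
        for text in statements:
            pool.submit(table[text])
    return log, b.result(timeout=5)


in_order = run(["a = p()", "b = s(a)"])
reversed_order = run(["b = s(a)", "a = p()"])
```

Both calls give the same job order, p then s, because s can only proceed 
once a has been assigned.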

In that model, it's meaningless to assign two different things to a, like 
this:

  a = p()
  b = s(a)
  a = t()

Note that I've omitted the data types from the above. This works in the 
implementation for simple types such as a datafile marker type.

What is important is that each variable is either unassigned or has its 
single value - whenever we refer to that variable, we can either use the 
value it has, or defer evaluation of that expression until the variable 
has its value.
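A minimal sketch of that rule, again in Python and under my own 
assumptions: the class SingleAssignment and its methods are hypothetical 
names for illustration, not anything in Swift or karajan.

```python
import threading


class SingleAssignment:
    """Toy model of a variable that is either unassigned or has one value."""

    def __init__(self, name):
        self.name = name
        self._lock = threading.Lock()
        self._assigned = threading.Event()
        self._value = None

    def assign(self, value):
        with self._lock:
            if self._assigned.is_set():
                raise RuntimeError(self.name + " already has its single value")
            self._value = value
            self._assigned.set()

    def is_assigned(self):
        return self._assigned.is_set()

    def read(self, timeout=None):
        # Defer until the value exists; give up after `timeout` seconds if given.
        if not self._assigned.wait(timeout):
            raise TimeoutError(self.name + " is still unassigned")
        return self._value


a = SingleAssignment("a")
state_before = a.is_assigned()      # unassigned at this point
a.assign("result of p()")           # models: a = p()
try:
    a.assign("result of t()")       # models the second, meaningless: a = t()
    second_assignment_rejected = False
except RuntimeError:
    second_assignment_rejected = True
```

This toy version rejects the second assignment outright; whether Swift 
itself should reject it, and at what stage, is a separate question.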

Now consider arrays. In the present syntax, arrays can be passed as 
single (complex) values to/from procedures, like before:

  a = p()
  b = s(a)

Here a and b are array types.

That's fine. a is assigned to by the first statement, and b is assigned to 
by the second statement.

But we also support a different assignment syntax for arrays, that looks 
like this:

  a[0] = p()
  a[1] = q()
  b = s(a)

This fails at the moment (specifically, I think the execution engine will 
hang).

Why? Because there is no one point at which we assign a value to 'a' - the 
assignment is split over multiple statements, which can be in various 
places (and inside loops etc).

There is nothing in the implementation that detects that a has been 
assigned its value.

So there is this notion in the karajan intermediate code of 'closing an 
array'.  This is an assertion made in the object code that all assignments 
to pieces of an array have been made - that, in effect, the array has its 
value.
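To make the idea of closing concrete, here is a small Python model. It is 
only a sketch under my own assumptions: ClosableArray, assign_element, 
close and read_whole are invented names, and a short timeout stands in for 
the hang described above.

```python
import threading


class ClosableArray:
    """Toy model: elements are assigned one by one, then the array is closed."""

    def __init__(self):
        self._elements = {}
        self._closed = threading.Event()

    def assign_element(self, index, value):
        if self._closed.is_set():
            raise RuntimeError("array is already closed")
        if index in self._elements:
            raise RuntimeError("element %d already assigned" % index)
        self._elements[index] = value

    def close(self):
        # The assertion that all element assignments have been made.
        self._closed.set()

    def read_whole(self, timeout):
        # Using the array as a whole value must wait for the close.
        if not self._closed.wait(timeout):
            return None     # stands in for "the engine hangs"
        return [self._elements[i] for i in sorted(self._elements)]


a = ClosableArray()
a.assign_element(0, "result of p()")     # models: a[0] = p()
a.assign_element(1, "result of q()")     # models: a[1] = q()
never_closed = a.read_whole(timeout=0.1) # nobody closed a, so no value yet
a.close()
after_close = a.read_whole(timeout=0.1)  # now a has its value
```

In this model nothing ever calls close() automatically, which is the 
situation described above for element-wise assignment.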

The suggested hack/workaround for this is to move the array element 
assignments into a procedure:

 (file f[]) z() {
   f[0] = p();
   f[1] = q();
 }

 a = z()
 b = s(a)

This works (which is sort of a violation of referential transparency).

It works because Swift implicitly marks arrays returned from compound 
procedures as closed (which may or may not be correct).

So in most variable scopes, arrays behave like single-assignment 
variables, but each array can have one specific scope in which members can 
be assigned to. In that scope, the array cannot be treated as a whole 
variable.

In the z() example above, that special scope is the body of z(). In the 
previous example, that scope is the global scope, and the program is 
invalid by the rule above that the array cannot be referred to as a whole 
in the same place that its members are individually assigned to.
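The rule can be written down as a toy check over the statements in one 
scope. This is a Python sketch using crude regular expressions, purely for 
illustration: whole_array_conflicts is a hypothetical helper, it is not 
part of Swift, and it does not parse the real language.

```python
import re

# "name[...] =" at the start of a statement: an element-wise assignment.
ELEMENT_ASSIGN = re.compile(r"^\s*(\w+)\[[^\]]*\]\s*=")
# An identifier not followed by "[" or "(": a variable used as a whole.
WHOLE_USE = re.compile(r"\b([A-Za-z_]\w*)\b(?!\s*[\[(])")


def whole_array_conflicts(scope_statements):
    """Names assigned element-wise and also used whole in the same scope."""
    element_assigned = set()
    used_whole = set()
    for stmt in scope_statements:
        match = ELEMENT_ASSIGN.match(stmt)
        if match:
            element_assigned.add(match.group(1))
        used_whole.update(WHOLE_USE.findall(stmt))
    return sorted(element_assigned & used_whole)


# The failing example: everything in the global scope.
global_scope = whole_array_conflicts(["a[0] = p()", "a[1] = q()", "b = s(a)"])

# The workaround: element assignments live in the body of z().
z_body = whole_array_conflicts(["f[0] = p();", "f[1] = q();"])
caller = whole_array_conflicts(["a = z()", "b = s(a)"])
```

Here global_scope reports a as a conflict, while z_body and caller report 
none, matching the two examples above.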

That's my explanation of what's going on now. I think it matches reality. 
I don't like that this is reality, but it is what we have.

Comments appreciated.

--