[Swift-devel] on the semantics of 'array closing'

Yong Zhao yongzh at cs.uchicago.edu
Fri Jun 15 15:40:29 CDT 2007


Yes, the case is exactly as you have described. Currently each a[i] is
closed separately, but the whole array also needs to be closed. For
instance, if s accesses only a[0] and a[1], it might go through
correctly, but if s accesses all elements of a (where it has no idea how
many there are), the workflow would hang waiting for the array to close.
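
To make the hang concrete, here is a rough toy model in Python, not Swift
or Karajan code. ModelArray, read_element, read_all and
whole_array_read_hangs are hypothetical names invented for this sketch.
It assumes each element is its own single-assignment future and that the
array carries a separate "closed" future that nothing resolves
automatically; the real engine may well differ in detail.

```python
from concurrent.futures import Future, TimeoutError as FutureTimeout


class ModelArray:
    """Toy model: one future per element plus one 'closed' future for the array."""

    def __init__(self):
        self.elements = {}       # index -> Future, one per a[i]
        self.closed = Future()   # resolved only when the array is closed

    def assign(self, index, value):
        if index in self.elements:
            raise ValueError(f"a[{index}] is already assigned")
        slot = Future()
        slot.set_result(value)   # the element itself is closed right away
        self.elements[index] = slot

    def read_element(self, index, timeout=0.1):
        # Element access only depends on that element's own future.
        return self.elements[index].result(timeout=timeout)

    def read_all(self, timeout=0.1):
        # A whole-array consumer has to wait for the array to close first,
        # because otherwise it cannot know how many elements there are.
        self.closed.result(timeout=timeout)
        return [self.elements[i].result() for i in sorted(self.elements)]

    def close(self):
        self.closed.set_result(True)


def whole_array_read_hangs(array):
    """Return True if reading the whole array times out (our stand-in for a hang)."""
    try:
        array.read_all()
        return False
    except FutureTimeout:
        return True


a = ModelArray()
a.assign(0, "output of p")
a.assign(1, "output of q")
print(a.read_element(0))           # element access goes through
print(whole_array_read_hangs(a))   # whole-array access waits on 'closed'
a.close()
print(a.read_all())                # succeeds once the array is closed
```

The timeout is only there so the sketch terminates; in the workflow there
is no timeout, so the wait never ends.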

Mihael and I talked about a closing statement, but it is unclear when it
should be executed, since the order in which the a[i] are closed is not
deterministic under parallel execution.
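
One way to sidestep the ordering question is the declared-size idea from
Ian's option (a), quoted below. The following is only a Python sketch of
that idea, not a proposal for the actual implementation: SizedArray and
run_job are hypothetical names, and the sketch assumes the array closes
itself once the declared number of elements has been assigned, whatever
order the jobs finish in.

```python
import random
import threading
import time


class SizedArray:
    """Toy array with a declared size that closes when every slot is assigned."""

    def __init__(self, size):
        self.size = size
        self.values = {}                  # index -> value
        self.lock = threading.Lock()
        self.closed = threading.Event()   # set when all slots are filled

    def assign(self, index, value):
        with self.lock:
            if not 0 <= index < self.size:
                raise IndexError(f"index {index} outside declared size {self.size}")
            if index in self.values:
                raise ValueError(f"a[{index}] is already assigned")
            self.values[index] = value
            # Closing depends only on the count, not on arrival order.
            if len(self.values) == self.size:
                self.closed.set()

    def read_all(self, timeout=None):
        if not self.closed.wait(timeout):
            raise TimeoutError("array was never closed")
        return [self.values[i] for i in range(self.size)]


def run_job(array, index, result):
    """Stand-in for a grid job that finishes after an unpredictable delay."""
    time.sleep(random.uniform(0, 0.05))
    array.assign(index, result)


a = SizedArray(2)
jobs = [
    threading.Thread(target=run_job, args=(a, 0, "output of p")),
    threading.Thread(target=run_job, args=(a, 1, "output of q")),
]
for job in jobs:
    job.start()
print(a.read_all(timeout=5))   # blocks until both elements are assigned
for job in jobs:
    job.join()
```

This only works when the size is known up front, which is exactly what we
do not have when the assignments happen inside loops.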

Yong.

On Fri, 15 Jun 2007, Ian Foster wrote:

> Hi,
>
> For:
>
>   a[0] = p()
>   a[1] = q()
>   b = s(a)
>
> I think there are two distinct issues.
>
> a) Determining the size of the array. This could presumably be done by
> declaring it, e.g.:
>
>   a[2] or some similar notion
>   a[0] = p()
>   a[1] = q()
>   b = s(a)
>
> or by some "closing" concept.
>
> b) Whether or not each element of an array is a separate
> single-assignment variable. If they are, then the code above should work
> just fine. If they are not, then we have a couple of behaviors we could
> define. One would be that b=s(a) blocks until all elements in "a" are
> defined. The other is that we have a way of "closing" (once again). In
> that case, we have to define what happens if b=s(a) accesses an element
> that is not defined.
>
> Ian.
>
> Ben Clifford wrote:
> > There is a problem that has been called the 'array closing problem'.
> >
> > It manifests itself in the tutorial in that certain bits of code that
> > intuitively could go either in a procedure or at the top level can, in
> > practice, only go into a procedure.
> >
> > In that context, I tried to think about better ways to explain/document
> > the behaviour than "mumble mumble move that code into a procedure".
> >
> > In Swift we claim to have 'single assignment variables'.
> >
> > From single assignment variables we get our grid job ordering:
> >
> >   a = p()
> >   b = s(a)
> >
> > causes first grid job p to run, and when that has completed, then grid job
> > s will run.
> >
> > This is the same as if we had written:
> >
> >   b = s(a)
> >   a = p()
> >
> > The ordering comes from the use of a as an 'output' for p and an 'input'
> > for s, not from source text ordering.
> >
> > In that model, it's meaningless to assign two different things to a, like
> > this:
> >
> >   a = p()
> >   b = s(a)
> >   a = t()
> >
> >
> > Note that I've omitted the data types from the above. This works in the
> > implementation for simple types such as a datafile marker type.
> >
> > What is important is that each variable is either unassigned or has its
> > single value - whenever we refer to that variable, we can either use the
> > value it has, or defer evaluation of that expression until the variable
> > has its value.
> >
> > Now consider arrays. In the present syntax, arrays can be passed as
> > single (complex) values to/from procedures, like before:
> >
> >   a = p()
> >   b = s(a)
> >
> > Here a and b are array types.
> >
> > That's fine. a is assigned to by the first statement, and b is assigned to
> > by the second statement.
> >
> > But we also support a different assignment syntax for arrays, that looks
> > like this:
> >
> >   a[0] = p()
> >   a[1] = q()
> >   b = s(a)
> >
> > This fails at the moment (specifically, I think the execution engine will
> > hang).
> >
> > Why? Because there is no one point at which we assign a value to 'a' - the
> > assignment is split over multiple statements, which can be in various
> > places (and inside loops etc).
> >
> > There is nothing in the implementation that detects that a has been
> > assigned its value.
> >
> > So there is this notion in the karajan intermediate code of 'closing an
> > array'.  This is an assertion made in the object code that all assignments
> > to pieces of an array have been made - that, in effect, the array has its
> > value.
> >
> > The suggested hack/workaround for this is to move the array element
> > assignments into a procedure:
> >
> >  (file f[]) z() {
> >    f[0] = p();
> >    f[1] = q();
> >  }
> >
> >  a = z()
> >  b = s(a)
> >
> > This works (which is sort of a violation of referential transparency).
> >
> > It works because Swift implicitly marks arrays returned from compound
> > procedures as closed (which may or may not be correct).
> >
> > So in most variable scopes, arrays behave like single-assignment
> > variables, but each array can have one specific scope in which members can
> > be assigned to. In that scope, the array cannot be treated as a whole
> > variable.
> >
> > In the z() example above, that special scope is the body of z(). In the
> > previous example, that scope is the global scope, and the program is
> > invalid by the rule above that the array cannot be referred to as a whole
> > in the same place that its members are individually assigned to.
> >
> > That's my explanation of what's going on now. I think it matches reality.
> > I don't like that this is reality, but it is what we have.
> >
> > Comments appreciated.
> >
> >
>
> --
>
>    Ian Foster, Director, Computation Institute
> Argonne National Laboratory & University of Chicago
> Argonne: MCS/221, 9700 S. Cass Ave, Argonne, IL 60439
> Chicago: Rm 405, 5640 S. Ellis Ave, Chicago, IL 60637
> Tel: +1 630 252 4619.  Web: www.ci.uchicago.edu.
>       Globus Alliance: www.globus.org.
>


