[Swift-devel] on the semantics of 'array closing'

Mihael Hategan hategan at mcs.anl.gov
Tue Jun 19 03:55:30 CDT 2007


On Mon, 2007-06-18 at 21:54 -0500, Ian Foster wrote:
> Interesting issue, as raised by Mike: it seems that we want to define
> a "map" function that can specify various "conditions" on the map,
> e.g., the length of time to wait for results, what to do if not all
> results are returned, etc. I wonder if others have done that before?

There are two types of errors and time-outs:
1. The ones that occur as part of the workflow:
 - Computations on some data does not return the expected files
 - The user wants to run multiple algorithms on some data and only
select the fastest one or the ones that run in a specific time.
 - etc.
2. The ones that occur because of exceptional conditions in the system:
 - A badly configured site fails to run things
 - A job sits for a long time in a queue

I think the first class can be handled in the language, but the second
should not.

Occam seems to support such things, by composing various keywords (e.g.
PAR ... FOR and then timeouts or error handling). I personally favor
composition of smaller dedicated functions/keywords to big, do-it-all
functions.

Mihael

> 
> Ian Foster wrote: 
> > I like the notion of having a "map" function. If that could entirely
> > replace the current element assignments, that would be a wonderful
> > simplification, it seems to me.
> > 
> > Ian.
> > 
> > Ben Clifford wrote: 
> > > There's a different approach, which is to asay that 'a' is a variable and 
> > > can be assigned to once. Thus assignemnt syntax like a[0]=something 
> > > becomes illegal and we need more functional language constructs. So 
> > > instead of writing:
> > > 
> > > for e,i in input_array {
> > >   output_array[i] = p(e);
> > > }
> > > 
> > > we would write:
> > > 
> > > output_array = foreach i in input_array {
> > >   return p(i);
> > > }
> > > 
> > > (its a haskell map in different syntax!)
> > > 
> > > That means that, at the language level, output_array is now properly 
> > > single assignment.
> > > 
> > > 
> > > On Fri, 15 Jun 2007, Ian Foster wrote:
> > > 
> > >   
> > > > Hi,
> > > > 
> > > > For:
> > > > 
> > > >  a[0] = p()
> > > >  a[1] = q()
> > > >  b = s(a)
> > > > 
> > > > I think there are two distinct issues.
> > > > 
> > > > a) Determining the size of the array. This could presumably be done by
> > > > declaring it, e.g.:
> > > > 
> > > >  a[2] or some similar notion
> > > >  a[0] = p()
> > > >  a[1] = q()
> > > >  b = s(a)
> > > > 
> > > > or by some "closing" concept.
> > > > 
> > > > b) Whether or not each element of an array is a separate single-assignment
> > > > variable. If they are, then the code above should work just fine. If they are
> > > > not, then we have a couple of behaviors we could define. One would be that
> > > > b=s(a) blocks until all elements in "a" are defined. The other is that we have
> > > > a way of "closing" (once again). In that case, we have to define what happens
> > > > if b=s(a) accesses an element that is not defined.
> > > > 
> > > > Ian.
> > > > 
> > > > Ben Clifford wrote:
> > > >     
> > > > > There is a problem that has been called the 'array closing problem'.
> > > > > 
> > > > > It manifests itself in the tutorial in that certain bits of code that
> > > > > intuitively can either in a procedure or in the top level can, in practice,
> > > > > only go in to a procedure.
> > > > > 
> > > > > In that context, I tried to think about better ways to explain/document the
> > > > > behaviour than "mumble mumble move that code into a procedure".
> > > > > 
> > > > > In Swift we claim to have 'single assignment variables'.
> > > > > 
> > > > > >From single assignment variables we get our grid job ordering:
> > > > > 
> > > > >   a = p()
> > > > >   b = s(a)
> > > > > 
> > > > > causes first grid job p to run, and when that has completed, then grid job s
> > > > > will run.
> > > > > 
> > > > > This is the same as if we had written:
> > > > > 
> > > > >   b = s(a)
> > > > >   a = p()
> > > > > 
> > > > > The ordering comes from the use of a as an 'output' for p and an 'input' for
> > > > > s, not from source text ordering.
> > > > > 
> > > > > In that model, its meaningless to assign two different things ta a, like
> > > > > this:
> > > > > 
> > > > >   a = p()
> > > > >   b = s(a)
> > > > >   a = t()
> > > > > 
> > > > > 
> > > > > Note that I've omitted the data types from the above. This works in the
> > > > > implementation for simple types such as a datafile marker type.
> > > > > 
> > > > > What is important is that each variable is either unassigned or has its
> > > > > single value - whenever we refer to that variable, we can either use the
> > > > > value it has, or defer evaluation of that expression until the variable has
> > > > > its value.
> > > > > 
> > > > > Now consider arrays. In the present syntax, arrays can be passed as single
> > > > > (complex) values to/from procedures, like before:
> > > > > 
> > > > >   a = p()
> > > > >   b = s(a)
> > > > > 
> > > > > Here a and b are array types.
> > > > > 
> > > > > That's fine. a is assigned to by the first statement, and b is assigned to
> > > > > by the second statement.
> > > > > 
> > > > > But we also support a different assignment syntax for arrays, that looks
> > > > > like this:
> > > > > 
> > > > >   a[0] = p()
> > > > >   a[1] = q()
> > > > >   b = s(a)
> > > > > 
> > > > > This fails at the moment (specifically, I think the execution engine will
> > > > > hang).
> > > > > 
> > > > > Why? Because the is no one point at which we assign a value to 'a' - the
> > > > > assignment is split over multiple statements, which can be in various places
> > > > > (and inside loops etc).
> > > > > 
> > > > > There is nothing in the implementation that detects that a has been assigned
> > > > > its value.
> > > > > 
> > > > > So there is this notion in the karajan intermediate code of 'closing an
> > > > > array'.  This is an assertion made in the object code that all assignments
> > > > > to pieces of an array have been made - that, in affect, the array has its
> > > > > value.
> > > > > 
> > > > > The suggested hack/workaround for this is to move the array element
> > > > > assignments into a procedure:
> > > > > 
> > > > >  (file f[]) z() {
> > > > >    f[0] = p();
> > > > >    f[1] - q();
> > > > >  }
> > > > > 
> > > > >  a = z()
> > > > >  b = s(a)
> > > > > 
> > > > > This works. (which is sort-of a violation of referential transparency)
> > > > > 
> > > > > It works because Swift implicitly marks arrays returned from compound
> > > > > procedures as closed (which may or may not be correct).
> > > > > 
> > > > > So in most variable scopes, arrays behave like single-assignment variables,
> > > > > but each array can have one specific scope in which members can be assigned
> > > > > to. In that scope, the array cannot be treated as a whole variable.
> > > > > 
> > > > > In the z() example above, that special scope is the body of z(). In the
> > > > > previous example, that scope is the global scope, and the program is invalid
> > > > > by the rule above that the array cannot be referred to as a whole in the
> > > > > same place that its members are individually assigned to.
> > > > > 
> > > > > That's my explanation of what's going on now. I think it matches reality. I
> > > > > don't like that this is reality, but it is what we have.
> > > > > 
> > > > > Comments appreciated.
> > > > > 
> > > > >   
> > > > >       
> > > 
> > >   
> > 
> > -- 
> > 
> >    Ian Foster, Director, Computation Institute
> > Argonne National Laboratory & University of Chicago
> > Argonne: MCS/221, 9700 S. Cass Ave, Argonne, IL 60439
> > Chicago: Rm 405, 5640 S. Ellis Ave, Chicago, IL 60637
> > Tel: +1 630 252 4619.  Web: www.ci.uchicago.edu.
> >       Globus Alliance: www.globus.org.
> >   
> > 
> > ____________________________________________________________________
> > 
> > _______________________________________________
> > Swift-devel mailing list
> > Swift-devel at ci.uchicago.edu
> > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel
> >   
> 
> -- 
> 
>    Ian Foster, Director, Computation Institute
> Argonne National Laboratory & University of Chicago
> Argonne: MCS/221, 9700 S. Cass Ave, Argonne, IL 60439
> Chicago: Rm 405, 5640 S. Ellis Ave, Chicago, IL 60637
> Tel: +1 630 252 4619.  Web: www.ci.uchicago.edu.
>       Globus Alliance: www.globus.org.
> _______________________________________________
> Swift-devel mailing list
> Swift-devel at ci.uchicago.edu
> http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel




More information about the Swift-devel mailing list