[Swift-devel] Standard Library Proposal/Discussion Page

Wed Jan 21 17:09:54 CST 2015

Function pointers are probably not a short-term addition - might be simpler
just to add functions we think are important and have function pointers as
a longer-term addition.

- Tim

On Wed, Jan 21, 2015 at 5:02 PM, Tim Armstrong <tim.g.armstrong at gmail.com>
wrote:

> I see your point with dense/sparse array errors - there is the prospect of
> confusing errors in either case.  I think with dense arrays, the intent is
> clearer - the users knows they have to assign elements [0..N-1] and the
> function knows it can expect elements [0..N-1].  With sparse arrays the
> user has might readily expect the function to "do the right thing" with the
> input, since the types check out - getting back to the root cause of the
> problem would require more steps of inference to get back to the root
> problem - that the function doesn't do what they expect.
>
> That transformation of adding an implicit else branch is really just a way
> of implementing sequential semantics by adding data dependencies to
> implement the sequential control dependency.  The other example of closing
> an array is a pure data dependency on the entire value of A rather than a
> control flow dependency.
>
> I don't see Swift as a sequential language that we're autoparallelising
> though.  For one, we let statements with side-effects execute out of order
> as a general rule.  Constructs like foreach are also explicitly allowed to
> execute out of order.  I.e. there is no total order in the program in which
> statements have to execute, there's just a partial order defined by data
> flow.  To work out which version of a variable is being referred to
> requires defining a total order.
>
> For example, if we involve loops:
>
> versioned x = 1;
>
> foreach i in A {
>    if (rand()) {
>      x = i;
>    }
> }
>
> I'm going to assume the intended behavior is that the loop iterations
> execute in some sequential order and each iteration depends on the
> previous.  But this requires that a) the keys of A have a defined order,
> and b) we don't start iterating over A until we know what the "first"
> element is, i.e. essentially until A is closed.
>
> But that doesn't really square with the idea that iterations of a foreach
> loop are independent that previously existed.   This seems to require that
> the presence of a variable of a particular type changes the semantics of an
> enclosing control flow structure. This is really bizarre from a language
> design point of view.
>
> I just think if we want sequential control flow we need to be realistic
> about the pros and cons that entails.
>
> - Tim
>
>
> On Wed, Jan 21, 2015 at 3:55 PM, Mihael Hategan <hategan at mcs.anl.gov>
> wrote:
>
>> On Wed, 2015-01-21 at 15:04 -0600, Tim Armstrong wrote:
>> > I'm assuming that the size of a dense array would be specified when its
>> > initialized.  Then a dense array of size n behaves the same as n
>> individual
>> > variables - there's no reason you have to assign them in order, or
>> iterate
>> > over them in order unless that's what you actually want/need to do.  I
>> > agree that trying to have dense arrays that automatically expand in a
>> > parallel language is a bad idea.
>>
>> I think what I said was in reply to your question whether dense arrays
>> would solve the problem. The answer is it would probably help, but it
>> would not lead to ideal solutions. The point would have been that you
>> can use them to emulate tail() by passing a starting index to a
>> recursive function. You need a deterministic way of going through all
>> the indices. Starting at 0 and incrementing is just the most obvious
>> example. However, the creation of the elements is nondeterministic so
>> the order in which elements are created and the order in which an
>> algorithm would "iterate" through such an array would never be the same
>> except for unlikely accidents.
>>
>> The way things are iterated now is with foreach which is also
>> nondeterministic in the way it picks the order, but it roughly matches
>> the order of production.
>>
>> >
>> > I think it's really worth considering, because for a lot of standard
>> > library functions, bad or confusing things will happen if arguments are
>> > sparse arrays.  Of course we can document all of these functions to warn
>> > users about doing that, but that requires them to a) actually read the
>> docs
>> > b) understand the distinction c) remember this when they're writing code
>> > and d) be able to trace weird deadlocks or error messages back to the
>> > original cause when they make an inadvertent mistake.  I don't mean
>> this as
>> > an insult to users - I could see myself doing all of those things.  It
>> > seems very suboptimal.
>>
>> Can you be more specific about "bad or confusing things"? One obvious
>> thing is that a piece of code will access an array element that no other
>> piece of code will ever write. I think this can be detected at array
>> closing time. The same applies to dense arrays: you cannot in general
>> guarantee that all elements in a dense array will be assigned. So you
>> are back to square one.  In order to avoid deadlocks, you would still
>> need to track all writes and determine when no more writes to the sparse
>> array will happen (essentially the current closing semantics) and then
>> check if any readers are blocked on the array.
>>
>> >
>> > This is really a separate discussion that doesn't affect the standard
>> > library, but I think the updateable idea, at least in the form that was
>> > proposed on the phone call, is flawed and actually is equivalent to
>> > changing Swift from a language with parallel semantics to a language
>> with
>> > sequential semantics that we're trying to auto-parallelise.  I don't
>> think
>> > we want to try to solve the autoparallelisation problem.
>>
>> That is what we are doing IMO and we are letting futures do the work for
>> us. An elegant solution that avoids complex compilers to a large extent.
>> I do, however, agree that we should not try to solve the problem of
>> auto-parallelizing C-like languages unless it turns out that we can do
>> so *easily* and there is a reasonable justification for it.
>>
>> >  It's easy to
>> > autoparallelise trivial examples but as soon as a programmer uses
>> something
>> > exotic like an if statement (or an if statement inside a loop, or many
>> more
>> > possibilities) it becomes much harder.  I can provide examples that will
>> > break any specific approach if needed.  For example, the "versioned"
>> > variables idea doesn't work for this code:
>> >
>> > versioned int x = 1;
>> >
>> > if (...) {
>> >   x = 2;
>> > }
>> >
>> > trace(x);
>> >
>> > Does x at the bottom refer to version 0 of the variable or version 1 of
>> the
>> > variable?  You can't determine that statically.  You might say that you
>> > could define the semantics so that the version x it's referring to
>> depends
>> > on which branch is taken at runtime.
>>
>> Perhaps. Your example reminds me of the issue of keeping track of the
>> number of writers.
>>
>> int x = 1;
>>
>> if (...) {
>>   x = 2;
>> }
>>
>> After how many writes do you close x?
>>
>> This is currently solved by the insertion of a balancing else branch:
>>
>> if (...) {
>>   x = 2;
>> }
>> else {
>>   partialClose(x);
>> }
>>
>> In other words, the compiler keeps track of how many writes are in both
>> branches and inserts closing statements to bring them in sync. I don't
>> see why that couldn't be used for other purposes:
>>
>> if (...) {
>>   x = 2;
>> }
>> else {
>>   x = x;
>> }
>>
>> It is in a sense what your example code means: "if the condition is
>> true, make x two, otherwise implicitly let x be the same as before".
>>
>>
>> >   But then you've reinvented sequential
>> > control flow and all that entails.  Once you get loops involved it is
>> much
>> > worse too - it's probably not that hard to come up with an extension
>> that
>> > works ok for this specific example, but adding loops to the equation
>> breaks
>> > it further.
>>
>> Maybe. Or maybe it's in the same class of problems as the writes. I'm
>> not sure. I was floating the idea by.
>>
>> How about just providing function pointers and reduce()? Solves min/max,
>> sum, avg, etc. Does not solve finding all sub-strings in a string.
>>
>> Mihael
>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/swift-devel/attachments/20150121/8c19856c/attachment.html>