<div dir="ltr"><div>Function pointers are probably not a short-term addition - might be simpler just to add functions we think are important and have function pointers as a longer-term addition.<br><br></div>- Tim<br></div><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Jan 21, 2015 at 5:02 PM, Tim Armstrong <span dir="ltr"><<a href="mailto:tim.g.armstrong@gmail.com" target="_blank">tim.g.armstrong@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">I see your point with dense/sparse array errors - there is the prospect of confusing errors in either case. I think with dense arrays, the intent is clearer - the user knows they have to assign elements [0..N-1] and the function knows it can expect elements [0..N-1]. With sparse arrays the user might readily expect the function to "do the right thing" with the input, since the types check out - tracing the resulting error back to the root problem - that the function doesn't do what they expect - would require more steps of inference.<br><div><br>That transformation of adding an implicit else branch is really just
a way of implementing sequential semantics: the sequential control dependency is encoded as data dependencies. The
other example of closing an array is a pure data dependency on the
entire value of A rather than a control flow dependency.<br><br>I don't see Swift as a sequential language that we're autoparallelising though. For one, we let statements with side-effects execute out of order as a general rule. Constructs like foreach are also explicitly allowed to execute out of order. I.e. there is no total order in the program in which statements have to execute, there's just a partial order defined by data flow. To work out which version of a variable is being referred to requires defining a total order.<br><div><br>For example, if we involve loops:<br><div><div><div><div><div><br>versioned x = 1;<br><div><br>foreach i in A {<br></div><div> if (rand()) {<br></div><div> x = i;<br></div><div> }<br>}<br><br></div><div>I'm going to assume the intended behavior is that the loop iterations execute in some sequential order and each iteration depends on the previous. But this requires that a) the keys of A have a defined order, and b) we don't start iterating over A until we know what the "first" element is, i.e. essentially until A is closed.<br><br>But that doesn't really square with the previously established idea that iterations of a foreach loop are independent. This seems to require that the presence of a variable of a particular type changes the semantics of an enclosing control flow structure. 
This is really bizarre from a language design point of view.<br><br></div><div>I just think if we want sequential control flow we need to be realistic about the pros and cons that entails.<span class="HOEnZb"><font color="#888888"><br></font></span></div><span class="HOEnZb"><font color="#888888"><div><br></div><div>- Tim<br></div><div><br></div></font></span></div></div></div></div></div></div></div></div><div class="HOEnZb"><div class="h5"><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Jan 21, 2015 at 3:55 PM, Mihael Hategan <span dir="ltr"><<a href="mailto:hategan@mcs.anl.gov" target="_blank">hategan@mcs.anl.gov</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span>On Wed, 2015-01-21 at 15:04 -0600, Tim Armstrong wrote:<br>
> I'm assuming that the size of a dense array would be specified when it's<br>
> initialized. Then a dense array of size n behaves the same as n individual<br>
> variables - there's no reason you have to assign them in order, or iterate<br>
> over them in order unless that's what you actually want/need to do. I<br>
> agree that trying to have dense arrays that automatically expand in a<br>
> parallel language is a bad idea.<br>
<br>
</span>I think what I said was in reply to your question whether dense arrays<br>
would solve the problem. The answer is it would probably help, but it<br>
would not lead to ideal solutions. The point would have been that you<br>
can use them to emulate tail() by passing a starting index to a<br>
recursive function. You need a deterministic way of going through all<br>
the indices. Starting at 0 and incrementing is just the most obvious<br>
example. However, the creation of the elements is nondeterministic so<br>
the order in which elements are created and the order in which an<br>
algorithm would "iterate" through such an array would never be the same<br>
except for unlikely accidents.<br>
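A rough Python sketch (hypothetical names, not Swift syntax) of the tail()-emulation described above: rather than structurally splitting the array, a recursive function walks a fixed-size array by passing a starting index, which yields a deterministic traversal order regardless of the order in which elements were created.

```python
# Hypothetical sketch: emulating tail() over a fixed-size ("dense") array.
# The "tail" of the array at index i is simply the recursion at index i + 1,
# so traversal order is fixed by the indices, not by creation order.
def process_from(a, i, f):
    """Apply f to a[i], then recurse on the 'tail' starting at i + 1."""
    if i >= len(a):
        return
    f(a[i])
    process_from(a, i + 1, f)

out = []
process_from([10, 20, 30], 0, out.append)
```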
<br>
The way things are iterated now is with foreach, which is also<br>
nondeterministic in the way it picks the order, but it roughly matches<br>
the order of production.<br>
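By way of analogy (in Python, not Swift), this is the behavior of concurrent.futures.as_completed: iteration order is nondeterministic but roughly tracks the order in which elements are produced, rather than their index order.

```python
# Analogy for foreach's iteration order: as_completed yields futures in
# completion ("production") order, which need not match submission order.
import concurrent.futures
import random
import time

def produce(i):
    time.sleep(random.uniform(0, 0.05))  # elements become ready at arbitrary times
    return i

with concurrent.futures.ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(produce, i) for i in range(8)]
    # 'seen' holds the same elements as range(8), but in completion order.
    seen = [f.result() for f in concurrent.futures.as_completed(futures)]
```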
<span><br>
><br>
> I think it's really worth considering, because for a lot of standard<br>
> library functions, bad or confusing things will happen if arguments are<br>
> sparse arrays. Of course we can document all of these functions to warn<br>
> users about doing that, but that requires them to a) actually read the docs<br>
> b) understand the distinction c) remember this when they're writing code<br>
> and d) be able to trace weird deadlocks or error messages back to the<br>
> original cause when they make an inadvertent mistake. I don't mean this as<br>
> an insult to users - I could see myself doing all of those things. It<br>
> seems very suboptimal.<br>
<br>
</span>Can you be more specific about "bad or confusing things"? One obvious<br>
thing is that a piece of code will access an array element that no other<br>
piece of code will ever write. I think this can be detected at array<br>
closing time. The same applies to dense arrays: you cannot in general<br>
guarantee that all elements in a dense array will be assigned. So you<br>
are back to square one. In order to avoid deadlocks, you would still<br>
need to track all writes and determine when no more writes to the sparse<br>
array will happen (essentially the current closing semantics) and then<br>
check if any readers are blocked on the array.<br>
<span><br>
><br>
> This is really a separate discussion that doesn't affect the standard<br>
> library, but I think the updateable idea, at least in the form that was<br>
> proposed on the phone call, is flawed and actually is equivalent to<br>
> changing Swift from a language with parallel semantics to a language with<br>
> sequential semantics that we're trying to auto-parallelise. I don't think<br>
> we want to try to solve the autoparallelisation problem.<br>
<br>
</span>That is what we are doing IMO and we are letting futures do the work for<br>
us. It's an elegant solution that largely avoids the need for complex<br>
compiler analysis.<br>
I do, however, agree that we should not try to solve the problem of<br>
auto-parallelizing C-like languages unless it turns out that we can do<br>
so *easily* and there is a reasonable justification for it.<br>
<span><br>
> It's easy to<br>
> autoparallelise trivial examples but as soon as a programmer uses something<br>
> exotic like an if statement (or an if statement inside a loop, or many more<br>
> possibilities) it becomes much harder. I can provide examples that will<br>
> break any specific approach if needed. For example, the "versioned"<br>
> variables idea doesn't work for this code:<br>
><br>
> versioned int x = 1;<br>
><br>
> if (...) {<br>
> x = 2;<br>
> }<br>
><br>
> trace(x);<br>
><br>
> Does x at the bottom refer to version 0 of the variable or version 1 of the<br>
> variable? You can't determine that statically. You might say that you<br>
> could define the semantics so that the version x it's referring to depends<br>
> on which branch is taken at runtime.<br>
<br>
</span>Perhaps. Your example reminds me of the issue of keeping track of the<br>
number of writers.<br>
<span><br>
int x = 1;<br>
<br>
if (...) {<br>
x = 2;<br>
}<br>
<br>
</span>After how many writes do you close x?<br>
<br>
This is currently solved by the insertion of a balancing else branch:<br>
<span><br>
if (...) {<br>
x = 2;<br>
}<br>
</span>else {<br>
partialClose(x);<br>
}<br>
<br>
In other words, the compiler keeps track of how many writes are in both<br>
branches and inserts closing statements to bring them in sync. I don't<br>
see why that couldn't be used for other purposes:<br>
<span><br>
if (...) {<br>
x = 2;<br>
}<br>
</span>else {<br>
x = x;<br>
}<br>
<br>
It is in a sense what your example code means: "if the condition is<br>
true, make x two, otherwise implicitly let x be the same as before".<br>
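A rough Python sketch (hypothetical names, not the actual compiler machinery) of this write-counting scheme: a cell closes once the expected number of writes has arrived, and partial_close() stands in for the write that the untaken branch never performs, keeping the two branches balanced.

```python
# Hypothetical sketch of write-counting with compiler-inserted balancing.
class Cell:
    def __init__(self, expected_writes):
        self.remaining = expected_writes
        self.value = None
        self.closed = False

    def _count(self):
        self.remaining -= 1
        if self.remaining == 0:
            self.closed = True       # no further writes can arrive

    def write(self, v):
        self.value = v
        self._count()

    def partial_close(self):
        # Stands in for the write the untaken branch skips, so both
        # branches contribute the same number of "write events".
        self._count()

def run(cond):
    x = Cell(expected_writes=1)      # one static write site in the if/else pair
    if cond:
        x.write(2)
    else:
        x.partial_close()            # compiler-inserted balancing branch
    return x
```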
<span><br>
<br>
> But then you've reinvented sequential<br>
> control flow and all that entails. Once you get loops involved it is much<br>
> worse too - it's probably not that hard to come up with an extension that<br>
> works ok for this specific example, but adding loops to the equation breaks<br>
> it further.<br>
<br>
</span>Maybe. Or maybe it's in the same class of problems as the writes. I'm<br>
not sure. I was just floating the idea.<br>
<br>
How about just providing function pointers and reduce()? Solves min/max,<br>
sum, avg, etc. Does not solve finding all substrings in a string.<br>
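In Python terms (an analogy, not Swift syntax), this is just functools.reduce with the combining function passed as an argument - the "function pointer" - which is exactly enough for min/max/sum/avg:

```python
# Analogy: reduce() plus a function argument covers the folds listed above.
from functools import reduce

def fold(f, xs, init):
    # f plays the role of the function pointer passed to reduce()
    return reduce(f, xs, init)

xs = [3, 1, 4, 1, 5]
total   = fold(lambda a, b: a + b, xs, 0)
minimum = fold(lambda a, b: min(a, b), xs, xs[0])
average = total / len(xs)
```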
<span><font color="#888888"><br>
Mihael<br>
<br>
</font></span></blockquote></div><br></div>
</div></div></blockquote></div><br></div>