[Swift-user] How to wait on functions that return no data?

Michael Wilde wilde at mcs.anl.gov
Tue Mar 25 10:45:55 CDT 2008


Your view has merits in terms of language purity, but I disagree with it.

This was posed as an academic question, and I think it's interesting to 
discuss.

The point here is that there's an application that could best be done by 
batching up its output, and in fact perhaps by using the map-reduce 
representation of tuples for that output.

It's still driven by dataflow and data dependencies, just not the 
simplistic lock-step dependencies that Swift implements today.

For example, one way to address the problem is to note that batching of 
function calls, as Swift does today, helps but overlooks the fact that 
small tasks often have small data inputs and outputs, and that this data 
should be batched along with the job execution.

That would leave swift language semantics unchanged, but the 
implementation would get more efficient and could handle finer-grained 
tasks.

An even more efficient and interesting approach, fully in keeping with 
the language as it stands today, would be to allow tuples to be 
expressed as inputs and outputs, and to have swift efficiently and 
automatically route (and batch) tuples in and out of jobs.

So I view what I was asking for here as a prototype or exploration of 
that direction.  It would be good to test the performance of an 
implementation that streamed output tuples into a subsequent ("reduce") 
stage of processing, before we even consider what the language and/or 
implementation would need to do for such a case.


On 3/25/08 10:23 AM, Mihael Hategan wrote:
...
 > Don't use Swift then. Seriously. If you don't want to express things in
 > a dataflow oriented way, and are not satisfied with its performance for
 > the given problem, don't use it.

I want to express things as dataflow, with high performance, in Swift.
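
For reference, a minimal and untested sketch of the dataflow form 
suggested in the quoted reply (results = doall(); collectResults(results);). 
It assumes runam() is changed to return a small per-call marker file, so 
that there is data to wait on. The marker type, the "collect" app, and 
the names m and done are hypothetical, and the mappings are left as 
placeholders exactly as in the original example below.

```swiftscript
// Sketch only: marker, collect, and done are assumed names.
type amout;
type marker;   // hypothetical: one small output file per runam() call

type params {
   string id;
   string p1;
   string p2;
}

// runam now declares an output, so each call yields data to depend on.
(marker m) runam (string id, string p1, string p2)
{
   app { runam3 id p1 p2 stdout=@filename(m); }
}

// doall returns the array of markers; the array is complete only
// once every runam() call in the foreach has produced its element.
(marker m[]) doall (params p[])
{
   foreach pset, i in p {
     m[i] = runam(pset.id, pset.p1, pset.p2);
   }
}

// collectResults takes the markers as input, so it has a data
// dependency on all of the runam() calls.
(amout a) collectResults (marker m[])
{
   app { collect @filenames(m) stdout=@filename(a); }
}

// Main
params p[];
p = readdata("paramlist");
marker done[] <some mapping>;
amout amdata <some mapping>;
done = doall(p);
amdata = collectResults(done);
```

This keeps the ordering purely data-driven, at the cost of one small 
output file per invocation, which is the overhead the batching 
discussion above is concerned with.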

Mike


On 3/25/08 10:23 AM, Mihael Hategan wrote:
> On Tue, 2008-03-25 at 10:14 -0500, Michael Wilde wrote:
>>>> In the example below, I want collectResults() to get invoked after all
>>>> the runam() calls complete in doall().
>>>
>>> results = doall();
>>> collectResults(results);
>>>
>>> Mihael
>>
>> But that's the problem: doall() does not in this example return results. 
> 
> Then it should be fixed.
> 
>> If it were to return an artificial result, how would we get such a return 
>> to wait until all the runam() calls made within the foreach() have completed?
>>
>> Each of the runam() call runs a small model, and in this proposed 
>> scenario would leave those results on a local disk for later collection, 
>> either in a single shared file that many invocations would append to, or 
>> in a set of files.
> 
> I don't think the solution to performance problems in Swift is to hack
> stuff like that.
> 
>> Then collectResults() would run a job that collects all the data when done.
>>
>> One approach would be to have collectResults() just run iteratively until 
>> it has collected a sufficient number of results, i.e., to have it not 
>> depend on Swift to find out when all the runam() calls have completed. 
>> That might work.
> 
> Don't use Swift then. Seriously. If you don't want to express things in
> a dataflow oriented way, and are not satisfied with its performance for
> the given problem, don't use it.
> 
> Mihael
> 
>> - Mike
>>
>>
>> On 3/25/08 10:00 AM, Mihael Hategan wrote:
>>> On Tue, 2008-03-25 at 09:46 -0500, Michael Wilde wrote:
>>>> For the petro-model app I'm working on, it would be interesting to run 
>>>> the parameter sweep in "map reduce" manner, in which each invocation 
>>>> bites off a portion of the parameter space and processes it, resulting 
>>>> in a set of result tuples. Each run of the model will produce a set of 
>>>> tuples, and when all are done, we want to aggregate and plot the tuples.
>>>>
>>>> While with batching this is not strictly needed, it would be interesting 
>>>> to let the model results accumulate on the local filesystem (as in this 
>>>> case they are small) and collect them either at the end of the run, or 
>>>> periodically and perhaps asynchronously during the run.
>>>>
>>>> To do this, we'd want to write the model invocation as a swift function 
>>>> with only scalar numeric parameters, and no output.
>>> That assertion I'm not sure about.
>>>
>>>> The question is how to call a zero-returns function in a Swift foreach() 
>>>> loop, and embed that foreach() in a function that doesn't return until 
>>>> all members of the foreach() have been processed.
>>> The very notion of "return" as it would appear in a strict language
>>> doesn't make much sense in Swift, so I'm not quite sure.
>>>
>>>> I haven't tried to code this yet, because I can't think of a way to 
>>>> express it in Swift, due to the data-dependency semantics.
>>>>
>>>> In the example below, I want collectResults() to get invoked after all 
>>>> the runam() calls complete in doall().
>>> results = doall();
>>> collectResults(results);
>>>
>>> Mihael
>>>
>>>> Anyone have any ideas?
>>>>
>>>> This is a low-priority question, just food for thought, as the batched 
>>>> way of running this parameter sweep should be straightforward and efficient.
>>>>
>>>> Mike
>>>>
>>>>
>>>>
>>>> // Amiga-Mars Parameter Sweep
>>>>
>>>> type amout;
>>>>
>>>> runam (string id , string p1, string p2) // no ret val
>>>> {
>>>>    app { runam3 id p1 p2 ; }
>>>> }
>>>>
>>>> type params {
>>>>    string id;
>>>>    string p1;
>>>>    string p2;
>>>> };
>>>>
>>>> doall(params p[])
>>>> {
>>>>    foreach pset in p {
>>>>      runam(pset.id, pset.p1, pset.p2);
>>>>    }
>>>>    // waitTillAllDone();
>>>>    // want to block here till all above finish,
>>>>    // but no data to wait on.  any way to
>>>>    // achieve this???
>>>> }
>>>>
>>>> // Main
>>>>
>>>> params p[];
>>>> p = readdata("paramlist");
>>>> doall(p);
>>>> amout amdata <some mapping>;
>>>> amdata = collectResults();
>>>>
>>>> // ^^^ Want collectResults() to run AFTER all runam() calls finish
>>>> //     in the doall() function.
>>>>
>>>
> 
> 


