[Swift-user] Reduction trees

Mihael Hategan hategan at mcs.anl.gov
Mon May 19 16:54:27 CDT 2014


On Mon, 2014-05-19 at 21:40 +0000, Bronevetsky, Greg wrote:
> I'm actually not sure how to write such code in Swift so let me
> describe the task I am interested in. I have an array of files that
> contain the results of individual computation tasks. I wish to
> aggregate these files into one summary file. Swift makes it easy to
> pass the entire array to a single task to aggregate. However, with
> many files I reach the limit of the command line length constraints
> and further, the aggregator task takes too long. As such, I need to
> create a reduction tree, where the first n files are aggregated by one
> task, the next n by another and so on. Then the results of these
> aggregation tasks are themselves aggregated and so on until I have a
> single file that contains the aggregation of all the tasks.

We have been talking about this issue since the early days of Swift. One
idea suggested was to be able to declare certain apps as associative
reduction steps and have the splitting done automatically, but it was
speculative and didn't quite materialize.
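
To make the shape of such a tree concrete, here is a rough sketch in
plain Python, not Swift. It only illustrates the grouping: the names
tree_reduce, combine and n are placeholders, and it assumes the
reduction step is associative, so partial results can be reduced again.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")


def tree_reduce(items: List[T], combine: Callable[[List[T]], T], n: int) -> T:
    """Reduce items in groups of at most n until a single value remains."""
    if n < 2:
        raise ValueError("n must be at least 2")
    if not items:
        raise ValueError("nothing to reduce")
    level = list(items)
    while len(level) > 1:
        # Each group of up to n inputs would be one independent reduce task.
        level = [combine(level[i:i + n]) for i in range(0, len(level), n)]
    return level[0]


# Example: summing ten numbers, three at a time.
total = tree_reduce(list(range(10)), sum, 3)
```

With ten inputs and n = 3 this takes three levels (4, then 2, then 1
partial result), and no single step ever sees more than n inputs.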

The command line being too long can be addressed by writing the file
names to a list file and passing that file to the reduction app,
instead of putting every name on the command line. For example:

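// Collect the name of every input file.
string[] fnames;
foreach v, k in files {
  fnames[k] = filename(v);
}
// Write the names to a file that the app can read.
file fnamesFile = writeData(fnames);

// files is still an argument so that the inputs get staged in;
// only the list file is named on the command line.
app (file result) reduce(file[] files, file fnamesFile) {
  reduce filename(fnamesFile) ...;
}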
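
On the app side, the reduce program then has to read its inputs from
that list file. A minimal sketch in Python, assuming the list file has
one file name per line and using plain concatenation as a stand-in for
whatever the real aggregation is (read_file_list and aggregate are
made-up names):

```python
def read_file_list(list_path):
    """Return the non-empty lines of list_path, one file name per line."""
    with open(list_path) as f:
        return [line.strip() for line in f if line.strip()]


def aggregate(list_path, out_path):
    """Concatenate every listed file into out_path.

    Concatenation is only a placeholder for the real aggregation.
    """
    with open(out_path, "w") as out:
        for name in read_file_list(list_path):
            with open(name) as src:
                out.write(src.read())
```

A wrapper script would call aggregate() with the list file and the
output file it receives as its two command-line arguments.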

> 
> The code example below does this when the input data to my tasks is
> just a number range and the code recursively chops the range into
> sub-segments, performing an aggregation on each one. However, if the
> input data is more complex, I don't see a way in Swift to do the same
> thing. I can't use regular looping constructs to create an array of
> computation task inputs or array of computation task output files
> since I don't know how to operate on sub-arrays. I might be able to
> pull some trick where I take a complex loop nest that generates the
> computation tasks and push it deep inside the recursion below, forcing
> the loop nest within each leaf of the recursion to perform just the
> portion of the iteration space that corresponds to the range bounds of
> that leaf. However, even if this were expressible in Swift, it would
> be hard to read. That's all I can think of right now but any other
> suggestions would be welcome.

Well, ... I can probably write a hackish split function that you can use
with 0.94. Give me a few hours.

Mihael



