[Swift-user] Reduction trees

Bronevetsky, Greg bronevetsky1 at llnl.gov
Mon May 19 16:40:42 CDT 2014


I'm actually not sure how to write such code in Swift, so let me describe the task I am interested in. I have an array of files that contain the results of individual computation tasks, and I wish to aggregate these files into one summary file. Swift makes it easy to pass the entire array to a single aggregation task. However, with many files I hit the command-line length limit and, further, the single aggregator task takes too long. As such, I need to create a reduction tree, where the first n files are aggregated by one task, the next n by another, and so on. The results of these aggregation tasks are then themselves aggregated, level by level, until I have a single file that contains the aggregation of all the tasks' results.
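
To make the shape of that reduction tree concrete, here is a minimal sketch in plain Python rather than Swift. It only illustrates the grouping logic and is not a claim about what Swift can express. The names `aggregate` and `reduce_tree` and the `node.level_*` file naming are assumptions made for the illustration, and concatenation stands in for the real aggregator task.

```python
import os


def aggregate(in_paths, out_path):
    """Hypothetical stand-in for the aggregator task: concatenate the inputs."""
    with open(out_path, "w") as out:
        for path in in_paths:
            with open(path) as src:
                out.write(src.read())


def reduce_tree(paths, n, work_dir):
    """Aggregate `paths` in groups of at most n files, one tree level at a time.

    Returns the path of the single file left at the root of the tree.
    """
    if not paths:
        raise ValueError("need at least one input file")
    if n < 2:
        raise ValueError("n must be at least 2, or the tree never shrinks")
    level = 0
    while len(paths) > 1:
        next_level = []
        # Each group of n consecutive files becomes one aggregation task.
        for start in range(0, len(paths), n):
            out_path = os.path.join(work_dir, f"node.level_{level}.start_{start}")
            aggregate(paths[start:start + n], out_path)
            next_level.append(out_path)
        # The outputs of this level are the inputs of the next one.
        paths = next_level
        level += 1
    return paths[0]
```

No single call to `aggregate` receives more than n file names, which is what keeps each command line short. The tasks within one level are independent of each other, so a workflow system could run them in parallel.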

The code example below does this when the input data to my tasks is just a number range: the code recursively chops the range into sub-segments and performs an aggregation on each one. However, if the input data is more complex, I don't see a way to do the same thing in Swift. I can't use regular looping constructs to create an array of computation task inputs or an array of computation task output files, since I don't know how to operate on sub-arrays. I might be able to pull a trick where I take the complex loop nest that generates the computation tasks and push it deep inside the recursion below, forcing the loop nest within each leaf of the recursion to perform just the portion of the iteration space that corresponds to that leaf's range bounds. However, even if this were expressible in Swift, it would be hard to read. That's all I can think of right now, but any other suggestions would be welcome.
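
For reference, the range-chopping recursion can be written so that the range bounds act as indexes into an existing array. The sketch below is plain Python, not Swift, and uses the same split arithmetic as the `catTree` code quoted further down (`start = minR + (size*b)/radix` with integer division). The name `cat_tree`, the `combine` callback, and the use of Python list slicing at the leaves are assumptions for illustration. The slicing step is exactly the operation that is missing on the Swift side.

```python
def cat_tree(items, lo, hi, radix, combine):
    """Combine items[lo:hi] with a tree whose nodes take at most `radix` inputs."""
    if radix < 2:
        raise ValueError("radix must be at least 2, or the recursion never ends")
    size = hi - lo
    if size <= radix:
        # Leaf: the range is small enough to hand to one combine call.
        return combine(items[lo:hi])
    parts = []
    for b in range(radix):
        # Split [lo, hi) into `radix` contiguous sub-ranges of near-equal size.
        start = lo + (size * b) // radix
        end = lo + (size * (b + 1)) // radix
        parts.append(cat_tree(items, start, end, radix, combine))
    # Interior node: combine the results of the sub-trees.
    return combine(parts)
```

Calling `cat_tree(items, 0, len(items), 2, "".join)` on a list of strings returns their concatenation in the original order, because the sub-ranges are contiguous and visited left to right.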

Greg Bronevetsky
Lawrence Livermore National Lab
(925) 424-5756
bronevetsky at llnl.gov
http://greg.bronevetsky.com


-----Original Message-----
From: Mihael Hategan [mailto:hategan at mcs.anl.gov] 
Sent: Monday, May 19, 2014 2:31 PM
To: Bronevetsky, Greg
Cc: swift-user at ci.uchicago.edu
Subject: Re: [Swift-user] Reduction trees

You are correct. There does not seem to be an easy way to slice a sparse array (or an array with non-int keys).

I've filed an enhancement report to add the relevant feature
(https://bugzilla.mcs.anl.gov/swift/show_bug.cgi?id=1275)

In the meantime, I'm trying to see if there might be a hack that can allow you to do what you need. Can you post the code that generates the sparse arrays you mention? I'm trying to picture a solution, but it's hard without a concrete example.

Mihael

On Mon, 2014-05-19 at 14:09 +0000, Bronevetsky, Greg wrote:
> What I mean is that I have an array of files computed by my individual 
> tasks. To do a reduction tree on them I need to create a sub-array of 
> the first 100, the second 100, etc. so that each sub-array can be 
> merged independently. In my code example below I skip this step and 
> simply generate a region of numbers (minR, maxR) for each node in my 
> reduction tree and then do stuff with the numbers. Now I need for 
> these numbers to correspond to indexes in an array of files.
> 
> Greg Bronevetsky
> Lawrence Livermore National Lab
> (925) 424-5756
> bronevetsky at llnl.gov
> http://greg.bronevetsky.com
> 
> -----Original Message-----
> From: Mihael Hategan [mailto:hategan at mcs.anl.gov]
> Sent: Sunday, May 18, 2014 5:05 PM
> To: Bronevetsky, Greg
> Cc: swift-user at ci.uchicago.edu
> Subject: Re: [Swift-user] Reduction trees
> 
> Hi,
> 
> Can you be more specific about what you mean by "subsets of keys" below?
> Specifically, how are these sub sets defined?
> 
> Mihael
> 
> On Sun, 2014-05-18 at 22:16 +0000, Bronevetsky, Greg wrote:
> > I need to implement a reduction tree in Swift to aggregate the 
> > results of many individual runs. I've written the simple algorithm 
> > below, which works when all my runs correspond to numbers in a fixed 
> > range. However, in reality I have a regular or associative array of 
> > files produced by individual runs.  How do I adapt the code below to 
> > this scenario? I don't see any way to iterate over subsets of array 
> > indexes/keys. Thanks!
> > 
> > Greg Bronevetsky
> > Lawrence Livermore National Lab
> > (925) 424-5756
> > bronevetsky at llnl.gov
> > http://greg.bronevetsky.com
> > 
> > 
> > type file;
> > 
> > app (file outFile) gen(string arg)
> > {
> >   echo arg stdout=@filename(outFile);
> > }
> > 
> > app (file summaryFile) catBase(file inFiles[]) {
> >   cat @filenames(inFiles) stdout=@filename(summaryFile);
> > }
> > 
> > // Concatenate the text of numbers in range [minR - maxR)
> > // (includes minR, not maxR)
> > (file summaryFile) catTree(int minR, int maxR, int radix, int level) {
> >   file subFiles[];
> >   if((maxR-minR) <= radix) {
> >     //tracef("catTree: leaf: minR=%i, maxR=%i\n", minR, maxR);
> >     foreach b in [0: (maxR-minR)-1] {
> >       //tracef("catTree: leaf: b=%i\n", b);
> >       file curLeafFile <single_file_mapper; file=@strcat("file.",@toString(minR+b))>;
> >       (curLeafFile) = gen(@toString(minR+b));
> >       subFiles[b] = curLeafFile;
> >     }
> >   } else {
> >     int size=maxR-minR;
> >     //tracef("catTree: node: minR=%i, maxR=%i\n", minR, maxR);
> >     foreach b in [0: radix-1] {
> >       file curNodeFile <single_file_mapper; file=@strcat("node.level_",@toString(level),".startVal_",@toString(minR+b))>;
> >       int start = minR + (size*b)%/radix;
> >       int end   = minR + (size*(b+1))%/radix;
> >       //tracef("catTree: node: b=%i [%i - %i]\n", b, start, end);
> >       (curNodeFile) = catTree(start, end, radix, level+1);
> >       subFiles[b] = curNodeFile;
> >     }
> >   }
> >   //tracef("catBase: inFiles=%q\n", @filenames(subFiles));
> >   (summaryFile) = catBase(subFiles);
> > }
> > 
> > 
> > file allFile <single_file_mapper; file="all">;
> > (allFile) = catTree(0, 31, 2, 0);
> > 
> > _______________________________________________
> > Swift-user mailing list
> > Swift-user at ci.uchicago.edu
> > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user
> 
> 



