[Swift-devel] scheduler stuff for Google Summer of Code 2009

Michael Wilde wilde at mcs.anl.gov
Wed Feb 11 14:53:01 CST 2009


So in a sense, rather than saying "streaming mappers", can you call this 
"streaming foreach() statements", so that as each "iteration" (or 
"instantiation") of the foreach completes, the objects it used are freed 
and removed/removable from memory? (i.e., does this address the "scope" 
problem?)
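
To make the "streaming foreach()" idea concrete, here is a rough Java sketch. 
It is only an illustration: DataNode, lazyMapper and streamingForeach are 
made-up names, not anything in the Swift code base, and the live/peakLive 
counters exist only to show that one element is held at a time.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.ToLongFunction;

/** Hypothetical sketch; none of these names come from the Swift code base. */
class StreamingForeachSketch {

    /** Stand-in for the per-element Java object the data model keeps in memory. */
    static final class DataNode {
        static int live = 0;      // nodes currently held by an iteration
        static int peakLive = 0;  // highest value 'live' has reached
        final int index;

        DataNode(int index) {
            this.index = index;
            live++;
            peakLive = Math.max(peakLive, live);
        }

        /** Called when the iteration that used this node has completed. */
        void release() {
            live--;
        }
    }

    /** A "streaming mapper": element i is created only when it is asked for. */
    static Iterator<DataNode> lazyMapper(final int count) {
        return new Iterator<DataNode>() {
            private int next = 0;

            @Override
            public boolean hasNext() {
                return next < count;
            }

            @Override
            public DataNode next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return new DataNode(next++);
            }
        };
    }

    /** Streaming foreach: run the body on one element, then drop that element. */
    static long streamingForeach(Iterator<DataNode> elements,
                                 ToLongFunction<DataNode> body) {
        long total = 0;
        while (elements.hasNext()) {
            DataNode node = elements.next();
            try {
                total += body.applyAsLong(node);
            } finally {
                node.release(); // nothing keeps a reference past this point
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long sum = streamingForeach(lazyMapper(1_000_000), n -> n.index);
        System.out.println("sum=" + sum + " peakLive=" + DataNode.peakLive);
    }
}
```

The sketch runs iterations one after another, so it sidesteps the harder 
question of how many iterations may be in flight at once.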

Too big for an SoC student?

Interesting enough for one?

(It's a nice scalability challenge... and could be demonstrated first on 
localhost to make good progress w/o getting tangled in distributed 
computing initially)

If too big, can we make it manageable?

If too small, can we bundle with related tasks?

On 2/11/09 2:41 PM, Ben Clifford wrote:
> On Wed, 11 Feb 2009, Michael Wilde wrote:
> 
>> - scaling swift to 1M+ task workflows, efficiently (streaming the 
>> mappers)
> 
> There's more to this than simple streaming mappers.
> 
> At the moment, everything is built around having a Java object in memory 
> for every piece of data that can be referenced, and that object tends to 
> stick around for a long time (at least as long as that data can be 
> referenced). For example, if you have an array which has a large number of 
> elements, then each of those elements has at least one object in memory 
> representing it, because as long as you have the array in scope, you can 
> say a[1] or a[anything] and thus get to every element.
> 
> The in-memory implementation of the data model and anything that touches 
> it would need some fairly serious work to cope with having stuff kept out 
> of core; and I think keeping stuff out of core is something that would 
> need to happen.
> 
> (that is, 'streaming mappers' as a phrase seems to deal with "not getting 
> knowledge about data too fast" but does not deal with "forgetting 
> knowledge about data fast enough")
> 
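
One way to picture the "forgetting knowledge about data fast enough" point 
above is an array handle that keeps only a bounded number of element objects 
resident and re-derives the rest on demand. The Java sketch below is purely 
illustrative: BoundedArraySketch and its methods are hypothetical names, and 
it assumes an element can always be rebuilt from its index alone (as with a 
deterministic file mapper), which may not hold for data that carries state.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.IntFunction;

/** Hypothetical sketch; not the Swift data model. */
class BoundedArraySketch<T> {
    private final int length;
    private final IntFunction<T> mapper;  // assumed able to rebuild element i
    private final Map<Integer, T> resident;
    private int materializations = 0;

    BoundedArraySketch(int length, final int maxResident, IntFunction<T> mapper) {
        this.length = length;
        this.mapper = mapper;
        // An access-ordered LinkedHashMap evicts the least recently used entry.
        this.resident = new LinkedHashMap<Integer, T>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, T> eldest) {
                return size() > maxResident;
            }
        };
    }

    /** a[i]: every index stays reachable, but not every element stays in memory. */
    T get(int i) {
        if (i < 0 || i >= length) {
            throw new IndexOutOfBoundsException("index " + i);
        }
        T value = resident.get(i);
        if (value == null) {
            value = mapper.apply(i);  // rebuild a forgotten element
            materializations++;
            resident.put(i, value);
        }
        return value;
    }

    int residentCount() {
        return resident.size();
    }

    int materializations() {
        return materializations;
    }

    public static void main(String[] args) {
        BoundedArraySketch<String> a =
            new BoundedArraySketch<String>(1_000_000, 1000, i -> "file" + i);
        for (int i = 0; i < 1_000_000; i++) {
            a.get(i);
        }
        System.out.println("resident=" + a.residentCount());
    }
}
```

The trade-off is that touching a forgotten element costs a rebuild, so this 
only helps when access is mostly sequential or local.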


