[Swift-devel] scheduler stuff for Google Summer of Code 2009

Ben Clifford benc at hawaga.org.uk
Wed Feb 11 14:41:41 CST 2009


On Wed, 11 Feb 2009, Michael Wilde wrote:

> - scaling swift to 1M+ task workflows, efficiently (streaming the 
> mappers)

There's more to this than simple streaming mappers.

At the moment, everything is built around having a Java object in memory 
for every piece of data that can be referenced, and that object tends to 
stick around for a long time (at least as long as that data can be 
referenced). For example, if you have an array which has a large number of 
elements, then each of those elements has at least one object in memory 
representing it, because as long as you have the array in scope, you can 
say a[1] or a[anything] and thus get to every element.
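
To make the cost concrete, here is a rough Java sketch of the shape described above. It is not Swift's actual data model code: the names EagerArrayModel, DataNode and mappedFile are made up for illustration. The only point is that the number of live objects grows with the array size for as long as the array is reachable.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout; this is an illustration, not Swift's real classes.
class EagerArrayModel {
    // Stand-in for the per-datum object that is kept in memory.
    static final class DataNode {
        final int index;
        final String mappedFile; // assumed: the file this element is mapped to

        DataNode(int index, String mappedFile) {
            this.index = index;
            this.mappedFile = mappedFile;
        }
    }

    private final List<DataNode> elements = new ArrayList<>();

    // One node per element is created up front and held for the array's lifetime.
    EagerArrayModel(int size) {
        for (int i = 0; i < size; i++) {
            elements.add(new DataNode(i, "file-" + i + ".dat"));
        }
    }

    // a[i] can be asked for at any time, so every node has to stay reachable.
    DataNode get(int i) {
        return elements.get(i);
    }

    int liveNodeCount() {
        return elements.size();
    }
}
```

With this shape, new EagerArrayModel(1000000) pins a million DataNode objects in memory until the array itself goes out of scope, whether or not most of them are ever looked at again.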

The in-memory implementation of the data model, and anything that touches 
it, would need some fairly serious work to cope with having data kept out 
of core (that is, not resident in memory); and I think keeping data out of 
core is something that would need to happen.

(That is, 'streaming mappers' as a phrase seems to deal with "not getting 
knowledge about data too fast" but does not deal with "forgetting 
knowledge about data fast enough".)
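
One possible shape for the "forgetting" half, as a sketch only: keep a bounded number of element records in memory and rebuild any evicted one on demand. Everything here is an assumption for illustration (the class name ForgetfulArrayModel, the loader function, and the idea that an element's record can be recomputed from its index); it uses the standard java.util.LinkedHashMap access-order mode with removeEldestEntry as a simple LRU, which is just one of several ways this could be done.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.IntFunction;

// Hypothetical sketch: at most 'limit' element records are held in core.
class ForgetfulArrayModel {
    // Assumed: an element's record can be rebuilt from its index when needed.
    private final IntFunction<String> loader;
    private final LinkedHashMap<Integer, String> inCore;
    private int loads = 0;

    ForgetfulArrayModel(final int limit, IntFunction<String> loader) {
        this.loader = loader;
        // accessOrder = true keeps entries ordered least-recently-used first.
        this.inCore = new LinkedHashMap<Integer, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, String> eldest) {
                // Forget the least recently used record once over the limit.
                return size() > limit;
            }
        };
    }

    // a[i] still works for any i, but a forgotten record has to be reloaded.
    String get(int i) {
        String record = inCore.get(i);
        if (record == null) {
            record = loader.apply(i);
            loads++;
            inCore.put(i, record);
        }
        return record;
    }

    int inCoreCount() {
        return inCore.size();
    }

    int loadCount() {
        return loads;
    }
}
```

Walking over 10000 elements with a limit of 100 leaves only 100 records in memory, at the price of reloading anything that was forgotten and is then asked for again. Creating records lazily without the eviction step would only delay the growth, which is the distinction drawn above.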
