[Swift-devel] scheduler stuff for Google Summer of Code 2009
Mihael Hategan
hategan at mcs.anl.gov
Wed Feb 11 15:33:48 CST 2009
----- Ben Clifford <benc at hawaga.org.uk> wrote:
>
> On Wed, 11 Feb 2009, Michael Wilde wrote:
>
> > - scaling swift to 1M+ task workflows, efficiently (streaming the
> > mappers)
>
> There's more to this than simple streaming mappers.
>
> At the moment, everything is built around having a Java object in memory
> for every piece of data that can be referenced, and that object tends to
> stick around for a long time (at least as long as that data can be
> referenced). For example, if you have an array which has a large number of
> elements, then each of those elements has at least one object in memory
> representing it, because as long as you have the array in scope, you can
> say a[1] or a[anything] and thus get to every element.
I do not think this is the bottleneck here. Every application invocation
gets its own karajan thread, and the problem seems to be that each such
thread eats around 10-20k of memory. By contrast, a piece of Swift data
probably takes less than 1k.
So I think an order-of-magnitude improvement could be achieved by
addressing that 10-20k problem (or by somehow having fewer karajan threads).
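
To put rough numbers on that for a 1M-task run, here is a back-of-the-envelope
sketch. It assumes that "k" means KiB, that there is one live karajan thread and
one piece of data per invocation, and it uses 1k as the upper bound for data.
These are estimates, not measurements, and the function name is made up for
illustration.

```python
# Rough memory estimate for a large workflow. All figures are the
# approximate per-item costs quoted in this message, not measurements.
KIB = 1024
GIB = 1024 ** 3


def estimate_bytes(n_tasks, thread_kib, data_kib):
    """Return (thread_bytes, data_bytes) for n_tasks invocations.

    Assumes one thread and one data item alive per invocation.
    """
    thread_bytes = n_tasks * thread_kib * KIB
    data_bytes = n_tasks * data_kib * KIB
    return thread_bytes, data_bytes


n_tasks = 1_000_000
low_threads, data = estimate_bytes(n_tasks, thread_kib=10, data_kib=1)
high_threads, _ = estimate_bytes(n_tasks, thread_kib=20, data_kib=1)

print(f"threads: {low_threads / GIB:.1f} to {high_threads / GIB:.1f} GiB")
print(f"data:    at most {data / GIB:.1f} GiB")
```

Under those assumptions the threads cost on the order of 10-20 GB while the data
stays under about 1 GB, which is where the order-of-magnitude figure comes from.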
>
> The in-memory implementation of the data model and anything that touches
> it would need some fairly serious work to cope with having stuff kept out
> of core; and I think keeping stuff out of core is something that would
> need to happen.
>
> (that is, 'streaming mappers' as a phrase seems to deal with "not getting
> knowledge about data too fast" but does not deal with "forgetting
> knowledge about data fast enough")
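
As an illustration of the "forgetting fast enough" half, here is a minimal
sketch of an array whose per-element objects are created on demand and dropped
once nothing refers to them. It is not Swift's data model (which is Java); it is
written in Python for brevity, and ElementHandle and LazyArray are hypothetical
names. It also ignores the harder part, which is persisting element state out of
core before it is forgotten.

```python
import weakref


class ElementHandle:
    """Hypothetical stand-in for the in-memory object for one array element."""

    def __init__(self, index):
        self.index = index


class LazyArray:
    """Array that materializes element objects only while they are in use."""

    def __init__(self, size):
        self.size = size
        # Weak values: an entry disappears once no caller holds the handle.
        self._live = weakref.WeakValueDictionary()

    def __getitem__(self, index):
        if not 0 <= index < self.size:
            raise IndexError(index)
        handle = self._live.get(index)
        if handle is None:
            # Create the element object on first access.
            handle = ElementHandle(index)
            self._live[index] = handle
        return handle

    def live_count(self):
        """Number of element objects currently held in memory."""
        return len(self._live)
```

With this shape, having the array in scope still lets you say a[1] or
a[anything], but an element only costs memory while something actually holds it.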
>
> --
> _______________________________________________
> Swift-devel mailing list
> Swift-devel at ci.uchicago.edu
> http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel