[Swift-devel] Re: scheduling

Mihael Hategan hategan at mcs.anl.gov
Wed Mar 31 11:16:04 CDT 2010


A few comments:

I think we need to simplify the model here, because it's already hard to
reason about. I think we also need to keep it cohesive. So whether or
not information provided by the mappers is used for scheduling, no
scheduling decisions (logic) should be in mappers.

On Wed, 2010-03-31 at 07:28 -0500, wilde at mcs.anl.gov wrote:
> To think through this question it helps to define the actors that are
> involved:
> 
> - input mappers, which have the opportunity to do their mapping from a
> replica catalog

Again, I do not believe that mappers should play a role in scheduling,
because without other knowledge about sites they would do a poor job of
making decisions about where to run things. And if they made scheduling
decisions, then they would logically be part of the scheduler, which
sounds fishy.
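
To make the separation concrete, here is a rough sketch (Python, not
Swift code; every name in it is made up for illustration, and the
scoring rule is just an assumption). The mapper only reports where
replicas live. The scheduler is the only place that combines that with
what it knows about sites, here a simple load number.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MappedFile:
    """Hypothetical mapper output: a logical name plus known replica sites."""
    logical_name: str
    replica_sites: frozenset


def map_inputs(catalog, names):
    """Mapper: look up replicas in a catalog. Makes no placement decision."""
    return [MappedFile(n, frozenset(catalog.get(n, ()))) for n in names]


def select_site(site_load, inputs, locality_weight=1.0):
    """Scheduler: weigh data locality (from mappers) against site load.

    The linear score is an assumed policy, chosen only to show that the
    decision logic sits here and not in the mapper.
    """
    def score(site):
        local = sum(1 for f in inputs if site in f.replica_sites)
        return locality_weight * local - site_load[site]

    # sorted() makes tie-breaking deterministic (alphabetical by site name)
    return max(sorted(site_load), key=score)


catalog = {"a.dat": ["siteA"], "b.dat": ["siteA", "siteB"]}
inputs = map_inputs(catalog, ["a.dat", "b.dat"])
chosen = select_site({"siteA": 0.5, "siteB": 0.2, "siteC": 0.0}, inputs)
```

With these numbers siteA is chosen because it holds both inputs. Raise
its load enough and the scheduler moves the job elsewhere, without the
mapper changing at all.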

> 
> - the site selector in Swift's scheduler, which could factor into its
> criteria where the data needed by an application lives (and perhaps
> where the output must go, or where the user prefers that it go)
> 
> - the Swift execution wrapper _swiftwrap and/or the post-execution
> logic and cache management logic in Swift which can influence whether
> and how long a file stays in the site shared/ cache

That assumes there is a shared cache. The trend we're seeing with EAGER
is that there is no shared directory any more; it was used solely as a
caching mechanism because the client did the staging.




