[Swift-devel] several alternatives to design the data management system for Swift on SuperComputers

Mon Dec 1 17:00:05 CST 2008

On Mon, 2008-12-01 at 22:45 +0000, Ben Clifford wrote:
> > Design Alternatives:
> > 1. Data aware task scheduling:
> >    Both Swift and Falkon need to be data aware. Swift should know where the
> > output of the 1st stage is, that is,
> >    which pset, or equivalently which Falkon service.
> >    And the Falkon service should know which CN has the data for the 2nd stage
> > computation.
> 
> Swift *is* data aware. However it models things at the site level, not at 
> a worker node level.

There's nothing stopping us from giving a certain element of the URL
path a "node" semantic meaning.
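
A rough sketch of what I mean, in Python for brevity. The convention here
(first path element = node name) and the example URL are made up for
illustration; this is not how Swift parses file locations today.

```python
from urllib.parse import urlsplit


def split_location(url):
    """Split a file URL into (site, node, path on that node).

    Assumed convention (hypothetical): the host part names the site
    (pset / Falkon service) and the FIRST path element names the node.
    """
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    if not segments:
        raise ValueError("URL has no path element to use as a node name")
    node = segments[0]
    rest = "/".join(segments[1:])
    return parts.netloc, node, rest


site, node, path = split_location(
    "gsiftp://pset12.example.org/cn0345/work/stage1.out"
)
```

With that convention the site-level model stays as it is, and anything that
cares about worker nodes can look at the extra element.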

>  This is true at the moment:
> 
> > Swift should know where the output of 1st stage is, which means, which 
> > pset, or say which falkon service.
> 
> There was talk before of having some data-affinity in the swift scheduler, 
> which would mean that jobs would prefer (but perhaps not be guaranteed) to 
> run on a site which already had their input data. I don't know if anyone 
> did any coding towards this - I haven't seen an implementation.

I have some code from Ragib which I have yet to commit to SVN.
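
For the sake of discussion, the "prefer but don't guarantee" policy could
look roughly like the sketch below. This is NOT Ragib's code and not the
Swift scheduler; pick_site, file_locations, site_load and max_load are
names I made up for illustration.

```python
from collections import Counter


def pick_site(job_inputs, file_locations, site_load, max_load):
    """Pick a site for a job, preferring sites that already hold its inputs.

    job_inputs:     list of input file names for the job
    file_locations: dict file name -> set of sites known to hold a copy
    site_load:      dict site -> number of jobs currently running there
    max_load:       a site at or above this load is not considered

    Returns the chosen site, or None if every site is full.
    """
    # Count how many of the job's inputs each site already has.
    hits = Counter()
    for name in job_inputs:
        for site in file_locations.get(name, ()):
            hits[site] += 1

    # Affinity is a preference only: full sites are skipped even if they
    # hold all the data, so the job may still land somewhere else.
    candidates = [s for s in site_load if site_load[s] < max_load]
    if not candidates:
        return None

    # Most local inputs wins; ties go to the less loaded site.
    return max(candidates, key=lambda s: (hits[s], -site_load[s]))
```

In the pset = site case the sites would be the psets, so stage 2 would tend
to follow the stage 1 output unless that pset is already saturated.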

> 
> In the pset = site case, which is how BG/P is being used at the moment, 
> this would at least tend to keep execution on the same site as
> 
> At the moment, Falkon doesn't know about input and output files for Swift 
> jobs, so can't act on that information to influence its scheduling.
> 
> 
> > 2. Swift batches jobs vertically
> >    Before sending out any jobs, Swift knows that the 2 stage jobs have a data
> > dependency, and thus sends them out as
> >    1 batched job to each worker.
> 
> VDS had some clustering capability like this. It seems quite interesting 
> to think about.

VDS did full graph scheduling, unlike Swift.
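
To make the vertical batching idea concrete: if the whole dependency graph
were known up front, collapsing linear chains could be sketched as below.
This is only an illustration under that assumption, not VDS's clustering
code; cluster_chains and the deps structure are hypothetical names.

```python
def cluster_chains(deps):
    """Group jobs into vertical batches (linear chains).

    deps: dict job -> list of jobs it depends on. Every job, including
          those with no dependencies, must appear as a key. Assumed acyclic.

    Returns a list of chains; each chain is a list of jobs in execution
    order that could be sent to one worker as a single batched job.
    """
    # Invert the dependency map to find each job's children.
    children = {job: [] for job in deps}
    for job, parents in deps.items():
        for parent in parents:
            children[parent].append(job)

    clusters = []
    for job, parents in deps.items():
        # A job that is the only child of its only parent belongs to the
        # chain started further up, so it does not start a new one.
        if len(parents) == 1 and len(children[parents[0]]) == 1:
            continue
        chain = [job]
        current = job
        # Extend while there is exactly one child with exactly one parent.
        while (len(children[current]) == 1
               and len(deps[children[current][0]]) == 1):
            current = children[current][0]
            chain.append(current)
        clusters.append(chain)
    return clusters
```

For the 2-stage case, each stage 1 job and its dependent stage 2 job end up
in one chain, so the intermediate file never has to leave the worker.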