[Swift-devel] several alternatives to design the data management system for Swift on SuperComputers
Mihael Hategan
hategan at mcs.anl.gov
Mon Dec 1 17:00:05 CST 2008
On Mon, 2008-12-01 at 22:45 +0000, Ben Clifford wrote:
> > Design Alternatives:
> > 1. Data aware task scheduling:
> > Both Swift and Falkon need to be data aware. Swift should know where the
> > output of the 1st stage is, that is, which pset or which Falkon service.
> > And the Falkon service should know which CN has the data for the 2nd stage
> > computation.
>
> Swift *is* data aware. However it models things at the site level, not at
> a worker node level.
There's nothing stopping us from giving a certain element of the url
path a "node" semantic meaning.
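
As a rough illustration of that idea, here is a minimal sketch that
treats the first path element of a file URL as the node name. The URL
layout, the gsiftp example and the function name split_location are
assumptions made for illustration only; nothing like this exists in
Swift as far as this thread says.

```python
from urllib.parse import urlsplit


def split_location(url):
    """Split a file URL into (site, node, path).

    Hypothetical convention: the URL host names the site (pset) and the
    first path element names the worker node holding the file.
    """
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    if not segments:
        raise ValueError("URL has no path element to interpret as a node")
    node, rest = segments[0], segments[1:]
    return parts.netloc, node, "/" + "/".join(rest)


# Example with a made-up URL: site "pset03", node "cn0017".
site, node, path = split_location("gsiftp://pset03/cn0017/work/stage1.out")
```

With such a convention the site-level model stays as it is, and only
code that cares about nodes needs to look at the extra element.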
> This is true at the moment:
>
> > Swift should know where the output of 1st stage is, which means, which
> > pset, or say which falkon service.
>
> There was talk before of having some data-affinity in the swift scheduler,
> which would mean that jobs would prefer (but perhaps not be guaranteed) to
> run on a site which already had their input data. I don't know if anyone
> did any coding towards this - I haven't seen an implementation.
I have some code from Ragib which I have yet to commit to SVN.
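
For readers who want a picture of what data affinity could look like,
here is a minimal sketch. It is not Ragib's code and not the Swift
scheduler; the function pick_site and its three arguments are
hypothetical. It prefers the site that already holds the most input
files and falls back to the least loaded site on a tie, so affinity is
a preference rather than a guarantee.

```python
def pick_site(job_inputs, site_files, site_load):
    """Choose a site for a job, preferring sites that hold its inputs.

    job_inputs: set of input file names the job needs (hypothetical)
    site_files: dict mapping site name -> set of files already there
    site_load:  dict mapping site name -> number of queued jobs
    """
    def score(site):
        local = len(job_inputs & site_files.get(site, set()))
        # More local inputs is better; lower load breaks ties.
        return (local, -site_load[site])

    # Sorting the site names makes the choice deterministic on full ties.
    return max(sorted(site_load), key=score)


chosen = pick_site(
    {"stage1.out", "params.txt"},
    {"pset1": {"params.txt"}, "pset2": {"stage1.out", "params.txt"}},
    {"pset1": 0, "pset2": 5},
)
```

In the pset = site setup, the same scoring would tend to send a 2nd
stage job to the pset that produced its input.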
>
> In the pset = site case, which is how BG/P is being used at the moment,
> this would at least tend to keep execution on the same site as
>
> At the moment, Falkon doesn't know about input and output files for Swift
> jobs, so can't act on that information to influence its scheduling.
>
>
> > 2. Swift batches jobs vertically
> > Before sending out any jobs, Swift knows that the 2 stage jobs have a data
> > dependency, and thus sends them out to each worker as 1 batched
> > job.
>
> VDS had some clustering capability like this. It seems quite interesting
> to think about.
VDS did full graph scheduling, unlike Swift.
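
To make the vertical batching alternative concrete, here is a minimal
sketch that merges linear chains of dependent jobs into single batches.
It assumes the dependency graph is known up front, is acyclic, and is
given as a dict; the function cluster_chains and the job names are
hypothetical, and this is not how VDS or Swift implement anything.

```python
def cluster_chains(deps):
    """Group jobs into vertical batches.

    deps maps each job to the set of jobs it depends on; every job,
    including those with no dependencies, must appear as a key.
    A job is merged into its parent's batch only when it is the sole
    child of its sole parent, so each batch is a linear chain.
    """
    children = {job: set() for job in deps}
    for job, parents in deps.items():
        for parent in parents:
            children[parent].add(job)

    chains = []
    for job in sorted(deps):
        parents = deps[job]
        if len(parents) == 1:
            (parent,) = parents
            if len(children[parent]) == 1:
                continue  # already covered by the chain through its parent
        chain = [job]
        while len(children[chain[-1]]) == 1:
            (nxt,) = children[chain[-1]]
            if len(deps[nxt]) != 1:
                break  # next job has other inputs, so it starts its own batch
            chain.append(nxt)
        chains.append(chain)
    return chains


batches = cluster_chains({
    "stage1_a": set(), "stage2_a": {"stage1_a"},
    "stage1_b": set(), "stage2_b": {"stage1_b"},
})
```

Each resulting batch could then go to one worker, so the 1st stage
output never has to leave the node that produced it.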