[Swift-devel] Re: Needs for site selection and job scheduling enhancements

Thu Feb 3 11:13:31 CST 2011

On Wed, 2011-02-02 at 18:31 -0600, Allan Espinosa wrote:
> 2011/2/1 Mihael Hategan <hategan at mcs.anl.gov>:

> I wonder if site independence only works when your workflow is
> compute-intensive.  What about a mechanism where you can checkpoint
> the site scores from other runs of a workflow?   But that would be
> available for all the jobs in a site.
> 
> I guess we could make one site catalog per app entry in the
> transformation catalog and do the 'hinting' at that level.

Or augment the site catalog to contain app-specific biases.
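
To make that concrete, here is a rough sketch in Python of what app-specific biases could look like. This is not Swift's actual site catalog format or scheduler code: the site names, app names, the APP_BIAS table and both functions are made up for illustration. The assumption is that each site has a base score and that an optional per-app multiplier scales it before the usual score-weighted random pick.

```python
import random

# Hypothetical data: base site scores plus optional per-app multipliers.
# None of these names come from Swift's real site catalog.
SITE_SCORES = {"siteA": 1.0, "siteB": 1.0, "siteC": 1.0}
APP_BIAS = {
    "siteA": {"blast": 4.0},
    "siteB": {"blast": 0.5, "convert": 2.0},
}

def site_weights(app):
    """Return site -> weight for one app: base score times app bias (default 1.0)."""
    return {
        site: score * APP_BIAS.get(site, {}).get(app, 1.0)
        for site, score in SITE_SCORES.items()
    }

def pick_site(app, rng=random):
    """Pick a site with probability proportional to its biased weight."""
    weights = site_weights(app)
    sites = list(weights)
    return rng.choices(sites, weights=[weights[s] for s in sites], k=1)[0]
```

An app with no entry in the bias table falls back to the plain site scores, so existing catalogs would behave as before.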

> 
> >>
> > [...]
> >> And we might need a feature (perhaps a swift.properties setting) to
> >> tell Swift to defer initial scheduling decisions for N seconds or
> >> until J jobs have been queued by the script, so that a sufficiently
> >> large number of jobs are in the queue before scheduling decisions are
> >> made (probably delay for say a minute on a multi-hour script run).
> >
> > How would that help? Given that the scheduling is probabilistic, that
> > makes the distribution essentially the same whether you have N or N/2
> > jobs.
> 
> Here is what I think the motivation for this feature is:  Consider a
> workflow whose jobs fall into m groups.  The groups have {n_1, n_2, n_3,
> ..., n_m} jobs respectively, and each group shares a common data set
> {d_1, d_2, ..., d_m}.
> 
> Then let us say that n_1 > n_2 > n_3 > ... > n_m .  From here, we say
> that scheduling group m on multiple sites does not make sense

That's a strong statement. If that one site is busy enough that
additional jobs would take longer there without data staging than they
would on a different site with staging, then spreading the group would
make sense.
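
Spelled out as a comparison (a sketch in Python; the function and parameter names are hypothetical, and it assumes the job's run time is the same on both sites so only queue wait and staging differ):

```python
def better_to_move(local_queue_wait, remote_queue_wait, staging_time):
    """Return True if running on the other site, staging included,
    is expected to finish sooner than waiting on the site that
    already holds the data. All times are in the same unit."""
    return remote_queue_wait + staging_time < local_queue_wait
```

For example, better_to_move(600, 60, 120) is True (a 10 minute local backlog loses to 1 minute of remote wait plus 2 minutes of staging), while better_to_move(30, 10, 120) is False.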

>  since
> there are only a few jobs that share the data.  It would be better to
> bundle the jobs in group m into a single site.  I wonder how you can
> factor that into the probabilistic scores.

Bias based on data locality. We had a student who did some preliminary
work there, but it never really made it in.

However I now see what Mike meant, and that is a windowing algorithm for
deciding that bias. But I don't think that's ultimately necessary. I
think a probabilistic approach would work ok without the need for a
delay.
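
A rough sketch of that idea in Python, under stated assumptions: the function, its parameters and the default bonus of 8.0 are all hypothetical, and this is not how the Swift scheduler is written. Jobs are placed one at a time as they arrive, with no window. A site that already holds a job's data gets its score multiplied by a locality bonus before the weighted random pick.

```python
import random
from collections import defaultdict

def schedule(jobs, site_scores, rng, locality_bonus=8.0):
    """Place jobs one at a time, with no delay or lookahead.

    jobs: iterable of (job_id, data_id) pairs, in arrival order.
    site_scores: dict of site -> base score.
    Returns a dict of job_id -> chosen site.
    """
    holders = defaultdict(set)  # data_id -> sites where that data is already staged
    placement = {}
    sites = list(site_scores)
    for job_id, data_id in jobs:
        # Boost sites that already hold this job's data.
        weights = [
            site_scores[s] * (locality_bonus if s in holders[data_id] else 1.0)
            for s in sites
        ]
        site = rng.choices(sites, weights=weights, k=1)[0]
        holders[data_id].add(site)
        placement[job_id] = site
    return placement
```

The first job of a group lands wherever the plain scores send it; later jobs of the same group are then pulled towards that site. A small group tends to stay on one site, while a large group can still spill over to other sites, without the scheduler having to know the group sizes up front.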

> 
> >>
> >> In addition, we're wondering how easy (and desirable) it would be
> >> to implement any/all of the following language extensions:
> >>
> >> - select statement to work on string values and/or ranges
> >
> > What would be the semantics of this statement? Can you give examples?
> >>
> >> - elseif clause to achieve the above in a multi-branch if statement
> >
> > Quite silly we don't support that already.
> 
> At least officially, the documentation says we don't support it.

I really have no idea whether this works or not, but if it doesn't it's
silly.