[Swift-devel] excessive rate throttling for apparently temporally-restricted failures
Mihael Hategan
hategan at mcs.anl.gov
Sun Oct 28 15:58:06 CDT 2007
On Sun, 2007-10-28 at 15:05 -0500, Ioan Raicu wrote:
> But my argument was, and still is, if there is only one site to submit
> to, changing situations are almost irrelevant, as there are no options
> anyhow. Give me one example, where you have only 1 site, set X and Y
> properly, yet you need site scores as an additional throttling
> mechanism!
Of course it doesn't strictly apply in the one site case.
However, the idea of a single simple algorithm that can both deal with
multiple sites and can adjust things in the one site case sounds
appealing. And since self-adjusting processes tend to have a feedback
loop in most cases, I'm led to believe that the possibility exists.
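
To make that concrete, here is a minimal sketch of what such a single mechanism might look like. It is not the algorithm currently in Swift. The class name, the constants (0.5 back-off, 0.05 recovery, 0.05 floor) and the additive-increase/multiplicative-decrease rule are all assumptions chosen for illustration. X and Y stay as hard caps, and a per-site score only scales the effective limit on outstanding jobs.

```python
from dataclasses import dataclass


@dataclass
class SiteThrottle:
    """Hypothetical per-site throttle; NOT Swift's actual scheduler code."""
    max_outstanding: int      # "X": hard cap on jobs in flight
    max_rate: float           # "Y": hard cap on submissions per second
    score: float = 1.0        # feedback term, kept in [min_score, 1.0]
    min_score: float = 0.05   # assumed floor so a lone site never stalls
    outstanding: int = 0
    last_submit: float = float("-inf")

    def allowed_outstanding(self) -> int:
        # The effective limit shrinks with the score but never drops below 1.
        return max(1, int(self.max_outstanding * self.score))

    def can_submit(self, now: float) -> bool:
        under_cap = self.outstanding < self.allowed_outstanding()
        spaced = (now - self.last_submit) >= 1.0 / self.max_rate
        return under_cap and spaced

    def submit(self, now: float) -> None:
        self.outstanding += 1
        self.last_submit = now

    def job_done(self, ok: bool) -> None:
        self.outstanding -= 1
        if ok:
            # Additive recovery after a success (assumed step of 0.05).
            self.score = min(1.0, self.score + 0.05)
        else:
            # Multiplicative back-off after a failure (assumed factor of 0.5).
            self.score = max(self.min_score, self.score * 0.5)
```

With one site, the score just acts as a softer, automatic version of lowering X. With several sites, the same scores could also be compared to decide where the next job goes. How fast the score falls and recovers is exactly the part that would need tuning for the situation Ben describes.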
>
> Mihael Hategan wrote:
> > On Sun, 2007-10-28 at 11:23 -0500, Ioan Raicu wrote:
> >
> > > I mentioned 2 throttling mechanisms, one is to have X outstanding jobs
> > > at any given time (limits jobs in the queue), and Y jobs/sec
> > > submit rate (limits the rate of submission). I believe both of these
> > > throttling mechanisms could exist without computing site scores,
> > > assuming the user knows what to set X and Y to.
> > >
> >
> > They do exist, but they don't deal with asymmetries between sites. Nor
> > do they deal with changing situations.
> >
> >
> > > Ioan
> > >
> > > Mihael Hategan wrote:
> > >
> > > > On Sun, 2007-10-28 at 10:25 -0500, Ioan Raicu wrote:
> > > >
> > > >
> > > > > Assuming you have a single site to submit to, then I don't see why you
> > > > > don't want to disable the site scoring altogether?
> > > > >
> > > > >
> > > > Because having too many jobs on that one site may still cause problems.
> > > >
> > > > That said, the algorithm currently there needs some work.
> > > >
> > > >
> > > >
> > > > > Of course you still want throttling, but that is more on the level
> > > > > of X outstanding jobs at any given time (and possibly Y jobs/sec
> > > > > submit rate), so you don't overrun the LRM, but you would not want to
> > > > > lower X to some low value just because some jobs are failing. Again,
> > > > > once you go to multi-site runs, you need the site scoring to decide
> > > > > among the different sites, but with a single site, I see no drawbacks
> > > > > to disabling the site scoring mechanism.
> > > > >
> > > > > Ioan
> > > > >
> > > > > Ben Clifford wrote:
> > > > >
> > > > >
> > > > > > On Sun, 28 Oct 2007, Ioan Raicu wrote:
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > > they were due to the stale NFS handle error. I think Mihael outlined in an
> > > > > > > email a while back how to disable the task submission throttling due to a bad
> > > > > > > score, assuming that you have a single site to submit to anyways.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > I know how to disable it. I don't particularly want it running rate free.
> > > > > >
> > > > > > > What's happening here is that the feedback loop is feeding back too
> > > > > > > much / too fast for the situation I experience.
> > > > > >
> > > > > > There's plenty of fun to be had experimenting there; and I suspect there
> > > > > > will be no One True Rate Controller.
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > --
> > > > > ============================================
> > > > > Ioan Raicu
> > > > > Ph.D. Student
> > > > > ============================================
> > > > > Distributed Systems Laboratory
> > > > > Computer Science Department
> > > > > University of Chicago
> > > > > 1100 E. 58th Street, Ryerson Hall
> > > > > Chicago, IL 60637
> > > > > ============================================
> > > > > Email: iraicu at cs.uchicago.edu
> > > > > Web: http://www.cs.uchicago.edu/~iraicu
> > > > > http://dsl.cs.uchicago.edu/
> > > > > ============================================
> > > > > ============================================
> > > > > _______________________________________________
> > > > > Swift-devel mailing list
> > > > > Swift-devel at ci.uchicago.edu
> > > > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel
> > > > >
> > > > >
> > > >
> > > >
> > >
> >
> >
> >
>