[Swift-devel] excessive rate throttling for apparently temporally-restricted failures

Mihael Hategan hategan at mcs.anl.gov
Sun Oct 28 15:58:06 CDT 2007


On Sun, 2007-10-28 at 15:05 -0500, Ioan Raicu wrote:
> But my argument was, and still is, if there is only one site to submit
> to, changing situations are almost irrelevant, as there are no options
> anyhow.  Give me one example, where you have only 1 site, set X and Y
> properly, yet you need site scores as an additional throttling
> mechanism!

Of course it doesn't strictly apply in the one-site case.

However, the idea of a single simple algorithm that can both deal with
multiple sites and adjust things in the one-site case sounds
appealing. And since self-adjusting processes tend to have a feedback
loop in most cases, I'm led to believe that the possibility exists.
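
To make the idea concrete, here is a rough sketch. It is illustrative only
and is not the algorithm currently in Swift. It keeps the two static limits
from this thread (X outstanding jobs, Y jobs/sec) as hard caps, and adds a
feedback-adjusted limit that moves between 1 and X: up by one on each
success, halved on each failure. The class name, method names, and the
halve/add-one policy are all assumptions made up for this example.

```python
import time


class FeedbackThrottle:
    """Hypothetical throttle: hard caps X and Y plus a feedback-adjusted limit."""

    def __init__(self, max_outstanding, max_rate, clock=time.monotonic):
        self.max_outstanding = max_outstanding  # X: hard cap on jobs in flight
        self.min_interval = 1.0 / max_rate      # derived from Y jobs/sec
        self.limit = float(max_outstanding)     # current feedback-adjusted cap
        self.outstanding = 0                    # jobs submitted but not finished
        self.clock = clock                      # injectable so it can be tested
        self.last_submit = None

    def can_submit(self):
        """True if both the rate limit and the adjusted job limit allow a submit."""
        now = self.clock()
        rate_ok = (self.last_submit is None
                   or now - self.last_submit >= self.min_interval)
        return rate_ok and self.outstanding < int(self.limit)

    def submitted(self):
        """Record that one job was handed to the site."""
        self.outstanding += 1
        self.last_submit = self.clock()

    def completed(self, ok):
        """Feed a job result back into the limit."""
        self.outstanding -= 1
        if ok:
            # Additive increase, never above the user-set X.
            self.limit = min(float(self.max_outstanding), self.limit + 1.0)
        else:
            # Multiplicative decrease, never below 1 so the site keeps
            # getting probed and the limit can recover.
            self.limit = max(1.0, self.limit / 2.0)
```

With one site this degenerates to "X and Y, plus backing off when jobs
fail"; with several sites, one such object per site would give the
asymmetric limits. How fast the limit should fall and recover is exactly
the part that needs experimenting.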

> 
> Mihael Hategan wrote: 
> > On Sun, 2007-10-28 at 11:23 -0500, Ioan Raicu wrote:
> >   
> > > I mentioned 2 throttling mechanisms, one is to have X outstanding jobs
> > > at any given time (limits jobs in the queue), and Y jobs/sec
> > > submit rate (limits the rate of submission).  I believe both of these
> > > throttling mechanisms could exist without computing site scores,
> > > assuming the user knows what to set X and Y to.
> > >     
> > 
> > They do exist, but they don't deal with asymmetries between sites. Nor
> > do they deal with changing situations.
> > 
> >   
> > > Ioan
> > > 
> > > Mihael Hategan wrote: 
> > >     
> > > > On Sun, 2007-10-28 at 10:25 -0500, Ioan Raicu wrote:
> > > >   
> > > >       
> > > > > Assuming you have a single site to submit to, then I don't see why you
> > > > > don't want to disable the site scoring altogether?
> > > > >     
> > > > >         
> > > > Because having too many jobs on that one site may still cause problems.
> > > > 
> > > > That said, the algorithm currently there needs some work.
> > > > 
> > > >   
> > > >       
> > > > > Of course you still want throttling, but that is more on the level
> > > > > of X outstanding jobs at any given time (and possibly Y jobs/sec
> > > > > submit rate), so you don't overrun the LRM, but you would not want to
> > > > > lower X to some low value just because some jobs are failing.  Again,
> > > > > once you go to multi-site runs, you need the site scoring to decide
> > > > > among the different sites, but with a single site, I see no drawbacks
> > > > > to disabling the site scoring mechanism.  
> > > > > 
> > > > > Ioan
> > > > > 
> > > > > Ben Clifford wrote: 
> > > > >     
> > > > >         
> > > > > > On Sun, 28 Oct 2007, Ioan Raicu wrote:
> > > > > > 
> > > > > >   
> > > > > >       
> > > > > >           
> > > > > > > they were due to the stale NFS handle error.  I think Mihael outlined in an
> > > > > > > email a while back how to disable the task submission throttling due to a bad
> > > > > > > > score, assuming that you have a single site to submit to anyway.
> > > > > > >     
> > > > > > >         
> > > > > > >             
> > > > > > I know how to disable it. I don't particularly want it running rate free.
> > > > > > 
> > > > > > > What's happening here is that the feedback loop is feeding back too
> > > > > > > much / too fast for the situation I'm experiencing.
> > > > > > 
> > > > > > There's plenty of fun to be had experimenting there; and I suspect there 
> > > > > > will be no One True Rate Controller.
> > > > > > 
> > > > > >   
> > > > > >       
> > > > > >           
> > > > > -- 
> > > > > ============================================
> > > > > Ioan Raicu
> > > > > Ph.D. Student
> > > > > ============================================
> > > > > Distributed Systems Laboratory
> > > > > Computer Science Department
> > > > > University of Chicago
> > > > > 1100 E. 58th Street, Ryerson Hall
> > > > > Chicago, IL 60637
> > > > > ============================================
> > > > > Email: iraicu at cs.uchicago.edu
> > > > > Web:   http://www.cs.uchicago.edu/~iraicu
> > > > >        http://dsl.cs.uchicago.edu/
> > > > > ============================================
> > > > > ============================================
> > > > 
> > > >       
> > 
> > 
> >   
> 
