[Swift-devel] excessive rate throttling for apparently temporally-restricted failures

Mihael Hategan hategan at mcs.anl.gov
Sun Oct 28 14:51:31 CDT 2007


On Sun, 2007-10-28 at 11:23 -0500, Ioan Raicu wrote:
> I mentioned 2 throttling mechanisms: one allows at most X outstanding
> jobs at any given time (limits jobs in the queue), and the other caps
> the submit rate at Y jobs/sec (limits the rate of submission).  I
> believe both of these throttling mechanisms could exist without
> computing site scores, assuming the user knows what to set X and Y to.

They do exist, but they don't deal with asymmetries between sites. Nor
do they deal with changing situations.
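
To make the two static throttles concrete, here is a minimal sketch in
Python. It is not Swift's or Karajan's code; the class name, method names
and the injectable clock are all hypothetical, chosen only for
illustration. X is max_outstanding and Y is max_rate. Both are fixed for
the lifetime of the object, so nothing here distinguishes one site from
another or reacts when a site's behaviour changes.

```python
import time


class SubmitThrottle:
    """Hypothetical static throttle: at most X outstanding jobs, at most Y jobs/sec."""

    def __init__(self, max_outstanding, max_rate, clock=time.monotonic):
        self.max_outstanding = max_outstanding   # X: cap on jobs in the queue
        self.min_interval = 1.0 / max_rate       # seconds between submits, from Y
        self.clock = clock                       # injectable so it can be tested
        self.outstanding = 0
        self.last_submit = None

    def can_submit(self):
        # Refuse if the queue cap (X) has been reached.
        if self.outstanding >= self.max_outstanding:
            return False
        # The first submission is never rate limited.
        if self.last_submit is None:
            return True
        # Refuse if the previous submit was too recent (rate cap Y).
        return self.clock() - self.last_submit >= self.min_interval

    def submitted(self):
        # Call once a job has actually been handed to the LRM.
        self.outstanding += 1
        self.last_submit = self.clock()

    def finished(self):
        # Call when a job completes, whether it succeeded or failed.
        self.outstanding -= 1
```

A submit loop would poll can_submit() before each submission and call
submitted() and finished() around the job's lifetime. Note that a failed
job frees its slot exactly like a successful one, so this kind of
throttle never slows down in response to failures.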

> 
> Ioan
> 
> Mihael Hategan wrote: 
> > On Sun, 2007-10-28 at 10:25 -0500, Ioan Raicu wrote:
> >   
> > > Assuming you have a single site to submit to, then I don't see why you
> > > don't want to disable the site scoring altogether?
> > >     
> > 
> > Because having too many jobs on that one site may still cause problems.
> > 
> > That said, the algorithm currently there needs some work.
> > 
> >   
> > > Of course you still want throttling, but that is more on the level
> > > of X outstanding jobs at any given time (and possibly Y jobs/sec
> > > submit rate), so you don't overrun the LRM, but you would not want to
> > > lower X to some low value just because some jobs are failing.  Again,
> > > once you go to multi-site runs, you need the site scoring to decide
> > > among the different sites, but with a single site, I see no drawbacks
> > > to disabling the site scoring mechanism.  
> > > 
> > > Ioan
> > > 
> > > Ben Clifford wrote: 
> > >     
> > > > On Sun, 28 Oct 2007, Ioan Raicu wrote:
> > > > 
> > > >   
> > > >       
> > > > > they were due to the stale NFS handle error.  I think Mihael outlined in an
> > > > > email a while back how to disable the task submission throttling due to a bad
> > > > > score, assuming that you have a single site to submit to anyways. 
> > > > >     
> > > > >         
> > > > I know how to disable it. I don't particularly want it running rate free.
> > > > 
> > > > > What's happening here is that the feedback loop is feeding back too much / too 
> > > > > fast for the situation I experience.
> > > > 
> > > > There's plenty of fun to be had experimenting there; and I suspect there 
> > > > will be no One True Rate Controller.
> > > > 
> > > >   
> > > >       
> > > -- 
> > > ============================================
> > > Ioan Raicu
> > > Ph.D. Student
> > > ============================================
> > > Distributed Systems Laboratory
> > > Computer Science Department
> > > University of Chicago
> > > 1100 E. 58th Street, Ryerson Hall
> > > Chicago, IL 60637
> > > ============================================
> > > Email: iraicu at cs.uchicago.edu
> > > Web:   http://www.cs.uchicago.edu/~iraicu
> > >        http://dsl.cs.uchicago.edu/
> > > ============================================
> > > ============================================
> > > _______________________________________________
> > > Swift-devel mailing list
> > > Swift-devel at ci.uchicago.edu
> > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel
> > >     
> > 
> > 
> >   
> 



