[Swift-devel] Support request: Swift jobs flooding uc-teragrid?

Ian Foster foster at mcs.anl.gov
Wed Jan 30 08:31:18 CST 2008


Just to check--before we do all this, have we tried running with GRAM4?

Ben Clifford wrote:
> On Wed, 30 Jan 2008, Ti Leggett wrote:
>
>   
>> As a site admin I would rather you ramp up and not throttle down. Starting
>> high and working to a lower number means you could kill the machine many times
>> before you find the lower bound of what a site can handle. Starting slowly and
>> ramping up means you find that lower bound once. From my point of view, one
>> user consistently killing the resource can be turned off to prevent denial of
>> service to all other users *until* they can prove they won't kill the
>> resource. So I prefer the conservative approach.
>>     
>
> The code does ramp up at the moment, starting with 6 simultaneous jobs by 
> default.
>
> What doesn't happen very well at the moment is automated detection of 'too 
> much' in order to stop ramping up - the only really good feedback at the 
> moment (not just in this particular case but in other cases before) seems 
> to be a human being sitting in the feedback loop tweaking stuff.
>
> Two things we should work on are:
>  i) making it easier for the human who is sitting in that loop
> and
>  ii) figuring out a better way to get automated feedback.
>
> From a TG-UC perspective, for example, what is a good way to know 'too 
> much'? Is it OK to keep submitting jobs until they start failing? Or 
> should there be some lower point at which we stop?
>
>   
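
For illustration, the ramp-up behaviour discussed above could be sketched
roughly as follows. This is a minimal sketch, not Swift's actual throttling
code: the class name, method names, and the floor, ceiling, step, and backoff
parameters are all hypothetical, and only the starting value of 6 simultaneous
jobs comes from the thread. It assumes job failure is the 'too much' signal,
which is only one of the options raised in the question above.

```python
class RampUpThrottle:
    """Hypothetical limit on simultaneous jobs (not Swift's real code).

    Grows the limit by a fixed step after each successful job and cuts it
    multiplicatively after a failure (additive increase, multiplicative
    decrease).
    """

    def __init__(self, initial=6, floor=1, ceiling=256, step=1, backoff=0.5):
        # initial=6 mirrors the default mentioned in the thread; the other
        # defaults are assumptions chosen for illustration only.
        self.limit = initial
        self.floor = floor
        self.ceiling = ceiling
        self.step = step
        self.backoff = backoff

    def on_success(self):
        # A job finished cleanly: allow slightly more concurrency.
        self.limit = min(self.ceiling, self.limit + self.step)
        return self.limit

    def on_failure(self):
        # A job failed: treat it as a 'too much' signal and back off,
        # never dropping below the floor.
        self.limit = max(self.floor, int(self.limit * self.backoff))
        return self.limit


throttle = RampUpThrottle()
for _ in range(4):
    throttle.on_success()
after_ramp = throttle.limit

throttle.on_failure()
after_backoff = throttle.limit
```

A site that wants the ramp to stop before jobs start failing could call
on_failure() on a softer signal instead, such as rising submission latency.
Which signal is appropriate for TG-UC is exactly the open question here.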