[Swift-devel] Re: Request for control over throttle algorithm
Michael Wilde
wilde at mcs.anl.gov
Mon Aug 27 16:15:20 CDT 2007
Mihael Hategan wrote:
> On Mon, 2007-08-27 at 15:07 -0500, Michael Wilde wrote:
>> [changing subject line to start a new thread]
>>
>> Mihael, all,
>>
>> I'm observing again that Karajan job throttling algorithms need more
>> discussion, design and testing, and that in the meantime - and perhaps
>> always - we need simple ways to override the algorithms and manually
>> control the throttle.
>
> Here's what happens:
> 1. somebody says "I don't like throttling because it decreases the
> performance" (that's what throttles do, in order to make things not
> fail)
No. What was said was: we are trying to get a workflow running for a
real science user - on whose success we depend. In the process of
doing that, the current obstacle to good performance is a
failure-retry behavior that is not working well.
> 2. we collectively conclude that we should disable throttling
Several of us believe that in *this* case it will enable the workflow to
*finally* succeed and will also yield better performance. Note that the
default settings do not even let the workflow complete successfully.
> 3. there are options to change those in swift.properties (and one in
> scheduler.xml which I will also add to swift.properties), and they are
> increased to "virtually off" numbers (I need to add an explicit "off" to
> make things easier)
This is great - just what we need. But I think Ioan can't find the prior
email in which you describe them, and I couldn't either. Could you
re-state what to set, please?
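For reference, here is my best guess at the relevant swift.properties
entries, based on what is in the user guide - please correct the names and
values if I have them wrong, and note that the explicit "off" value is the
one you said still needs to be added:

   # score-based job throttle - the one that penalizes failing sites
   throttle.score.job.factor=off
   # caps on simultaneously submitted jobs, global and per host
   throttle.submit=off
   throttle.host.submit=off
   # transfer and file-operation throttles (I believe the defaults are
   # 4 and 8 - we probably don't need to touch these)
   throttle.transfers=4
   throttle.file.operations=8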
> 4. the workflows still don't work very well because there are lots of
> failures now, and quality drops
That would be a different scenario. In this case, Ioan will try to take
the offending node(s) out of service as seen by Falkon.
> 5. throttles are set back to reasonable values
Yes, that's the goal. I believe that automated failure handling is
difficult and takes a while - lots of design, measurement, testing, and
improvement - before it works well. Certainly the Internet and TCP/IP
teach us that. Critical, necessary, but a long road.
> 6. maybe some things are changed (i.e. gram -> falkon), but
> fundamentally the problems are the same (different scales though)
> 7. GOTO 1
Yes, as often as needed. It's iteration, but not endless, if done
thoughtfully.
>
>> This is true for throttling both successful and failing jobs.
I agree.
>>
>> Right now MolDyn progress is being impeded by a situation where a single
>> bad cluster node (with stale FS file handles) has an unduly negative
>> impact on overall workflow performance.
>
> Yes. And this is how things work. There are problems. It's a statement
> of fact.
>
>> I feel that before we discuss and work on the nuances of throttling
>> algorithms (which will take some time to perfect) we should provide a
>> simple and reliable way for the user to override the default heuristics
>> and achieve good performance in situations that are currently occurring.
>
> Groovy. Would the above (all throttling parameters in swift.properties
> and the "off" option for each) work?
Yes, I think so - again, please (re)reiterate what they are. :)
>
>> How much work would it take to provide a config parameter that causes
>> failed jobs to get retried immediately with no delay or scheduling
>> penalty? I.e., let the user set the "failure penalty" ratio to reduce or
>> eliminate the penalty for failures.
>
> I'd suggest simply not throttling on such things.
Agreed. Cool.
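To make sure we mean the same thing by "failure penalty ratio", here is a
toy sketch of the kind of score bookkeeping I am picturing - it is not the
real Karajan scheduler code, and all the names are made up:

   // Toy sketch only - not the actual Karajan scheduler code.
   public class SiteScore {
       private double score = 1.0;          // higher score => more concurrent jobs allowed
       private final double successFactor;  // reward added per successful job
       private final double failureFactor;  // penalty per failed job; 0 = failures cost nothing

       public SiteScore(double successFactor, double failureFactor) {
           this.successFactor = successFactor;
           this.failureFactor = failureFactor;
       }

       public void jobCompleted(boolean succeeded) {
           score += succeeded ? successFactor : -failureFactor;
           score = Math.max(score, 0.01);   // never throttle a site to a dead stop
       }

       public int allowedConcurrentJobs(int jobThrottleFactor) {
           // rough mapping from score to a per-site submission limit
           return (int) Math.ceil(score * jobThrottleFactor);
       }
   }

With failureFactor set to zero, failed jobs would get resubmitted with no
scheduling penalty, which is all I am asking for as a manual override.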
>
> There can also be an option for tweaking the factors, but I have at
> least one small aversion towards having too many things in
> swift.properties.
Sounds reasonable. Let's start with the basics.
Now, having said all this - perhaps Ioan can catch and retry the failure
entirely in Falkon. Is wrapper.sh capable of being re-run on a different
node of the same cluster? (If not, I think we can enhance it to be.)
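To sketch what I am imagining on the Falkon side - every type and method
name below is invented just to show the control flow, none of it is real
Falkon or Swift API:

   import java.util.HashSet;
   import java.util.Set;

   public class RetryOnOtherNode {

       interface NodePool {
           String pickNodeExcluding(Set<String> excluded); // pick a free node not in the excluded set
           Result execute(String node, String command);    // run wrapper.sh (or anything) on that node
           void holdNode(String node);                     // take the node out of service, leave it held
       }

       static class Result {
           boolean succeeded;
           boolean looksLikeNodeFailure; // e.g. stale NFS handle rather than an application error
       }

       static Result runWithRetry(NodePool pool, String command, int maxAttempts) {
           Set<String> badNodes = new HashSet<String>();
           Result r = null;
           for (int attempt = 0; attempt < maxAttempts; attempt++) {
               String node = pool.pickNodeExcluding(badNodes);
               r = pool.execute(node, command);
               if (r.succeeded) {
                   return r;                 // done - Swift never sees the bad node
               }
               if (r.looksLikeNodeFailure) {
                   badNodes.add(node);       // don't try this node again
                   pool.holdNode(node);      // flag it for sysadmin attention
               } else {
                   return r;                 // genuine application failure - pass it back to Swift
               }
           }
           return r;                         // ran out of healthy nodes; report the last failure upstream
       }
   }

That would keep the bad-node handling entirely inside Falkon, so Swift only
hears about genuine application failures.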
Thanks,
Mike
>
> Mihael
>
>> Its possible that once we have this control, we'd need a few other
>> parameters to make reasonable things happen in the case of running on
>> one or more Falkon sites.
>>
>> In tandem with this, Falkon will provide parameters to control what
>> happens to a node after a failure:
>> - a failure analyzer will attempt to recognize node failures as opposed
>> to app failures (some of this may need to go into the Swift launcher,
>> wrapper.sh)
>> - on known node failures Falkon will log the failure to bring to
>> sysadmin attention, and will also leave the node held
>> - In the future Falkon will add new nodes to compensate for nodes that
>> it has disabled.
>>
>> I'd like to ask that we focus discussion on what is needed to design and
>> implement these basic changes, and whether they would solve the current
>> problems and be useful in general.
>>
>> - Mike
>>
>>
>>
>>
>>
>> Mihael Hategan wrote:
>>> On Mon, 2007-08-27 at 13:25 -0500, Ioan Raicu wrote:
>>>> The question I am interested in is: can you modify the heuristic to take
>>>> into account the execution time of tasks when updating the site score?
>>> I thought I mentioned I can.
>>>
>>>> I think it is important you use only the execution time (and not
>>>> Falkon queue time + execution time + result delivery time); in this
>>>> case, how does Falkon pass this information back to Swift?
>>> I thought I mentioned why that's not a good idea. Here's a short
>>> version:
>>> If Falkon is slow for some reason, that needs to be taken into account.
>>> Excluding it from measurements under the assumption that it will always
>>> be fast is not a particularly good idea. And if it is always fast then
>>> it doesn't matter much since it won't add much overhead.
>>>
>>>> Ioan
>>>>
>>>> Mihael Hategan wrote:
>>>>> On Mon, 2007-08-27 at 17:37 +0000, Ben Clifford wrote:
>>>>>
>>>>>> On Mon, 27 Aug 2007, Ioan Raicu wrote:
>>>>>>
>>>>>>
>>>>>>> On a similar note, IMO, the heuristic in Karajan should be modified to take
>>>>>>> into account the task execution time of the failed or successful task, and not
>>>>>>> just the number of tasks. This would ensure that Swift is not throttling task
>>>>>>> submission to Falkon when there are 1000s of successful tasks that take on the
>>>>>>> order of 100s of seconds to complete, yet there are also 1000s of failed tasks
>>>>>>> that are only 10 ms long. This is exactly the case with MolDyn, when we get a
>>>>>>> bad node in a bunch of 100s of nodes, which ends up throttling the number of
>>>>>>> active and running tasks to about 100, regardless of the number of processors
>>>>>>> Falkon has.
>>>>>>>
>>>>>> Is that different from when submitting to PBS or GRAM where there are
>>>>>> 1000s of successful tasks taking 100s of seconds to complete but with
>>>>>> 1000s of failed tasks that are only 10ms long?
>>>>>>
>>>>> In your scenario, assuming that GRAM and PBS do work (since some jobs
>>>>> succeed), then you can't really submit that fast. So the same thing
>>>>> would happen, but slower. Unfortunately, in the PBS case, there's not
>>>>> much that can be done but to throttle until no more jobs than good nodes
>>>>> are being run at one time.
>>>>>
>>>>> Now, there is the probing part, which makes the system start with a
>>>>> lower throttle which increases until problems appear. If this is
>>>>> disabled (as it was in the MolDyn run), large numbers of parallel jobs
>>>>> will be submitted causing a large number of failures.
>>>>>
>>>>> So this whole thing is close to a linear system with negative feedback.
>>>>> If the initial state is very far away from stability, there will be
>>>>> large transients. You're more than welcome to study how to make it
>>>>> converge faster, or how to guess the initial state better (knowing the
>>>>> number of nodes a cluster has would be a step).
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>