[Swift-devel] Error messages and execution retries

Ben Clifford benc at hawaga.org.uk
Wed Aug 18 15:54:49 CDT 2010


> Retries are meant to deal with transient errors, where transient is
> pretty much defined as "eventually stops happening if you retry enough
> times". The determination of whether they are transient or not (to a
> certain degree of confidence) requires that the operations are retried.

Right.

Sometimes there are transient errors. Sometimes there are not.

The theory of distributed computing likes to talk about transient errors 
and how they can be dealt with this way. But it's not clear to me in 
practice how much that happens - my gut feeling from when I ran stuff was 
that most errors were non-transient and retries happened rarely. But I 
have no numerical evidence. That numerical evidence (either way) is 
probably the decider for retries.

> A skilled person could perhaps, by looking at the error, be able to make
> a quicker determination. But then the same skilled person would probably
> be able to set retries to 0 if he/she wanted to debug.

A skilled person equally well could turn retries on.

This thread is starting to sound pretty much like a complaint people have 
about Condor where, rather than failing a job, it will keep trying over and 
over. A 'skilled person' knows how and where to look to see what's going 
on. A non-skilled person sees their job go into the queue and never 
complete.

> A normal user (who doesn't care about the details) may be disturbed by
> the printing of an error message that could be solved by retries. So I
> don't think that's necessarily the right choice unless it is made very
> clear, in the error message, that the task will be retried.

I agree with what I think you are saying, which is that error messages 
shouldn't be printed if they are not terminal.
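
As a rough illustration of that rule (a generic Python sketch, not Swift's 
actual retry code; run_with_retries, task and max_retries are made-up names), 
intermediate failures could be logged quietly at debug level, with only the 
final failure reported as an error:

```python
import logging

log = logging.getLogger("retry_sketch")  # hypothetical logger name


def run_with_retries(task, max_retries=3):
    """Call task(); retry on failure and report only the terminal error.

    max_retries=0 means a single attempt, i.e. fail immediately.
    """
    attempts = max_retries + 1
    for attempt in range(1, attempts + 1):
        try:
            return task()
        except Exception as exc:
            if attempt < attempts:
                # Non-terminal failure: keep it out of the user's view.
                log.debug("attempt %d/%d failed, will retry: %s",
                          attempt, attempts, exc)
            else:
                # Terminal failure: no retries left, so report it and re-raise.
                log.error("failed after %d attempt(s): %s", attempts, exc)
                raise
```

With max_retries=0 this fails on the first error, which is the debugging 
setup mentioned above. Whether the default should be zero or non-zero is 
the part that needs the numerical evidence.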
