[Swift-devel] execution.retries

Ben Clifford benc at hawaga.org.uk
Tue Jun 10 12:20:08 CDT 2008


On Tue, 10 Jun 2008, lixi at uchicago.edu wrote:

> >> Yes, I've seen that. My question is: do these lines mean 
> >> different execution retries?
> >
> >Yes
> 
> Then why were these different retries submitted to the same 
> site? Coincidence or certainty?

Say you have two sites.

Site A always fails fast.
Site B accepts jobs normally.

You have three jobs to submit, J, K and L, each of which takes a long 
time to run.

t=0
We submit jobs randomly to available sites:
Job J is submitted to site A.
Job K is submitted to site B.
Job L is submitted to site B.

t=1
Site B is busy executing jobs K and L.
Job J fails on site A. We look for somewhere to retry it. Site B has 0 
slots free. Site A has 2 slots free. We send the job to site A again.

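t=2
The same happens: job J fails on site A and is retried on site A, 
because site B still has no free slots.

t=3
The same happens again.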

Now we have retried job J three times, and so the workflow ultimately 
fails.

t=1000
Jobs K and L complete successfully.
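
The timeline above can be replayed with a small Python sketch. This is 
not Swift's actual scheduler code. The names Site, pick_site and 
simulate_job_j are hypothetical, and the two slots per site and the 
"pick randomly among sites with a free slot" policy are assumptions 
read off the example.

```python
import random
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    slots: int
    fails_fast: bool
    running: int = 0

    @property
    def free(self) -> int:
        return self.slots - self.running


def pick_site(sites, rng):
    """Pick randomly among sites that have a free slot (assumed policy)."""
    candidates = [s for s in sites if s.free > 0]
    return rng.choice(candidates) if candidates else None


def simulate_job_j(max_retries=3, seed=0):
    """Replay job J's attempts; return (sites tried, whether J got to run)."""
    rng = random.Random(seed)
    a = Site("A", slots=2, fails_fast=True)
    b = Site("B", slots=2, fails_fast=False)
    b.running = 2  # jobs K and L hold both of B's slots throughout
    site = a       # t=0: J happens to land on site A, as in the example
    tried = []
    for _ in range(max_retries + 1):  # initial attempt plus the retries
        site.running += 1
        tried.append(site.name)
        if not site.fails_fast:
            return tried, True  # J is now running on a healthy site
        site.running -= 1       # the fast failure frees the slot at once
        site = pick_site([a, b], rng)
        if site is None:
            break               # nowhere to retry
    return tried, False


tried, ok = simulate_job_j()
print(tried, ok)
```

With site B full, site A is the only candidate at every retry, so the 
choice is forced rather than a coincidence: J goes to A each time until 
its retries run out.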
