[Swift-devel] replication/recall of jobs from slow queues

Mihael Hategan hategan at mcs.anl.gov
Fri May 2 17:38:34 CDT 2008


As of r1869 there is some code to deal with jobs that sit in slow
queues. It is disabled by default, but can be enabled through
swift.properties.

In theory it works like this: if a job sits in a queue for more than
replication.min.queue.time and for more than 3*average_queue_time (the
average being measured from other jobs), then a second replica of the
same job is created. The process continues until one of the replicas
reaches the active state, after which all the other replicas are
canceled.
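
To make the rule concrete, here is a rough sketch in Python. It is not
the code from r1869. Only replication.min.queue.time and the factor of 3
come from the description above; the names should_replicate and
ReplicaGroup, the 60-second value, the use of seconds as the unit, and
the choice not to replicate before any queue times have been measured
are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from statistics import mean

# Assumed value and unit (seconds) for replication.min.queue.time.
MIN_QUEUE_TIME = 60.0
# The multiplier from "3*average_queue_time" in the description.
QUEUE_TIME_FACTOR = 3


def should_replicate(time_in_queue, observed_queue_times):
    """Return True when a queued job has waited longer than both thresholds."""
    if not observed_queue_times:
        # Assumption: with no measurements from other jobs, do not replicate.
        return False
    average_queue_time = mean(observed_queue_times)
    return (time_in_queue > MIN_QUEUE_TIME
            and time_in_queue > QUEUE_TIME_FACTOR * average_queue_time)


@dataclass
class ReplicaGroup:
    """All replicas of one logical job; states are plain strings here."""
    states: list = field(default_factory=lambda: ["queued"])

    def add_replica(self):
        # A new replica starts out queued, like the original job.
        self.states.append("queued")

    def mark_active(self, index):
        # The first replica to become active wins; later calls are ignored.
        if "active" in self.states:
            return
        self.states = ["active" if i == index else "canceled"
                       for i in range(len(self.states))]
```

With other jobs having queued for 30, 40 and 50 seconds, the average is
40, so a job is only replicated once it has waited more than 120 seconds
(which is also above the assumed 60-second minimum).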

I didn't have time to test this much (it's not very easy to test), so
there will probably be problems.



