[Swift-devel] Re: replication vs site score

Mihael Hategan hategan at mcs.anl.gov
Wed Apr 8 15:44:44 CDT 2009


On Wed, 2009-04-08 at 13:38 -0700, Ioan Raicu wrote:
> Does a batch-queue prediction service help things in any way?
> https://portal.teragrid.org/gridsphere/gridsphere?cid=queue-prediction
> 
> I've always wondered how the Swift scheduler would behave differently
> if it had statistical information about queue times.

It would help. Statistically.

> Qin, have you compared your job replication strategy with one that
> was cognizant of the expected queue wait time, in order to meet
> deadlines? On the surface, assuming that the batch queue prediction is
> accurate, it would seem that scheduling with known queue times might
> solve the same deadline-cognizant scheduling problem, but without
> wasting resources on unnecessary replication.

The replication isn't unnecessary. If it starts, it starts because the
job's queue time is already larger than the expected queue time.
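
As a rough sketch of that trigger (hypothetical names throughout, not
Swift's actual scheduler code), assuming each queued job carries some
estimate of its expected queue time:

```python
from dataclasses import dataclass


@dataclass
class QueuedJob:
    """Hypothetical record for a job waiting in a site's batch queue."""
    site: str
    submitted_at: float    # submission time, in seconds
    expected_wait: float   # assumed estimate of queue time, in seconds
    started: bool = False  # set once the job leaves the queue and runs


def should_replicate(job: QueuedJob, now: float, slack: float = 1.0) -> bool:
    """Return True once a still-queued job has waited longer than expected.

    `slack` is an assumed tuning factor: 1.0 replicates as soon as the
    observed wait exceeds the estimate.
    """
    if job.started:
        # A job that is already running no longer has a queue-time problem.
        return False
    waited = now - job.submitted_at
    return waited > slack * job.expected_wait


job = QueuedJob(site="siteA", submitted_at=0.0, expected_wait=600.0)
print(should_replicate(job, now=300.0))  # still within the expected wait
print(should_replicate(job, now=601.0))  # waited longer than expected
```

A replica is only submitted in the second case, so an accurate estimate
means no extra copies get started; a `slack` above 1.0 would tolerate
some overrun first.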

> The place where the queue prediction doesn't help is when there is a
> bad node that causes an application to be slow or to fail.

No. The prediction doesn't help when it fails to predict accurately.

>  In this case, replication is probably the better recourse to
> guarantee meeting deadlines.




