[Swift-devel] Imbalanced scheduling with coasters and multiple sites
Mihael Hategan
hategan at mcs.anl.gov
Tue Apr 7 00:39:14 CDT 2009
On Tue, 2009-04-07 at 00:33 -0500, Michael Wilde wrote:
>
> On 4/7/09 12:26 AM, Mihael Hategan wrote:
> > On Tue, 2009-04-07 at 00:15 -0500, Michael Wilde wrote:
> >> Note on below: I used 2hr30min as the time to match Glen's time, for the
> >> runs in which he first saw the "imbalance".
> >>
> >> In my first tests, I had used 5 min for coasterWorkerMaxwalltime and
> >> specified no site or tc maxwalltime. I thought that would work, based on
> >> our earlier lengthy exchanges on this topic. But apparently coasters was
> >> calculating some default max walltime for "cat" and it gave me an error
> >> about insufficient time.
> >
> > Right. Previously it would just loop starting workers and then not using
> > them because they didn't have enough time. The default walltime is 10
> > minutes.
>
> That makes sense then. The error I got was:
>
> 2009-04-06 20:52:35,397-0500 DEBUG vdl:execute2 APPLICATION_EXCEPTION
> jobid=cat-e3agg19j - Application exception: Job cannot be run with the
> given max walltime worker constraint
>
> The other few anomalies I saw I will ignore unless they happen again, as
> I was using the bad 3/31 revision. These were things like starting a new
> service with some strange default max time ("01:41:00" or 101 minutes)
Not strange. 101 = 10 * 10 + 1 or DEFAULT_MAXWALLTIME *
OVERALLOCATION_FACTOR + RESERVE.
> after the initial services were started with the correct time, and some
> strange error retry behavior.
>
> Bear with me - these things are very difficult and tedious to report.
No problem. I'm glad you're exercising the code.
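
For reference, the arithmetic behind the 101 minute figure can be sketched as
follows. This is only an illustration in Python, not the coaster code itself:
the constant names and values (10 minutes, factor 10, 1 minute reserve) come
from the formula above, while the function names and the simple "job must fit
in the worker's walltime" check are assumptions made for the example.

```python
# Values taken from the formula quoted above; units are minutes.
DEFAULT_MAXWALLTIME = 10      # default job walltime
OVERALLOCATION_FACTOR = 10    # multiplier applied when requesting a service
RESERVE = 1                   # extra minute added on top


def service_walltime_minutes(job_walltime=DEFAULT_MAXWALLTIME):
    """Hypothetical helper: walltime requested for a new service."""
    return job_walltime * OVERALLOCATION_FACTOR + RESERVE


def format_walltime(minutes):
    """Render a number of minutes as HH:MM:00."""
    hours, mins = divmod(minutes, 60)
    return f"{hours:02d}:{mins:02d}:00"


def fits_worker(job_walltime, worker_maxwalltime):
    """Assumed check: a job can only run if it fits the worker's max walltime."""
    return job_walltime <= worker_maxwalltime


if __name__ == "__main__":
    total = service_walltime_minutes()
    print(total, format_walltime(total))
    # A 10 minute default job against a 5 minute worker limit does not fit,
    # which is the situation that produced the walltime constraint error.
    print(fits_worker(DEFAULT_MAXWALLTIME, 5))
```

With the defaults this gives 101 minutes, i.e. "01:41:00", and shows why a
5 minute coasterWorkerMaxwalltime is too short for a job that falls back to
the 10 minute default.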