[Swift-devel] too much slow down.

Mihael Hategan hategan at mcs.anl.gov
Tue Jul 15 10:58:33 CDT 2008


We need to classify errors into transients and non-transients.
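
A minimal sketch of what such a classification could look like, in Python for brevity. This is not Swift's scheduler code. The exception types, the `classify` function, and the `max_retries` parameter are all hypothetical names made up for illustration. The idea is only that a transient error (say, a timeout) is worth retrying, while a non-transient one (say, a permissions problem when creating a remote directory) should be reported right away.

```python
from enum import Enum


class ErrorKind(Enum):
    TRANSIENT = "transient"  # may succeed on retry (e.g. a timeout)
    PERMANENT = "permanent"  # will not succeed on retry (e.g. bad permissions)


# Hypothetical exception types standing in for real site/job failures.
class SiteTimeout(Exception):
    pass


class RemoteDirectoryError(Exception):
    pass


def classify(error):
    """Return the ErrorKind for an exception; unknown errors count as permanent."""
    if isinstance(error, SiteTimeout):
        return ErrorKind.TRANSIENT
    return ErrorKind.PERMANENT


def run_with_retries(job, max_retries=3):
    """Run job(); retry only transient failures, re-raise permanent ones at once."""
    attempts = 0
    while True:
        attempts += 1
        try:
            return job()
        except Exception as error:
            # Permanent errors, or transient ones that have used up their
            # retries, propagate to the caller instead of being retried.
            if classify(error) is ErrorKind.PERMANENT or attempts > max_retries:
                raise
```

Treating unknown errors as permanent is a design choice in this sketch: it favours the faster failures asked for in the thread over extra retries.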

On Tue, 2008-07-15 at 15:45 +0000, Ben Clifford wrote:
> ok. I'll add a user-adjustable parameter for this, which you will be able
> to set to get pretty much the previous behaviour back.
> 
> On Tue, 15 Jul 2008, Michael Andric wrote:
> 
> > ...often for hours 'til we kill it.
> > 
> > and by that, i mean eternity.  i've let things go overnight to see what
> > would happen and it's still just hanging when i check it in the morning.
> > 
> > On Tue, Jul 15, 2008 at 10:35 AM, <skenny at uchicago.edu> wrote:
> > 
> > > so andric and i have been doing lots of runs the past week
> > > with the latest swift. we've definitely noticed a lack of
> > > errors from swift. that is, when it can't get a job thru it
> > > hangs...often for hours 'til we kill it.
> > >
> > > yesterday my job hung for about 20min so i
> > > killed it and tried running it with the previous version of
> > > swift. right away i got an error saying that the job was
> > > having trouble creating a directory on the remote site (which
> > > was in fact a correct error, there was a problem with the
> > > permissions).
> > >
> > > my personal vote would be for faster failures. i guess it's
> > > also worth mentioning that we rarely (read: never) run
> > > multi-site...mostly bcs we need to separate debugging our
> > > workflows from debugging our sites :)
> > >
> > >
> > > ---- Original message ----
> > > >Date: Mon, 14 Jul 2008 11:52:22 -0500
> > > >From: Mihael Hategan <hategan at mcs.anl.gov>
> > > >Subject: Re: [Swift-devel] too much slow down.
> > > >To: Ben Clifford <benc at hawaga.org.uk>
> > > >Cc: swift-devel at ci.uchicago.edu
> > > >
> > > >On Mon, 2008-07-14 at 16:37 +0000, Ben Clifford wrote:
> > > >> With the recent changes made to the scheduler to deal with bad sites in a
> > > >> multisite run, the behaviour in the presence of a single bad site and no
> > > >> good sites seems to be that a run will sit for a very long time rather
> > > >> than the previous behaviour of failing pretty fast.
> > > >>
> > > >> This is perhaps unpleasant, perhaps not; but it's a significant change to
> > > >> behaviour.
> > > >
> > > >Isn't this what we wanted?
> > > >
> > > >>
> > > >
> > > >_______________________________________________
> > > >Swift-devel mailing list
> > > >Swift-devel at ci.uchicago.edu
> > > >http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel
> > >
> > 



