[Swift-devel] Re: worker.pl IDLETIMEOUT

Mihael Hategan hategan at mcs.anl.gov
Fri Dec 10 22:38:27 CST 2010


On Fri, 2010-12-10 at 17:07 -0600, Allan Espinosa wrote:
> Looking at the worker.pl I use, yes, there are no more IDLE timeout
> cases.  This will leave pilot jobs failing when they exceed the
> maxwalltime.  That is another explanation for the large number of job
> failures in OSG as well.

If the server dies, the worker should eventually die due to lack of
heartbeats. Theoretically. So I'm not sure what the circumstances are
that cause the maxwalltime to be exceeded. Can you give some more
details?
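
To make the two exit paths in this thread concrete (heartbeat loss
versus idle timeout), here is a minimal Python sketch. It is not the
actual worker.pl logic: the function name, the timeout values, and the
exit codes are all hypothetical and chosen only for illustration. It
assumes a lost server should produce a non-zero exit, while an idle
worker should exit 0 so that the site does not record a failed job.

```python
# Hypothetical values for illustration; worker.pl may use different ones.
HEARTBEAT_TIMEOUT = 120.0  # seconds without a heartbeat before giving up
IDLE_TIMEOUT = 300.0       # seconds without a job before exiting

EXIT_OK = 0                # clean exit, not counted as a job failure
EXIT_LOST_SERVER = 1       # non-zero exit, server presumed dead


def check_timeouts(now, last_heartbeat, last_job,
                   heartbeat_timeout=HEARTBEAT_TIMEOUT,
                   idle_timeout=IDLE_TIMEOUT):
    """Return an exit code if the worker should stop, or None to keep running.

    Pass idle_timeout=None to disable the idle check entirely.
    """
    # No heartbeat for too long: assume the server is gone.
    if now - last_heartbeat > heartbeat_timeout:
        return EXIT_LOST_SERVER
    # Server is alive but has sent no work for too long: leave cleanly.
    if idle_timeout is not None and now - last_job > idle_timeout:
        return EXIT_OK
    return None
```

A worker's main loop could call check_timeouts() with the current time
on each iteration and pass any non-None result to sys.exit(). With the
idle check disabled, only heartbeat loss ends the worker, so a pilot
job with a live but idle server would run until the maxwalltime.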

Mihael

> 
> Before the changes, I simply changed the IDLE timeout to exit cleanly
> (exit 0 instead of die)
> 
> -Allan
> 
> 2010/12/10 Michael Wilde <wilde at mcs.anl.gov>:
> > I think I added that idle timeout arg to worker.pl, but I think Mihael removed the idle timeout entirely in recent changes.  Are you using a recent trunk version with those changes?  That seemed to work best for me in my latest tests using passive persistent coaster servers.
> >
> >
> >
> > ----- Original Message -----
> >> The idle timeout having a non-zero exit code generated a lot of "JOB
> >> FAILED" stats in OSG. This skews their usage report in a weird
> >> fashion. I made some modifications before, but my upgrade to the
> >> latest trunk code somehow broke it.
> >>
> >> 2010/10/12 Allan Espinosa <aespinosa at cs.uchicago.edu>:
> >> > Poking at worker.pl, I see that it accepts a third argument for idle
> >> > time. Is this in seconds?
> >> >
> >> > Also, I'm using Swift to drive a number of passive workers. The
> >> > worker jobs fail due to this timeout. I may have to modify things to
> >> > suit this kind of setup.
> >> >
> >> > Thanks,
> >> > -Allan
> >> >
> >>
> >>
> >> --
> >> Allan M. Espinosa <http://amespinosa.wordpress.com>
> >> PhD student, Computer Science
> >> University of Chicago <http://people.cs.uchicago.edu/~aespinosa>
> >
> > --
> > Michael Wilde
> > Computation Institute, University of Chicago
> > Mathematics and Computer Science Division
> > Argonne National Laboratory
> >
> >
> >