[Swift-devel] Error messages and execution retries

Mihael Hategan hategan at mcs.anl.gov
Wed Aug 18 15:50:37 CDT 2010


On Wed, 2010-08-18 at 14:42 -0600, Michael Wilde wrote:
> We should probably test to make sure that this is the case. For
> example, if 10 jobs are launched in parallel, and one fails, then even
> with lazy.errors=false, the 9 running jobs will still finish, right?

Not exactly.

They will only finish if retrying the failing job takes more time than
finishing the remaining 9.

> It's just that no new ones will start. So will the error (stderr?)
> from the failing job be sent to the log or the swift stdout/err right
> away, or will it still wait for the full swift termination,

If lazy errors are disabled, as soon as all the retries for a job fail,
a message will be printed and the execution of the run aborted.
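
To make the timing argument concrete, here is a toy model in Python. It is
not Swift's scheduler, only a sketch under simplifying assumptions: all jobs
start at t=0, the failing job fails on every attempt, each attempt takes the
same time, and the names (simulate, attempt_time, retries) are made up for
illustration.

```python
def simulate(durations, failing, attempt_time, retries, lazy_errors):
    """Return (time the failing job gives up, other jobs that finished).

    Hypothetical model: every job starts at t=0 and the failing job
    fails on each of its (1 + retries) attempts.
    """
    gave_up_at = attempt_time * (1 + retries)
    finished = []
    for job, duration in sorted(durations.items()):
        if job == failing:
            continue
        # With lazy errors the run keeps going, so every other job finishes.
        # Without them the run aborts once the retries are exhausted, and
        # only jobs that were already done by then count as finished.
        if lazy_errors or duration <= gave_up_at:
            finished.append(job)
    return gave_up_at, finished


durations = {f"job{i}": 10 for i in range(10)}

# Retries are exhausted at t=6, before the other 9 jobs (t=10) are done.
print(simulate(durations, "job0", attempt_time=2, retries=2, lazy_errors=False))

# Retries are exhausted at t=15, so the other 9 jobs finish first.
print(simulate(durations, "job0", attempt_time=5, retries=2, lazy_errors=False))
```

In the first call none of the 9 other jobs finish; in the second all of them
do, which is the "only if retrying takes more time" case.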

>  which still may get circumvented by a ^C ???

Btw, we could intercept the ^C and still print the errors.
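
As a rough illustration of the idea (in Python, not Swift's actual code, and
with a made-up run() helper and job list): collect errors as they happen, and
treat ^C as a reason to stop starting jobs rather than a reason to skip the
report.

```python
import sys


def run(jobs, report=None):
    """Run (name, callable) pairs in order; always report collected errors.

    Hypothetical sketch: 'jobs' and 'report' are illustrative names only.
    """
    if report is None:
        report = lambda line: print(line, file=sys.stderr)
    errors = []
    interrupted = False
    try:
        for name, job in jobs:
            try:
                job()
            except Exception as exc:
                # KeyboardInterrupt is not an Exception subclass, so ^C
                # passes through here and reaches the outer handler.
                errors.append(f"{name}: {exc}")
    except KeyboardInterrupt:
        interrupted = True
    # Errors are printed whether the run ended normally or via ^C.
    for line in errors:
        report(line)
    # 130 is the conventional shell status for termination by SIGINT.
    return 130 if interrupted else (1 if errors else 0)
```

The point is only that the error report sits after the interrupt handler, so
a ^C no longer circumvents it.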
