[Swift-devel] execution.retries
lixi at uchicago.edu
Tue Jun 10 11:20:39 CDT 2008
Hi,
I just ran a workflow in which the site "OSG_LIGO_MIT" had a
GridFTP error for a short while, right after it was selected
for a Swift job. As a result, the message "Could not
initialize shared directory on OSG_LIGO_MIT" was issued and
the whole workflow exited. The log file is on
CI: /home/lixi/newswift/latest/score/3500/workflowtest-20080610-1045-58kc7p6f.log
After investigating the log file, I found that the failed
job produced an execute event with id 0-1-307. While it was
staging files, a temporary GridFTP error occurred on
OSG_LIGO_MIT, so 0-1-307 never resulted in any execute2
event, and in the end the whole workflow failed. My
understanding is that in Swift, execution.retries only sets
the number of retries for execute2 events; is that right? If
so, how can this kind of error currently be avoided or
handled? Is there a way to deal with it in Swift now?
Thanks,
Xi