[Swift-devel] [Bug 72] Campaign for scaling wf up to 244 molecules
bugzilla-daemon at mcs.anl.gov
Sun Jul 1 00:09:12 CDT 2007
http://bugzilla.mcs.anl.gov/swift/show_bug.cgi?id=72
------- Comment #7 from iraicu at cs.uchicago.edu 2007-07-01 00:09 -------
(In reply to comment #6)
> (In reply to comment #4)
> > Hi again,
> > Here is an update of yesterday's 244 molecule run. The experiment ran further
> > than before, but it still did not complete. There were 240 molecules that
> > completed successfully (in the previous run, no molecule finished), but 4
> > molecules still did not finish.
> >
>
> Actually, it looks like the tasks worked fine:
> bash-3.1$ cat MolDyn-244-63ar6atbg2ae1.log |grep "type=1.*ubmitted"|wc
> 24309 243090 2806214
> bash-3.1$ cat MolDyn-244-63ar6atbg2ae1.log |grep "type=1.*ailed"|wc
> 3614 36140 405816
> bash-3.1$ cat MolDyn-244-63ar6atbg2ae1.log |grep "type=1.*ompleted"|wc
> 20695 206950 2389556
>
> All tasks are accounted for. It may be that some jobs failed 3 times in a row.
> From the logs it looks like the workflow almost finished and it got to the
> point where the error reporting was to be done. Perhaps the stack overflow that
> you saw occurred there, and perhaps the impossible size of the workflow might
> have something to do with it.
>
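The three grep counts quoted above can be cross-checked in one pass over the log. The following is a minimal sketch, not part of Swift or Falkon: it reuses the grep patterns from the quoted commands, and the function names `count_task_events` and `all_accounted_for` are made up for illustration. It assumes, as the greps do, that each relevant log line contains `type=1` followed by a status word.

```python
import re
from collections import Counter

# Patterns copied from the quoted grep commands. The first letter of each
# status word is left out there, so either capitalization matches.
PATTERNS = {
    "submitted": re.compile(r"type=1.*ubmitted"),
    "failed": re.compile(r"type=1.*ailed"),
    "completed": re.compile(r"type=1.*ompleted"),
}


def count_task_events(lines):
    """Count matching lines per pattern, like running each grep | wc -l."""
    counts = Counter({name: 0 for name in PATTERNS})
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


def all_accounted_for(counts):
    """True when every submitted task either failed or completed."""
    return counts["submitted"] == counts["failed"] + counts["completed"]


# Check using the numbers reported in the quoted comment.
reported = {"submitted": 24309, "failed": 3614, "completed": 20695}
print(all_accounted_for(reported))  # True, since 3614 + 20695 = 24309
```

To run it against the real log, pass an open file object to `count_task_events`, for example `count_task_events(open("MolDyn-244-63ar6atbg2ae1.log"))`.
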
The same machine (tg-v024) that we had trouble with before acted up again; I
should have removed it before we started the experiment. If that diagnosis is
the consensus, we can certainly try the run again and make sure this machine
is not in the resource pool. Another idea is to increase the retry count from
3 to something higher, maybe 10 or 30. Jobs can be resubmitted relatively
quickly with Falkon, so retrying many times is not a big overhead, except that
it takes longer for Swift to give up!
Ioan
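
The two options discussed above (dropping tg-v024 from the pool, or raising the retry limit) can be illustrated with a small sketch. This is not Swift or Falkon code: `run_with_retries`, its parameters, the round-robin host choice, and the host name `other-host` are all assumptions made for illustration. The limit is treated here as a total number of attempts, which may differ from how Swift counts retries.

```python
def run_with_retries(task, hosts, max_attempts=3, excluded=frozenset()):
    """Try `task` on hosts in round-robin order, skipping excluded hosts.

    `task` is a callable taking a host name and returning True on success.
    Returns the number of attempts used, or None if every attempt failed.
    """
    usable = [h for h in hosts if h not in excluded]
    if not usable:
        raise ValueError("no usable hosts left after exclusion")
    for attempt in range(max_attempts):
        host = usable[attempt % len(usable)]
        if task(host):
            return attempt + 1
    return None  # gave up after max_attempts failures


# Hypothetical task that always fails on the misbehaving machine.
def task(host):
    return host != "tg-v024"


# With only the bad host available, all attempts are spent and the job fails.
print(run_with_retries(task, ["tg-v024"], max_attempts=3))
# With the bad host excluded, the first attempt succeeds.
print(run_with_retries(task, ["tg-v024", "other-host"], excluded={"tg-v024"}))
```

A higher `max_attempts` only helps when some attempts can land on a healthy host; a job that keeps hitting the same bad machine just takes longer to be reported as failed.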