[Swift-devel] [Bug 72] Campaign for scaling wf up to 244 molecules

Mihael Hategan hategan at mcs.anl.gov
Sun Jul 1 02:11:49 CDT 2007


On Sun, 2007-07-01 at 01:53 -0500, Mihael Hategan wrote:
> On Sun, 2007-07-01 at 02:10 +0000, Ian Foster wrote:
> > Why do you say the workflow's size was "impossible"? It doesn't seem that large to me. We'd like to run larger ones!
> 
> Most certainly so. However, we want to make use of loops rather than
> generating large swift files.

OK, I see. I meant the impossible size of the source file. We clearly want
workflows with that many jobs to run smoothly. I just don't think
large source files (whether Swift or Karajan) are a good way to do it.
I'm quite (pleasantly) surprised that Swift/Karajan can load and run XML
files with 1M+ lines.

Of course, that doesn't mean we shouldn't try to fix the problems that
might arise with large source files if possible.

> 
> > -----Original Message-----
> > From: bugzilla-daemon at mcs.anl.gov
> > 
> > Date: Sat, 30 Jun 2007 17:52:07 
> > To:swift-devel at ci.uchicago.edu
> > Subject: [Swift-devel] [Bug 72] Campaign for scaling wf up to 244 molecules
> > 
> > 
> > http://bugzilla.mcs.anl.gov/swift/show_bug.cgi?id=72
> > 
> > 
> > 
> > 
> > 
> > ------- Comment #6 from hategan at mcs.anl.gov  2007-06-30 17:52 -------
> > (In reply to comment #4)
> > > Hi again,
> > > Here is an update of yesterday's 244 molecule run.  The experiment ran further
> > > than before, but it still did not complete.  There were 240 molecules that
> > > completed successfully (in the previous run, no molecule finished), but 4
> > > molecules still did not finish. 
> > > 
> > 
> > Actually it looks like the tasks worked fine:
> > bash-3.1$ cat MolDyn-244-63ar6atbg2ae1.log |grep "type=1.*ubmitted"|wc
> >   24309  243090 2806214
> > bash-3.1$ cat MolDyn-244-63ar6atbg2ae1.log |grep "type=1.*ailed"|wc
> >    3614   36140  405816
> > bash-3.1$ cat MolDyn-244-63ar6atbg2ae1.log |grep "type=1.*ompleted"|wc
> >   20695  206950 2389556
> > 
> > All tasks are accounted for. It may be that some jobs failed 3 times in a row.
> > From the logs it looks like the workflow almost finished and it got to the
> > point where the error reporting was to be done. Perhaps the stack overflow that
> > you saw occurred there, and perhaps the impossible size of the workflow might
> > have something to do with it.
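
The accounting above can be double-checked with a short script. This is
only a sketch in Python: the three patterns are copied from the grep
commands quoted above, the function names (tally, unaccounted) are made up
for illustration, and nothing is assumed about the log format beyond what
those patterns match. The counts are the first column of each wc output.

```python
import re

# Patterns copied from the grep commands in the thread. The first letter of
# each status word is left off there, so either capitalization matches.
PATTERNS = {
    "submitted": re.compile(r"type=1.*ubmitted"),
    "failed": re.compile(r"type=1.*ailed"),
    "completed": re.compile(r"type=1.*ompleted"),
}


def tally(lines):
    """Count the lines matching each pattern, like grep piped into wc -l."""
    counts = {name: 0 for name in PATTERNS}
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


def unaccounted(counts):
    """Submitted tasks that are recorded as neither failed nor completed."""
    return counts["submitted"] - counts["failed"] - counts["completed"]


# Counts reported in the thread (first column of each wc output).
reported = {"submitted": 24309, "failed": 3614, "completed": 20695}
print(unaccounted(reported))  # prints 0: 3614 + 20695 == 24309
```

To run the same check against an actual log, pass the open file to
tally(), which accepts any iterable of lines, and feed the result to
unaccounted().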
> > 
> > 
> > _______________________________________________
> > Swift-devel mailing list
> > Swift-devel at ci.uchicago.edu
> > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel
> 