[Swift-devel] bug 53
Mihael Hategan
hategan at mcs.anl.gov
Fri Sep 14 22:25:21 CDT 2007
On Thu, 2007-09-13 at 16:41 -0500, Mihael Hategan wrote:
> Ok, so there's something in.
That something was throttling a bit too much (not just jobs, but all
tasks on that site). I need to take a second look at it.
Mihael
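
For reference, the arithmetic behind the maxSubmitRate value quoted below can be sketched in a few lines of Python. This is only an illustration of what "jobs per second" implies for spacing between submissions; it is not how Swift or Karajan implements the throttle, and the function names (submit_interval, earliest_submit_times) are made up for this example.

```python
def submit_interval(max_submit_rate: float) -> float:
    """Minimum spacing in seconds between submissions for a rate in jobs/second."""
    if max_submit_rate <= 0:
        raise ValueError("max_submit_rate must be positive")
    return 1.0 / max_submit_rate


def earliest_submit_times(n_jobs: int, max_submit_rate: float, start: float = 0.0) -> list:
    """Earliest time (seconds after start) at which each of n_jobs may be submitted.

    Assumes a strict, evenly spaced throttle with no burst allowance.
    """
    interval = submit_interval(max_submit_rate)
    return [start + i * interval for i in range(n_jobs)]


if __name__ == "__main__":
    # 0.1 jobs/second is the value used in the sites.xml example.
    print(submit_interval(0.1))
    print(earliest_submit_times(3, 0.1))
```

Under that strict-spacing assumption, the 340-job stage mentioned further down in the thread could not finish submitting sooner than 339 intervals after the first job, i.e. 3390 seconds at a rate of 0.1.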
> There are some discussions that can be had on certain aesthetic topics.
> In any event, in sites.xml, you can add, for a site, something like
> this:
>
> <profile namespace="karajan" key="maxSubmitRate">0.1</profile>
>
> The rate is in jobs per second. The above would mean one job every ten
> seconds.
>
> Mihael
>
> On Thu, 2007-09-13 at 15:23 +0000, Ben Clifford wrote:
> > Yes?
> >
> > On Thu, 13 Sep 2007, Mihael Hategan wrote:
> >
> > > May I still fix that bug though?
> > >
> > > On Thu, 2007-09-13 at 09:54 -0500, Ioan Raicu wrote:
> > > > Hi,
> > > > I am still working on the new feature for Falkon to avoid submitting
> > > > tasks to known bad nodes, and to perhaps do its own retries for failed
> > > > > jobs with certain known errors (e.g. stale NFS handle). I should have
> > > > that ready for next week to try out. Once this new feature is in, we
> > > > could try MolDyn again to see how it behaves.
> > > >
> > > > > About avoiding Falkon for MolDyn, I recall something about the
> > > > > scalability/policies of GRAM/PBS to handle many concurrent jobs,
> > > > > having to throttle job submissions to something around 1 job every 10
> > > > > seconds (for sustained periods of time; short bursts could send
> > > > > faster), and the fact that only a few tens of nodes would be used
> > > > concurrently, even though the sites that it was running on had more
> > > > free nodes. I also think that MolDyn through GRAM/PBS was running
> > > > only 1 job per node, in essence only using 1 processor of the 2 per
> > > > node. I think the largest workflow Nika was able to run over GRAM/PBS
> > > > was 5 molecules, 421 jobs (but only 340 jobs in the large stage).
> > > > Nika, were there other problems you encountered?
> > > >
> > > > Ioan
> > > >
> > > > Mihael Hategan wrote:
> > > > > Very well Sir. I shall see to the priority of the issue being raised.
> > > > >
> > > > > On Thu, 2007-09-13 at 14:09 +0000, Ben Clifford wrote:
> > > > >
> > > > > > I think one of the main impediments to moldyn running with GRAM directly
> > > > > > > is bug 53, which is a request for submission rate limiting.
> > > > > >
> > > > > > It might be relatively easy to implement that and see how the MolDyn
> > > > > > workflow behaves then.
> > > > > >
> > > > > > I'm interested to see if Falkon can be avoided for this workflow.
> > > > > >
> > > > > >
> > > > >
> > > > > _______________________________________________
> > > > > Swift-devel mailing list
> > > > > Swift-devel at ci.uchicago.edu
> > > > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel
> > > > >
> > > > >
> > > >
> > > > --
> > > > ============================================
> > > > Ioan Raicu
> > > > Ph.D. Student
> > > > ============================================
> > > > Distributed Systems Laboratory
> > > > Computer Science Department
> > > > University of Chicago
> > > > 1100 E. 58th Street, Ryerson Hall
> > > > Chicago, IL 60637
> > > > ============================================
> > > > Email: iraicu at cs.uchicago.edu
> > > > Web: http://www.cs.uchicago.edu/~iraicu
> > > > http://dsl.cs.uchicago.edu/
> > > > ============================================
> > > > ============================================
> > >
> > >
> >
>