[Swift-devel] jobthrottle value does not correspond to number of parallel jobs on local provider
Michael Wilde
wilde at mcs.anl.gov
Tue Oct 23 12:23:34 CDT 2012
Hi Ketan,
In the log you attached I see this:
<profile key="jobThrottle" namespace="karajan">0.10</profile>
<profile namespace="karajan" key="initialScore">100000</profile>
You should leave initialScore constant, set to a large number, no matter what level of manual throttling you want to specify via sites.xml. We always use 10000 for this value. Don't vary initialScore for manual throttling: just use jobThrottle to set the level you want.
A jobThrottle value of 0.10 should run 11 jobs in parallel: (jobThrottle * 100) + 1. The extra 1 is there for historical reasons related to the automatic throttling algorithm.
If you are seeing fewer than that, one common cause is that the ratio of your input staging times to your job run times is so high that Swift cannot keep the expected number of jobs in the active state at once.
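To make the arithmetic concrete, here is a small Python sketch of the rule (jobThrottle * 100) + 1 and its inverse. The function names are hypothetical, and the rounding is an assumption added to avoid floating-point noise; how Swift itself rounds is not stated here.

```python
def throttle_to_jobs(job_throttle: float) -> int:
    """Expected parallel jobs for a jobThrottle value: (jobThrottle * 100) + 1."""
    # round() guards against float noise such as 0.07 * 100 == 7.000000000000001.
    # Assumption: Swift's own rounding behavior may differ.
    return round(job_throttle * 100) + 1


def jobs_to_throttle(jobs: int) -> float:
    """Inverse of the rule above: the jobThrottle expected to give `jobs` parallel jobs."""
    if jobs < 1:
        raise ValueError("jobs must be at least 1")
    return round((jobs - 1) / 100, 2)


if __name__ == "__main__":
    # Target job counts from the 32-core experiment in the quoted message.
    for n in (8, 16, 24, 32):
        print(f"{n} jobs -> jobThrottle {jobs_to_throttle(n):.2f}")
```

Under this rule the 8- and 16-job targets map to 0.07 and 0.15, which matches the values already tried in sites.xml, so the throttle values themselves look right.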
I suggest you test the throttle behavior with a simple app script like "catsnsleep" (catsn with an artificial sleep to increase job duration). If your settings (sites + cf) work for that test, then they should work for the real app, within the staging constraints. Using CDM "direct" mode is likely what you want here to eliminate unnecessary staging on a local cluster.
In your test, what was this ratio? Can you also post your cf file and the progress log from stdout/stderr?
- Mike
----- Original Message -----
> From: "Ketan Maheshwari" <ketancmaheshwari at gmail.com>
> To: "Swift Devel" <swift-devel at ci.uchicago.edu>
> Sent: Tuesday, October 23, 2012 10:34:25 AM
> Subject: [Swift-devel] jobthrottle value does not correspond to number of parallel jobs on local provider
> Hi,
>
>
> I am trying to run an experiment on a 32-core machine with the hope of
> running 8, 16, 24 and 32 jobs in parallel. I am trying to control
> these numbers of parallel jobs by setting the Karajan jobthrottle
> values in sites.xml to 0.07, 0.15, and so on.
>
>
> However, it seems that the values are not corresponding to what I see
> in the Swift progress text.
>
>
> Initially, when I set jobthrottle to 0.07, only 2 jobs started in
> parallel. Then I added the line setting "Initialscore" value to 10000,
> which improved the jobs to 5. After this a 10-fold increase in
> "initialscore" did not improve the jobs count.
>
>
> Furthermore, a new batch of 5 jobs gets started only when *all* jobs
> from the old batch are over as opposed to a continuous supply of jobs
> from "site selection" to "stage out" state which happens in the case
> of coaster and other providers.
>
>
> The behavior is the same in Swift 0.93.1 and latest trunk.
>
>
>
> Thank you for any clues on how to set the expected number of parallel
> jobs to these values.
>
>
> Please find attached one such log of this run.
> Thanks, --
> Ketan
>
>
>
--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory