[Swift-devel] Need precise throttle on local provider

wilde at mcs.anl.gov wilde at mcs.anl.gov
Tue Sep 28 19:29:27 CDT 2010


Mihael,

I have the need (for the Swift R interface) to either throttle the local execution provider to run *exactly* one job at a time, or to enhance the provider to set a SWIFT_JOB_SLOT env var to a value that signifies a virtual "slot number" for N concurrent jobs being run by the provider.

I use this env var to associate Swift jobs with persistent R evaluation servers that need to run serially: they can handle only one job at a time.
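For illustration only, the slot idea can be sketched in a few lines of Python. This is not the worker.pl change or the Java provider code. The `SlotPool` class and its `run` method are hypothetical names; only the SWIFT_JOB_SLOT variable comes from the description above. The sketch assumes slots are numbered 0..N-1 and that a job holds its slot until it exits.

```python
import os
import queue
import subprocess


class SlotPool:
    """Hands out slot numbers 0..n-1; each slot is held by at most one job."""

    def __init__(self, n):
        self._free = queue.Queue()
        for slot in range(n):
            self._free.put(slot)

    def run(self, argv):
        # Block until some slot is free, so at most n jobs run at once.
        slot = self._free.get()
        try:
            # Export the slot number to the job (assumed variable name
            # SWIFT_JOB_SLOT, as proposed in this message).
            env = dict(os.environ, SWIFT_JOB_SLOT=str(slot))
            return subprocess.run(argv, env=env, capture_output=True, text=True)
        finally:
            # Return the slot only after the job has exited.
            self._free.put(slot)
```

Because a slot goes back into the queue only after its job exits, two jobs can never see the same slot number at the same time, which is the property a serial R evaluation server needs.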

I've modified the coaster worker.pl script to do this, and it works very well.

I'm now trying to get the same behavior from the local execution provider. Rather than tackle inserting this into the Java provider code, I tried the shortcut of configuring a small set of local provider pool entries, each with the throttle set to what I *thought* would guarantee no more than one job at a time running on each "pool":

    <profile key="jobThrottle" namespace="karajan">-0.001</profile>
    <profile namespace="karajan" key="initialScore">10000</profile>

I thought that the correct value for jobThrottle would be 0.0 to ensure 1 job, but from experimentation I found that I needed to set it to a slightly negative value, as above (-0.001).

But it seems that even this is not sufficient: under heavy load, I'm seeing a second job start on the same pool before the prior job has completed (I use "mkdir" as a pseudo-mutex, and I'm running on a local filesystem under /tmp).
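A minimal sketch of the mkdir pseudo-mutex check, in Python, assuming a local POSIX filesystem where mkdir either creates the directory or fails because it already exists. The function names and the lock directory path are hypothetical, not taken from the actual test setup.

```python
import os


def try_lock(lock_dir):
    """Return True if we created lock_dir (and so hold the lock), else False."""
    try:
        os.mkdir(lock_dir)
        return True
    except FileExistsError:
        # Another job already holds the lock for this pool.
        return False


def unlock(lock_dir):
    """Release the lock by removing the directory."""
    os.rmdir(lock_dir)
```

A job would call `try_lock` on a per-pool directory (for example, a hypothetical `/tmp/pool0.lock`) at startup and report an overlap if it returns False, then call `unlock` on exit.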

So, my first question is: Is there some set of throttling or other sites.xml entries that will ensure <= 1 job per local provider pool?

Second question: If you can point me to the right place, Justin or I could do this the "right" way by modifying the local execution provider to set "SLOT" numbers.  I initially thought the current hack would be easier, and it seemed to work under standalone testing, but it seems to be failing now in the live setting.

Thanks,

Mike




-- 
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory



