[Swift-devel] Try coaster on BG/P ?
Michael Wilde
wilde at mcs.anl.gov
Thu Jun 19 18:56:50 CDT 2008
Cool - very helpful info. It will still be 1-2 weeks before I can try
this. Perhaps Zhao can try it sooner.
Re:
> The basic pattern is as long as there are more jobs ready to be run than
> there are workers, then more workers will be started
This is where, as you suggested earlier, for BGP-like systems it would
be best if the worker count was static and this dynamic startup disabled.
What Ioan did in Falkon when he went to the multiple-server architecture
is relevant here: the client load-shares among all the servers,
round-robin, only sending a job to a server when it knows that the
server has a free cpu slot. In this way, no queues build up on the
servers, and it avoids having a job wait in any server's queue when a
free cpu might be available on some other server.
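
To make that dispatch rule concrete, here is a rough Python sketch. It is
not Falkon's code; the names (Server, RoundRobinClient, job_done) are
made up for illustration, and it assumes the client can track each
server's free slots exactly. Jobs wait only in the client's queue and
are handed out round-robin to servers with a free cpu slot.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Server:
    """Hypothetical stand-in for one server with a fixed number of cpu slots."""
    name: str
    slots: int
    running: list = field(default_factory=list)

    def free_slots(self):
        return self.slots - len(self.running)


class RoundRobinClient:
    """Holds all waiting jobs itself so no queue builds up on any server."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.pending = deque()   # jobs wait here, never on a server
        self._next = 0           # index of the server to try first

    def submit(self, job):
        self.pending.append(job)
        self.dispatch()

    def dispatch(self):
        n = len(self.servers)
        while self.pending:
            # Walk the servers round-robin, starting after the last one used.
            for step in range(n):
                idx = (self._next + step) % n
                server = self.servers[idx]
                if server.free_slots() > 0:
                    server.running.append(self.pending.popleft())
                    self._next = (idx + 1) % n
                    break
            else:
                return  # no free slot anywhere; keep the jobs client-side

    def job_done(self, server, job):
        # A slot has opened up, so try to place a waiting job right away.
        server.running.remove(job)
        self.dispatch()
```

With servers of 2 and 1 slots and five submitted jobs, three run at once
and two stay in the client's queue until job_done frees a slot.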
- Mike
On 6/19/08 6:45 PM, Ben Clifford wrote:
> more notes on running this on a normal site:
>
> Set the jobThrottle parameter for a site based on the mechanism used to
> submit coasters - that is, for a site that is submitting through GRAM2,
> set the throttle to 0.2, which will limit you to 20 jobs at once, and
> likely cause just under 20 coaster workers to run. When GRAM4 works,
> you should be able to use a jobThrottle of 4 and get 400 workers to run (at
> least as far as the coaster-submission side of things).
>
> I have not done any tests (though Mihael might have) on how many coaster
> workers can run sensibly at once - most of my testing has been poking
> round at the low end of things.
>
> When you start running jobs, the timings will look a bit different -
> you'll see a much longer delay than usual when the first job goes into
> execute state, as this is when the coaster master starts up on the remote
> site and submits a worker into the queue. But once workers start actually
> executing, subsequent job executions will be faster (in as much as there
> should be no GRAM latency and no LRM latency).
>
> The basic pattern is as long as there are more jobs ready to be run than
> there are workers, then more workers will be started (but will be subject
> to LRM delays in starting up).
>