[Swift-devel] [Bug 72] Campaign for scaling wf up to 244 molecules
Ian Foster
foster at mcs.anl.gov
Tue Jul 17 22:35:24 CDT 2007
Great! What resource acquisition policy are you using?
>> b) Why did it take so long to get all of the workers working?
> I finally had enough confidence in the dynamic resource provisioning
> that we won't lose any jobs across resource allocation boundaries
> (ran lots of tests and they were all positive), so I enabled it for
> this run. I set the max to be the entire ANL site (274 processors)...
> and we got 146 at the beginning, and over time the number of processors
> kept increasing up to a peak of 208 or so; the rest, up to 274, were
> queued up in the PBS wait queue. The difference between the beginning
> with 146 and the end with 208 was that others who were in the system
> at the beginning finished their work and released some nodes, and idle
> processors went from the wait queue into the run queue. I would
> actually be curious to try out the latest DRP stuff on a busy site,
> such as Purdue or NCSA, and to see if we can maintain a nice pool size
> over a period of time, despite the sites being busy...
>
> BTW, in the previous runs for MolDyn, we normally set the min and max
> to, say, 100 or 200 processors, and we would wait until we
> had all of them before we started. Sometimes this meant waiting
> 12 to 24 hours for enough nodes to become free so the large job could
> start. With DRP, you can start off with whatever the site has
> available, and you get more with time as your jobs make it through the
> wait queue and other jobs that are running complete...
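
The difference between the two policies described above (wait for a fixed allocation versus start with whatever is available and grow) can be illustrated with a toy model. This is only a sketch under stated assumptions, not the actual provisioning code: the function name completion_time, the one-task-per-processor-per-step cost model, the 1000-task workload, and the intermediate pool sizes 170 and 190 are all hypothetical. Only the starting size of 146 and the peak of 208 come from the run described in the thread.

```python
def completion_time(total_tasks, pool_over_time, min_to_start):
    """Return how many time steps it takes to finish total_tasks.

    pool_over_time[t] is the number of processors held at step t; the last
    value is assumed to persist. Toy cost model: each held processor
    finishes exactly one task per step. No work is done until the pool has
    reached min_to_start at least once.
    """
    if not pool_over_time:
        raise ValueError("pool_over_time must not be empty")
    remaining = total_tasks
    started = False
    step = 0
    while remaining > 0:
        last = step >= len(pool_over_time) - 1
        pool = pool_over_time[-1] if last else pool_over_time[step]
        if pool >= min_to_start:
            started = True
        if started:
            remaining -= pool
        # Once the pool has stopped changing, give up if no progress is possible.
        if last and remaining > 0 and (not started or pool <= 0):
            raise ValueError("workload can never complete with this pool")
        step += 1
    return step


# Hypothetical pool growth: 146 and 208 are from the thread, 170 and 190 are made up.
pool = [146, 170, 190, 208]

# Dynamic policy: start as soon as any processor is available.
dynamic_steps = completion_time(1000, pool, min_to_start=1)

# Static policy: wait until at least 200 processors are held before starting.
static_steps = completion_time(1000, pool, min_to_start=200)

print(dynamic_steps, static_steps)  # prints "6 8"
```

In this toy setup the dynamic policy finishes two steps earlier because the processors held during the first three steps do useful work instead of sitting idle until the full allocation arrives.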