[Swift-devel] Re: replication vs site score

Thu Apr 9 12:32:11 CDT 2009

Mihael Hategan wrote:
> On Thu, 2009-04-09 at 10:04 -0700, Ioan Raicu wrote:
>   
>> There are use cases where static resource allocation is better than
>> dynamic allocation. Again, we come back to the BG/P system. There is a
>> policy that only allows you to submit X jobs to Cobalt, where X < 10.
>> Now, if you want to allocate resources dynamically, in smaller chunks,
>> you are limited to only a few jobs. Static provisioning all of a
>> sudden seems attractive.
>>     
>
> It's a valid scenario and a valid solution, but asserting that it's the
> only solution or that it's the best solution seems inappropriate.
>
> A better solution is to dynamically allocate workers in larger blocks if
> you don't have arbitrary granularity on the allocation sizes. It
> provides the middle ground that meets the scheduling system constraints
> and minimizes inefficiencies.
>   
When you allocate 1 node at a time in dynamic provisioning, it's trivial
to de-allocate nodes/workers with a timeout. When you allocate N nodes
in a single job, where N>1, de-allocation is no longer trivial. If a
worker simply de-allocates itself (exits the process), that node remains
allocated from the LRM's perspective, but is de-allocated from the
Coaster/Falkon perspective. Only when all N workers have exited are the
N nodes released to the LRM. That is potentially a great deal of
wastage. A better solution would be a centralized manager that keeps
track of an entire job (N nodes) and its utilization, and decides to
de-allocate all N nodes at the same time, from both Coaster/Falkon and
the LRM. Falkon doesn't support this, unfortunately. Does Coaster
support this? If not, then I'd say that dynamic resource provisioning
has to be kept to single-node jobs, rather than bunching multiple node
requests together into one job. This will obviously limit the use of
dynamic provisioning for large-scale runs to LRMs that support a large
number of job submissions, proportional to the scale of the runs.
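
To make the wastage argument concrete, here is a minimal Python sketch.
It is not how Falkon or Coaster work internally; the names
wasted_node_seconds and BlockManager, and the idle-timeout policy, are
assumptions made up for illustration. The first function counts the
node-seconds the LRM keeps charged after workers in one N-node job have
already exited. The class sketches the kind of centralized manager
described above, which releases the whole block only once every node has
been idle for the timeout.

```python
def wasted_node_seconds(exit_times):
    """Idle node-seconds still held by the LRM for one multi-node job.

    exit_times: for each worker in the job, the time (in seconds since
    job start) at which that worker exits. The LRM releases the block
    only when the last worker exits, so every earlier exit is waste.
    """
    if not exit_times:
        return 0
    release_time = max(exit_times)
    return sum(release_time - t for t in exit_times)


class BlockManager:
    """Hypothetical central manager for one block of N nodes."""

    def __init__(self, n_nodes, idle_timeout):
        self.idle_timeout = idle_timeout
        # None means the node is busy; otherwise the time it went idle.
        # All nodes start out idle at time 0.
        self.idle_since = {node: 0.0 for node in range(n_nodes)}

    def mark_busy(self, node):
        self.idle_since[node] = None

    def mark_idle(self, node, now):
        self.idle_since[node] = now

    def should_release(self, now):
        """True only when every node has been idle for the full timeout."""
        return all(
            since is not None and now - since >= self.idle_timeout
            for since in self.idle_since.values()
        )
```

For example, a 3-node job whose workers exit at 10 s, 20 s and 100 s
holds 90 + 80 + 0 = 170 idle node-seconds before the LRM gets the nodes
back. With a manager like the one sketched, the block would instead be
handed back as a unit, from both sides at once.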

Don't get me wrong, I think dynamic resource provisioning is the best
approach in general, especially when workloads vary in load and you
have an infrastructure that supports it (e.g. TeraGrid). But with the
current implementations that I am aware of (Falkon, and maybe Coaster
as well), it's not suitable for systems like the BG/P.
>   
>> Another thing you have to remember is that on some systems, like
>> the BG/P, getting 2 allocations of 64 nodes each is not the same as
>> getting 1 allocation of 128 nodes. The single allocation of 128
>> nodes has networking configured in such a way as to allow efficient
>> node-to-node communication. The 2 separate allocations could be
>> placed at completely opposite ends of the system, and hence have
>> poor networking properties for node-to-node communication between
>> the separate allocations (if it's even possible; I am not sure, the
>> networks might be completely separate). This might not be important
>> for vanilla Swift, but some of the MTDM work (previously known as
>> collective I/O) relies on good network connectivity between any nodes
>> in the allocation to pass data around while avoiding GPFS.
>>     
>
> I'm not sure what dynamic vs. static allocation of workers, in
> principle, has to do with the implementation hurdles of CIO on the BG/P.
>   
It has to do with the fact that if the network interconnect is important
(as is the case for MTDM), then submitting multiple independent jobs
to the LRM is detrimental to the overall performance of the application,
compared to submitting a single job. If the jobs are submitted to the
LRM independently, there is no guarantee on their placement or their
proximity to each other (node-wise).
> Different systems have different constraints. Dynamic allocation can be
> made to adapt to those constraints.
>   
But after adapting it enough, it's going to look like static provisioning.
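
A rough way to see this, as a sketch only: if the LRM caps the number of
queued jobs, the smallest block size that can still reach a target node
count is the target divided by the cap, rounded up. The function name
min_block_size and the node counts below are illustrative assumptions,
not measurements from the BG/P.

```python
import math


def min_block_size(target_nodes, max_jobs):
    """Smallest nodes-per-job that reaches target_nodes within max_jobs jobs."""
    if target_nodes < 0 or max_jobs < 1:
        raise ValueError("target_nodes must be >= 0 and max_jobs >= 1")
    return math.ceil(target_nodes / max_jobs)
```

With a cap of 8 jobs, reaching 4096 nodes needs blocks of 512 nodes
each; with a cap of 1 job, the "dynamic" block is the entire 4096-node
allocation, which is static provisioning in all but name.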

See this paper
http://pegasus.isi.edu/publications/2008/JuveG-ResourceProvisioningOptions.pdf

which discusses the various approaches to resource provisioning. You
will find that some systems support static provisioning, others support
dynamic provisioning, and others support both. It shows that there are
clear use cases for one, the other, or both.

Ioan
-- 
===================================================
Ioan Raicu, Ph.D.
===================================================
Distributed Systems Laboratory
Computer Science Department
University of Chicago
1100 E. 58th Street, Ryerson Hall
Chicago, IL 60637
===================================================
Email: iraicu at cs.uchicago.edu
Web:   http://www.cs.uchicago.edu/~iraicu
http://dev.globus.org/wiki/Incubator/Falkon
http://dsl-wiki.cs.uchicago.edu/index.php/Main_Page
===================================================
