[Swift-devel] Re: replication vs site score

Ioan Raicu iraicu at cs.uchicago.edu
Thu Apr 9 12:34:26 CDT 2009



Mihael Hategan wrote:
> On Thu, 2009-04-09 at 10:12 -0700, Ioan Raicu wrote:
>> No workers sit idle waiting for other workers to start. The resource
>> allocation takes some time to boot the OS on each node, mount GPFS,
>> start the Falkon service, start the Falkon workers, etc.; see
>> http://dev.globus.org/wiki/Image:Falkon-BGP-startup-time.jpg. It's true
>> that there is some difference between the first worker starting and the
>> last worker starting, probably on the order of seconds to maybe minutes
>> at the largest scale of 160K processors. If the idle time during system
>> startup is a concern, you can start Swift before 100% of the system is
>> operational. The system is partitioned into 64-node chunks, so, in
>> theory, Swift could start as soon as 64 nodes are online, although this
>> could have its own problems.
>>     
>
> This assumes a single site and exact knowledge of how to fit the
> workload.
>   
Nope, it's a single site only if you want to start at the earliest
possible time. Once all nodes are started, it becomes a multi-site
allocation, where each site is a 64-node chunk of the allocation.
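
As a rough illustration of the chunking described here, the Python sketch below splits an allocation into 64-node "sites". It is not how Falkon or Swift actually implement this; the names Site, partition_into_sites, and sites_online are made up for the example, and it assumes nodes come online in index order.

```python
from dataclasses import dataclass
from typing import List

CHUNK_NODES = 64  # chunk size mentioned in this thread


@dataclass(frozen=True)
class Site:
    """Hypothetical 'site': one contiguous chunk of the allocation."""
    index: int
    first_node: int
    node_count: int


def partition_into_sites(total_nodes: int,
                         chunk_nodes: int = CHUNK_NODES) -> List[Site]:
    """Split total_nodes into consecutive chunks of at most chunk_nodes."""
    if total_nodes < 0 or chunk_nodes <= 0:
        raise ValueError("total_nodes must be >= 0 and chunk_nodes > 0")
    sites = []
    for index, first in enumerate(range(0, total_nodes, chunk_nodes)):
        # The last chunk may be smaller if total_nodes is not a multiple.
        count = min(chunk_nodes, total_nodes - first)
        sites.append(Site(index, first, count))
    return sites


def sites_online(nodes_online: int, chunk_nodes: int = CHUNK_NODES) -> int:
    """Complete chunks available so far (assumes in-order node startup)."""
    return nodes_online // chunk_nodes
```

For example, a 256-node allocation would yield 4 such sites, and with only 63 nodes online no complete chunk is ready yet, so an early start would have to wait for the 64th node.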
> I also assume this works when you have a reservation, otherwise you may
> have better chances with smaller chunks.
>   
Up to 8K cores, we usually run without reservations. Beyond that, we do 
get reservations.

-- 
===================================================
Ioan Raicu, Ph.D.
===================================================
Distributed Systems Laboratory
Computer Science Department
University of Chicago
1100 E. 58th Street, Ryerson Hall
Chicago, IL 60637
===================================================
Email: iraicu at cs.uchicago.edu
Web:   http://www.cs.uchicago.edu/~iraicu
http://dev.globus.org/wiki/Incubator/Falkon
http://dsl-wiki.cs.uchicago.edu/index.php/Main_Page
===================================================
