[Swift-devel] several alternatives to design the data management system for Swift on SuperComputers

Ian Foster foster at anl.gov
Mon Dec 1 21:43:28 CST 2008


Dear All:

I am finding it hard to sort through this chain of emails, but I  
wanted to make a couple of points.

Zhao, Allan, Ioan, et al., have demonstrated considerable benefits  
from applying two methods to Swift-like workloads on the BG/P:

a) "Storage hierarchy": the use of federated per-node storage (RAM on  
BG/P, could be local disk on other systems) as an "intermediate file  
system" layer in the storage hierarchy between the ultra-fast but low- 
capacity local storage and the high-capacity but slower GPFS.

b) "Collective I/O": improving performance between intermediate file  
system and GPFS by aggregating many small operations into fewer large  
operations.
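
As a rough illustration only (not the actual Swift/Falkon code; the
mount points, file counts, and batch size are made-up placeholders),
the two methods amount to something like this in Python:

    #!/usr/bin/env python
    # Illustrative sketch only; not the actual Swift/Falkon implementation.
    # Assumes a POSIX node-local mount (/dev/shm, i.e. RAM on the BG/P) and
    # a GPFS mount point; both paths are placeholders.
    import os
    import tarfile

    LOCAL = "/dev/shm/job-scratch"     # fast, low-capacity, per-node
    GPFS = "/gpfs/scratch/output"      # high-capacity, shared, slower

    def run_task(task_id):
        # (a) Storage hierarchy: intermediate files go to node-local
        # storage instead of hitting GPFS once per small file.
        path = os.path.join(LOCAL, "out.%06d" % task_id)
        with open(path, "w") as f:
            f.write("result of task %d\n" % task_id)
        return path

    def flush_to_gpfs(paths, batch_id):
        # (b) Collective I/O: aggregate many small outputs into one
        # large, sequential write to GPFS.
        archive = os.path.join(GPFS, "batch-%04d.tar" % batch_id)
        with tarfile.open(archive, "w") as tar:
            for p in paths:
                tar.add(p, arcname=os.path.basename(p))

    if __name__ == "__main__":
        os.makedirs(LOCAL, exist_ok=True)
        os.makedirs(GPFS, exist_ok=True)
        outputs = [run_task(i) for i in range(1000)]   # 1000 small files stay local
        flush_to_gpfs(outputs, batch_id=0)             # one large GPFS operation

The point is simply that GPFS sees one large archive write per batch
rather than thousands of small file creates.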

These are both well-known, extensively studied, and proven methods.  
Furthermore, we have some nice performance data that allows us to  
quantify their benefits in our specific situation. Perhaps it would be  
worth looking at the methods from that perspective.

Ian.


On Dec 1, 2008, at 9:32 PM, Ioan Raicu wrote:

> But it's not just about directories and GPFS locking... it's about 8
> or 16 large servers with 10 Gb/s network connectivity (as is the case
> for GPFS) compared to potentially 40K servers, each with 1 Gb/s
> connectivity (as would be the case in our example). The potential
> raw throughput of the latter case, when we use all 40K nodes as
> servers to the file system, is orders of magnitude larger than a
> static configuration with 8 or 16 servers. It's not yet clear we can
> actually achieve anything close to the upper bound of performance at
> full scale, but it should be obvious that the performance
> characteristics will be quite different between GPFS and CIO.
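
As a back-of-the-envelope check on that comparison, using only the link
counts and speeds quoted above (raw upper bounds on aggregate bandwidth,
not achievable throughput):

    # Raw aggregate link bandwidth; upper bounds only, numbers from the text above.
    gpfs_servers, gpfs_link_gbps = 16, 10      # 8-16 GPFS servers at 10 Gb/s each
    nodes, node_link_gbps = 40000, 1           # 40K compute nodes at 1 Gb/s each

    gpfs_peak = gpfs_servers * gpfs_link_gbps  # 160 Gb/s
    cio_peak = nodes * node_link_gbps          # 40,000 Gb/s, i.e. 40 Tb/s

    print("ratio: %dx" % (cio_peak // gpfs_peak))  # ~250x, over two orders of magnitude
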
>
> Ioan
>
> Mihael Hategan wrote:
>>
>> On Mon, 2008-12-01 at 17:10 -0600, Ioan Raicu wrote:
>>
>>> Mihael Hategan wrote:
>>>
>>>> On Mon, 2008-12-01 at 16:52 -0600, Ioan Raicu wrote:
>>>>
>>>>
>>>> ...
>>>>
>>>>
>>>>
>>>>> I don't think you realize how expensive GPFS access is at
>>>>> 100K-CPU scale.
>>>>>
>>>>>
>>>> I don't think I understand what you mean by "access". As I said,  
>>>> things
>>>> that generate contention are going to be slow.
>>>>
>>>> If the problem requires that contention, then it doesn't matter
>>>> what the solution is. If it does not, then I suspect that there is
>>>> a way to avoid contention in GPFS, too (sticking things in
>>>> different directories).
>>>>
>>>>
>>> The basic idea is that many smaller shared file systems will scale
>>> better than 1 large file system, as the contention is localized.
>>>
>> Which is the same behaviour you get if you have a hierarchy of
>> directories. This is what Ben implemented in Swift.
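
For concreteness, that directory-hierarchy trick amounts to something
like the sketch below (not Ben's actual code; the hash and the two-level
fan-out are arbitrary choices):

    # Spread per-job working directories across a two-level tree so that no
    # single GPFS directory (and its lock) takes all of the create/stat traffic.
    # Sketch only; the hash and fan-out are arbitrary.
    import hashlib
    import os

    def job_dir(root, job_id):
        h = hashlib.md5(job_id.encode()).hexdigest()
        path = os.path.join(root, h[:2], h[2:4], job_id)  # e.g. root/3f/a1/job-00042/
        os.makedirs(path, exist_ok=True)
        return path
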
>>
>>
>>>  The problem is that having 1 global namespace is simple and
>>> straightforward, but having N local namespaces is not, and requires
>>> extra management.
>>>
>> Right. That's why most filesystems I know of treat directories as
>> independent files containing file metadata (aka. "local namespaces").
>>
>>
>>
>
> -- 
> ===================================================
> Ioan Raicu
> Ph.D. Candidate
> ===================================================
> Distributed Systems Laboratory
> Computer Science Department
> University of Chicago
> 1100 E. 58th Street, Ryerson Hall
> Chicago, IL 60637
> ===================================================
> Email: iraicu at cs.uchicago.edu
> Web:   http://www.cs.uchicago.edu/~iraicu
> http://dev.globus.org/wiki/Incubator/Falkon
> http://dsl-wiki.cs.uchicago.edu/index.php/Main_Page
> ===================================================
> ===================================================
