[Swift-devel] several alternatives to design the data management system for Swift on SuperComputers
Zhao Zhang
zhaozhang at uchicago.edu
Tue Dec 2 10:26:31 CST 2008
Mihael Hategan wrote:
> On Mon, 2008-12-01 at 23:24 -0600, Zhao Zhang wrote:
>
>> Hi, Mihael
>>
>> I think the attached graph could answer your question.
>>
>
> Not really. Is there a test with 8192 pre-created directories?
>
Nope. Why do you think there are 8192 pre-created directories for the
2-rack test? The case is not one unique dir for each worker, but one dir
per IO node for the CIO test.
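To make the mapping concrete: on the BG/P each IO node serves 256 compute
cores, so a worker can derive its IO node's output directory from its rank
alone. A minimal sketch of that idea in Python (the rank arithmetic and the
/gpfs/out mount point are hypothetical, not the actual CIO code):

    # Hypothetical sketch: map a worker rank to its IO node's output dir.
    # Assumes 256 workers per IO node (8192 cores -> 32 IO nodes) and a
    # made-up /gpfs/out mount; the real CIO code may differ.
    import os

    WORKERS_PER_IO_NODE = 256
    OUTPUT_ROOT = "/gpfs/out"  # hypothetical mount point

    def output_dir(rank):
        io_node = rank // WORKERS_PER_IO_NODE
        return os.path.join(OUTPUT_ROOT, "ionode-%03d" % io_node)

    # ranks 0..255 all land in /gpfs/out/ionode-000;
    # output_dir(5000) -> /gpfs/out/ionode-019
    os.makedirs(output_dir(5000), exist_ok=True)

With 32 such directories, create traffic is spread over 32 GPFS directory
inodes instead of all contending on one.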
zhao
>
>> All the tests were run on 2 racks (8k cores) with 8192 jobs. Each file
>> created by the test is 1KB.
>>
>> 1_DIR_5_FILE: all 8192 cores write 5 files each to 1 dir on GPFS; in
>> this test, within 300 seconds only 31 jobs returned successfully.
>> 32_DIR_5_FILE: all 8192 cores write 5 files each to the directory
>> unique to their IO node on GPFS; 8192 jobs took 91.026 seconds.
>> 1000_DIR_5_FILE: all 8192 cores write 5 files each to 1000
>> hierarchical directories on GPFS; 8192 jobs took 81.555 seconds.
>> 32_DIR_1_FILE: by batching the 5 output files, each core writes one
>> tarball to the directory unique to its IO node on GPFS; 8192 jobs took
>> 23.616 seconds.
>> CIO_5_FILE: with CIO, each core writes 5 files to IFS; 8192 jobs took
>> 12.007 seconds.
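To spell out what the batching in 32_DIR_1_FILE means: each core tars its
5 outputs and does a single large write to GPFS, so one file create
replaces five. A minimal sketch in Python (file names and paths are
hypothetical, not the actual test harness):

    # Hypothetical sketch of the 32_DIR_1_FILE batching: one tarball per
    # core means one GPFS create plus one streaming write, instead of 5
    # small creates hammering the directory metadata.
    import os
    import tarfile

    def batch_outputs(rank, local_files, gpfs_dir):
        tar_path = os.path.join(gpfs_dir, "core-%05d.tar" % rank)
        with tarfile.open(tar_path, "w") as tar:
            for f in local_files:
                tar.add(f, arcname=os.path.basename(f))

    # e.g. batch_outputs(7, ["out.1", "out.2", "out.3", "out.4", "out.5"],
    #                    "/gpfs/out/ionode-000")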
>>
>>
>> From this we can tell that 32_DIR_5_FILE doesn't slow performance down
>> much compared with 1000_DIR_5_FILE. And in this test case each task is
>> writing 5 files, while in the real case for CIO each IO node will write
>> one tarball at a time, so the performance of the two should be even
>> closer.
>>
>> So, in CIO we use a unique directory per IO node (keep in mind, each
>> IO node has 256 workers). For the GPFS test case in the paper, we use a
>> fixed number of 10x1000 hierarchical directories for output.
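Here "10x1000 hierarchical" means a two-level tree: 10 first-level
directories, each holding 1000 subdirectories, with output files hashed
across the 10000 leaves. A minimal sketch of one plausible mapping in
Python (the exact hash used for the paper may differ):

    # Hypothetical sketch: spread output files over a fixed 10x1000
    # two-level directory tree so no single GPFS directory takes all
    # the create traffic. The paper's exact hash may differ.
    import os
    import zlib

    def hier_path(filename, root="/gpfs/out"):
        h = zlib.crc32(filename.encode())
        top = h % 10              # 10 first-level directories
        sub = (h // 10) % 1000    # 1000 subdirectories under each
        return os.path.join(root, "d%d" % top, "d%03d" % sub, filename)

    # Every worker computes the same leaf for a given filename,
    # so no coordination is needed to pick a directory.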
>>
>> Does the above make things clear?
>>
>> best wishes
>> zhangzhao
>>
>> Mihael Hategan wrote:
>>
>>> On Mon, 2008-12-01 at 21:43 -0600, Ian Foster wrote:
>>>
>>>
>>>> Dear All:
>>>>
>>>> b) "Collective I/O": improving performance between intermediate file
>>>> system and GPFS by aggregating many small operations into fewer large
>>>> operations.
>>>>
>>>>
>>>>
>>> This is the part that I'm having trouble understanding.
>>>
>>> The paper mentions distributing data to different directories (in 6.2),
>>> but not whether the experiment was done that way.
>>> Were the measurements taken with applications writing data to the same
>>> directory, or to a different directory for each application/node, or
>>> was the whole thing done through Swift?