[Swift-devel] several alternatives for designing the data management system for Swift on supercomputers

Mihael Hategan hategan at mcs.anl.gov
Tue Dec 2 10:39:32 CST 2008


On Tue, 2008-12-02 at 10:26 -0600, Zhao Zhang wrote:
> 
> Mihael Hategan wrote:
> > On Mon, 2008-12-01 at 23:24 -0600, Zhao Zhang wrote:
> >   
> >> Hi, Mihael
> >>
> >> I think the attached graph could answer your question.
> >>     
> >
> > Not really. Is there a test with 8192 pre-created directories?
> >   
> Nope. Why do you think there are 8192 pre-created directories for the
> 2-rack test?

I don't. It's something you would do.

> The case is not one unique dir for each worker, but one dir per
> IO node for the CIO test.
> 
> zhao
> >   
> >> All the tests were run on 2 racks (8K cores) with 8192 jobs. Each file 
> >> created by the test is 1 KB.
> >>
> >> 1_DIR_5_FILE: all 8192 cores write 5 files each to 1 directory on 
> >> GPFS. In this test, only 31 jobs returned successfully within 300 seconds.
> >> 32_DIR_5_FILE: all 8192 cores write 5 files each to the directory 
> >> unique to their IO node on GPFS. The 8192 jobs took 91.026 seconds.
> >> 1000_DIR_5_FILE: all 8192 cores write 5 files each to 1000 
> >> hierarchical directories on GPFS. The 8192 jobs took 81.555 seconds.
> >> 32_DIR_1_FILE: by batching the 5 output files (see the sketch below), 
> >> each core writes one tarball to the directory unique to its IO node on 
> >> GPFS. The 8192 jobs took 23.616 seconds.
> >> CIO_5_FILE: with CIO, each core writes 5 files to the intermediate 
> >> file system (IFS). The 8192 jobs took 12.007 seconds.
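> >>
> >> Roughly, the batching in 32_DIR_1_FILE amounts to something like the
> >> sketch below (a minimal illustration only, not the actual harness code;
> >> the /gpfs/output path, the IO_NODE_ID variable, and the file names are
> >> made-up placeholders):
> >>
> >>     import os
> >>     import socket
> >>     import tarfile
> >>
> >>     # One pre-created output directory per IO node on GPFS
> >>     # (hypothetical layout; IO_NODE_ID is an assumed env var).
> >>     io_node = os.environ.get("IO_NODE_ID", "0")
> >>     out_dir = os.path.join("/gpfs/output", io_node)
> >>     os.makedirs(out_dir, exist_ok=True)
> >>
> >>     # Pack the 5 small per-task outputs into a single tarball, so
> >>     # GPFS sees one file creation per core instead of five.
> >>     task_id = "%s.%d" % (socket.gethostname(), os.getpid())
> >>     tar_path = os.path.join(out_dir, task_id + ".tar")
> >>     with tarfile.open(tar_path, "w") as tar:
> >>         for i in range(5):
> >>             tar.add("out.%d" % i)  # the task's 5 output files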
> >>
> >>
> >> From this we can tell that 32_DIR_5_FILE does not slow down performance 
> >> much compared with 1000_DIR_5_FILE. Also, in this test each task writes 
> >> 5 files, while in the real CIO case each IO node writes one tarball at 
> >> a time, so the performance of the two should be even closer.
> >>
> >> So, in CIO we use a unique directory per IO node (keep in mind that each 
> >> IO node serves 256 workers). For the GPFS test case in the paper, we use 
> >> a fixed set of 10x1000 hierarchical directories for output.
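> >>
> >> For the hierarchical case, mapping an output file into the 10x1000
> >> directory tree could look roughly like this (the hashing scheme below
> >> is only a guess at one reasonable mapping, not necessarily what the
> >> paper implements):
> >>
> >>     import hashlib
> >>     import os
> >>
> >>     def output_dir(filename, root="/gpfs/output"):
> >>         # Spread files over a fixed 10x1000 two-level hierarchy so
> >>         # that no single GPFS directory absorbs all of the
> >>         # file-creation traffic.
> >>         h = int(hashlib.md5(filename.encode()).hexdigest(), 16)
> >>         top = h % 10            # first level: 10 directories
> >>         sub = (h // 10) % 1000  # second level: 1000 directories each
> >>         return os.path.join(root, str(top), str(sub))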
> >>
> >> Does the above answer your question?
> >>
> >> best wishes
> >> zhangzhao
> >>
> >> Mihael Hategan wrote:
> >>     
> >>> On Mon, 2008-12-01 at 21:43 -0600, Ian Foster wrote:
> >>>   
> >>>       
> >>>> Dear All:
> >>>>
> >>>> b) "Collective I/O": improving performance between the intermediate
> >>>> file system and GPFS by aggregating many small operations into fewer
> >>>> large operations.
> >>>>
> >>> This is a part that I'm having trouble understanding.
> >>>
> >>> The paper mentions distributing data to different directories (in 6.2),
> >>> but not whether the experiment was done that way.
> >>> Were the measurements taken with applications writing data to the same
> >>> directory, or to a different directory for each application/node, or was
> >>> the whole thing done with Swift?
> >>>




More information about the Swift-devel mailing list