[Swift-devel] several alternatives to design the data management system for Swift on SuperComputers
Mihael Hategan
hategan at mcs.anl.gov
Tue Dec 2 10:24:39 CST 2008
On Mon, 2008-12-01 at 23:24 -0600, Zhao Zhang wrote:
> Hi, Mihael
>
> I think the attached graph could answer your question.
Not really. Is there a test with 8192 pre-created directories?
>
> All the tests were run on 2 racks (8k cores) with 8192 jobs. Each file
> created by the test is 1KB.
>
> 1_DIR_5_FILE: all 8192 cores write 5 files each to 1 directory on
> GPFS. In this test, only 31 jobs returned successfully within 300 seconds.
> 32_DIR_5_FILE: all 8192 cores write 5 files each to the directory unique
> to their IO node on GPFS. 8192 jobs took 91.026 seconds.
> 1000_DIR_5_FILE: all 8192 cores write 5 files each to 1000
> hierarchical directories on GPFS. 8192 jobs took 81.555 seconds.
> 32_DIR_1_FILE: by batching the 5 output files, each core writes one
> tarball to the directory unique to its IO node on GPFS. 8192 jobs took
> 23.616 seconds.
> CIO_5_FILE: with CIO, each core writes 5 files to IFS. 8192 jobs took
> 12.007 seconds.
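
For comparing the runs side by side, the quoted timings can be turned into job rates and relative speedups. This is only a rough sketch: the numbers are copied from the message above, the function names are made up for illustration, and 1_DIR_5_FILE is left out because it did not complete (31 of 8192 jobs in 300 seconds).

```python
# Elapsed times (seconds) copied from the quoted results; 8192 jobs per test.
# 1_DIR_5_FILE is omitted because only 31 jobs finished within 300 seconds.
JOBS = 8192
ELAPSED_S = {
    "32_DIR_5_FILE": 91.026,
    "1000_DIR_5_FILE": 81.555,
    "32_DIR_1_FILE": 23.616,
    "CIO_5_FILE": 12.007,
}


def jobs_per_second(seconds, jobs=JOBS):
    """Average completion rate over the whole run."""
    return jobs / seconds


def speedup(baseline, other):
    """How many times faster `other` finished than `baseline`."""
    return ELAPSED_S[baseline] / ELAPSED_S[other]


if __name__ == "__main__":
    for name, secs in ELAPSED_S.items():
        print(f"{name:16s} {secs:8.3f} s {jobs_per_second(secs):8.1f} jobs/s")
    print("32 dirs vs 1000 dirs:",
          round(speedup("1000_DIR_5_FILE", "32_DIR_5_FILE"), 3))
```

By these numbers the 32-directory run takes roughly 12% longer than the 1000-directory run, while batching into one tarball per core cuts the time by almost a factor of four.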
>
>
> So 32_DIR_5_FILE is not much slower than 1000_DIR_5_FILE. Also, in
> this test each task writes 5 files, whereas in the real CIO case each
> IO node writes one tarball at a time, so the performance of the two
> should be even closer.
>
> So, in CIO we use one unique directory per IO node (keep in mind, each
> IO node has 256 workers).
> For the GPFS test case in the paper, we use a fixed set of 10x1000
> hierarchical directories for output.
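
The two layouts described above could be sketched roughly as follows. Only the 256 workers per IO node and the 10x1000 shape come from the message; the root path, the directory names, the reading of "10x1000" as 10 top-level directories with 1000 subdirectories each, and the use of a hash to pick a directory are all assumptions made for illustration, not a description of what the tests actually did.

```python
import hashlib

WORKERS_PER_IO_NODE = 256  # from the message: each IO node has 256 workers


def io_node_dir(rank, root="/gpfs/out"):
    """One output directory per IO node (root and naming are hypothetical)."""
    return f"{root}/ionode-{rank // WORKERS_PER_IO_NODE:03d}"


def hashed_dir(job_id, root="/gpfs/out", top=10, leaf=1000):
    """Pick one of top x leaf directories by hashing the job id.

    Assumes "10x1000" means 10 top-level directories with 1000
    subdirectories each; the hashing scheme is illustrative only.
    """
    digest = hashlib.md5(str(job_id).encode()).hexdigest()
    h = int(digest, 16)
    return f"{root}/{h % top}/{(h // top) % leaf:03d}"


if __name__ == "__main__":
    ranks = range(8192)
    # 8192 workers / 256 workers per IO node = 32 directories.
    print(len({io_node_dir(r) for r in ranks}), "per-IO-node directories")
    print(len({hashed_dir(r) for r in ranks}), "hashed directories in use")
```

With 8192 workers the first scheme yields 32 directories, which matches the 32_DIR tests; the second spreads the same jobs over a much larger fixed set.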
>
> Does the above answer your question?
>
> best wishes
> zhangzhao
>
> Mihael Hategan wrote:
> > On Mon, 2008-12-01 at 21:43 -0600, Ian Foster wrote:
> >
> >> Dear All:
> >>
> >> b) "Collective I/O": improving performance between intermediate file
> >> system and GPFS by aggregating many small operations into fewer large
> >> operations.
> >>
> >>
> >
> > This is a part that I'm having trouble understanding.
> >
> > The paper mentions distributing data to different directories (in 6.2.),
> > but not whether the experiment was done with that or not.
> > Are the measurements taken with applications writing data to the same
> > directory or a different directory for each application/node or was the
> > whole thing done with Swift?
> >
> >
> >
> >