[Swift-devel] Clustering and Temp Dirs with Swift
Mihael Hategan
hategan at mcs.anl.gov
Sat Oct 27 14:17:18 CDT 2007
On Sat, 2007-10-27 at 19:08 +0000, Ben Clifford wrote:
>
> On Sat, 27 Oct 2007, Mihael Hategan wrote:
>
> > Quickly before I leave the house:
Hmm. How naive.
> > Perhaps we could try copying to local FS instead of linking from shared
> > dir and hence running the jobs on the local FS.
>
> Maybe. I'd be suspicious that doesn't reduce access to the directory too
> much.
>
> I think the directories where there are lots of files being read/written
> by lots of hosts are:
>
> the top directory (one job directory per job)
> the info directory
> the kickstart directory
> the file cache
>
> In the case where directories get too many files in them because of
> directory size constraints, it's common to split that directory into many
> smaller directories (e.g. how squid caching, or git object storage works).
> e.g., given a file fubar.txt, store it in fu/fubar.txt, with 'fu' being some
> short hash of the filename (with the hash here being 'extract the first
> two characters').
>
> Pretty much I think Andrew wanted to do that for his data files anyway,
> which would then reflect in the layout of the data cache directory
> structure.
>
> For job directories, it may not be too hard to split the big directories
> into smaller ones. There will still be write-lock conflicts, but this
> might mean the contention for each directory's write-lock is lower.
Right. Some of these are easy to avoid and some are harder.
The hash idea is brilliant. I think.
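
A minimal sketch of the directory-splitting scheme quoted above, in Python
for illustration only. The function names (prefix_bucket, digest_bucket,
sharded_path) and the "cache" root are hypothetical and not part of Swift.
The first bucket function is the 'first two characters' hash from the
example; the second is an assumed alternative that uses a SHA-1 digest, which
may spread files better when many names share a common prefix.

```python
import hashlib
from pathlib import PurePosixPath


def prefix_bucket(filename: str, width: int = 2) -> str:
    """The 'hash' from the quoted example: the first `width` characters."""
    return filename[:width]


def digest_bucket(filename: str, width: int = 2) -> str:
    """Assumed alternative: first `width` hex digits of the name's SHA-1."""
    return hashlib.sha1(filename.encode("utf-8")).hexdigest()[:width]


def sharded_path(root: str, filename: str, bucket=prefix_bucket) -> str:
    """Place `filename` in a subdirectory of `root` chosen by `bucket`."""
    return str(PurePosixPath(root) / bucket(filename) / filename)


print(sharded_path("cache", "fubar.txt"))  # cache/fu/fubar.txt
print(sharded_path("cache", "fubar.txt", bucket=digest_bucket))
```

With two hex digits the digest variant gives at most 256 subdirectories, so
each one holds roughly 1/256 of the files if the names hash evenly.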
>