[Swift-devel] Clustering and Temp Dirs with Swift

Ben Clifford benc at hawaga.org.uk
Sat Oct 27 14:08:14 CDT 2007



On Sat, 27 Oct 2007, Mihael Hategan wrote:

> Quickly before I leave the house:
> Perhaps we could try copying to local FS instead of linking from shared
> dir and hence running the jobs on the local FS.

Maybe. I suspect that wouldn't reduce access to the directory by very 
much.

I think the directories where there are lots of files being read/written 
by lots of hosts are:

the top directory (one job directory per job)
the info directory
the kickstart directory
the file cache

When a directory ends up with more files than directory size constraints 
comfortably allow, it's common to split that directory into many smaller 
directories (e.g. how squid caching or git object storage works). For 
example, given a file fubar.txt, store it in fu/fubar.txt, with 'fu' being 
some short hash of the filename (the hash here being 'extract the first 
two characters').
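
To make the fu/fubar.txt mapping concrete, here is a minimal sketch in 
Python. The function name sharded_path and the prefix_len parameter are 
made up for illustration; this is not existing Swift code.

```python
from pathlib import PurePosixPath


def sharded_path(filename: str, prefix_len: int = 2) -> str:
    """Return the relative path of filename inside a prefix-named subdirectory.

    Hypothetical helper: the 'hash' is simply the first prefix_len
    characters of the filename, as in the fubar.txt example.
    """
    shard = filename[:prefix_len]
    return str(PurePosixPath(shard) / filename)


print(sharded_path("fubar.txt"))  # fu/fubar.txt
```

A plain prefix only spreads files out if the filenames differ in their 
first characters, so whether it helps depends on how the files are named.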

I think Andrew pretty much wanted to do that for his data files anyway, 
which would then be reflected in the layout of the data cache directory 
structure.

For job directories, it may not be too hard to split the big directories 
into smaller ones. There will still be write-lock conflicts, but this 
might mean the contention for each directory's write lock is lower.
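
As a rough sketch of what that could look like for job directories: if 
job names share a common prefix, taking the first two characters of the 
name would put everything in one subdirectory, so this version takes the 
first two hex characters of a SHA-1 of the name instead (similar to the 
git object layout). The job id format, run_dir and the function names are 
all assumptions for illustration, not how Swift names things.

```python
import hashlib
from collections import Counter


def job_shard(job_id: str, width: int = 2) -> str:
    """Return a short hex shard name derived from a SHA-1 of the job id."""
    digest = hashlib.sha1(job_id.encode("utf-8")).hexdigest()
    return digest[:width]


def job_dir(run_dir: str, job_id: str) -> str:
    """Place the job directory under its shard: run_dir/<shard>/<job_id>."""
    return f"{run_dir}/{job_shard(job_id)}/{job_id}"


# Hypothetical job ids that all share the prefix "job-".
job_ids = [f"job-{n:05d}" for n in range(10000)]
counts = Counter(job_shard(j) for j in job_ids)

# Number of shard directories used, and the size of the fullest one.
print(len(counts), max(counts.values()))
print(job_dir("run", job_ids[0]))
```

With two hex characters there are at most 256 shard directories, so each 
one holds only a fraction of the job directories and sees a fraction of 
the create/delete traffic.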
