[Swift-devel] Clustering and Temp Dirs with Swift
Michael Wilde
wilde at mcs.anl.gov
Sun Oct 28 17:46:19 CDT 2007
On 10/28/07 5:27 PM, Mihael Hategan wrote:
> On Sun, 2007-10-28 at 17:15 -0500, Michael Wilde wrote:
>
>> If these were numeric patterns,
>
> the string would be longer.
Right. But for now, if we engineered for 999,999 jobs, that would be a
simple 6-digit number, and it will be a while before we exceed that.
It also depends on our strategy for uniqueness. So far I don't see a
need to make objects (jobs and files) unique across workflows, just
within a workflow.
If I run a workflow twice in one dir, I'd like the system to either put
my data in a new dataNNN dir that's unique to one run of my workflow, or
to start numbering auto-named files for one workflow where it left off
in the previous workflow. (I guess this gets into mktemp-like issues,
.nextID files, etc.; see the sketch below.) In my work, by the way, my
log-saving script also moves all my output data to a unique per-run
directory: run01, run02, etc.
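For concreteness, a .nextID counter could be as simple as the sketch
below. This is just an illustration of the idea, not anything that
exists in Swift today; the file name and location are made up, and
concurrent runs would additionally need file locking:

    // Sketch only: ".nextID" and its location are hypothetical.
    // Reads the last-used counter from <dir>/.nextID, increments it,
    // and writes it back, so a second run in the same dir continues
    // numbering where the previous run left off.
    import java.io.IOException;
    import java.nio.file.*;

    public class NextId {
        public static synchronized int next(Path dir) throws IOException {
            Path f = dir.resolve(".nextID");
            int n = Files.exists(f)
                    ? Integer.parseInt(Files.readString(f).trim())
                    : 0;
            n++;
            Files.writeString(f, Integer.toString(n));
            return n;
        }

        public static void main(String[] args) throws IOException {
            // e.g. data001, data002, ... across successive runs
            System.out.printf("data%03d%n", next(Paths.get(".")));
        }
    }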
Workflow IDs don't need to be unique outside of a user or group.
I'm happy to call my runs angle001, angle002, cnari001, etc.
Having said all that, I don't have strong feelings on it at this point,
except to note that small, easy numbers make things easier on *most*
users, for a long time, until their needs outgrow smaller local ID
spaces. I'd rather revisit UUID strategies down the road when we hit
that as a scalability problem, and keep simple things simpler for now.
This will be much nicer for examples, tutorials, etc., in addition to
most normal usage.
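To make the split concrete, the mapping I have in mind is just string
arithmetic on the zero-padded job number. A rough sketch (illustrative
only, not anything in Swift today) that reproduces the 100-files-per-dir
layout quoted below:

    // Sketch: pad the job number to 8 digits and use the leading 6
    // digits as the subdirectory name, so each dir holds at most 100
    // entries. Other bucket sizes are just a different prefix length
    // (equivalently, a division/mod() on the job number).
    public class JobPaths {
        static String kickstartPath(String tx, int jobNum) {
            String id = String.format("%08d", jobNum);   // "00002076"
            String dir = id.substring(0, 6);             // "000020"
            return dir + "/" + tx + "-" + id + "-kickstart.xml";
        }

        public static void main(String[] args) {
            // 000000/angle4-00000001-kickstart.xml
            System.out.println(kickstartPath("angle4", 1));
            // 000020/angle4-00002076-kickstart.xml
            System.out.println(kickstartPath("angle4", 2076));
        }
    }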
- Mike
>
>> it would be easy to, e.g., put 100 files per
>> dir by taking, say, the leftmost 6 characters and making that a dirname
>> within which the rightmost 2 chars would vary:
>
> With alpha-numeric ones, it's fairly easy to put 37 files per dir.
>
> Anyway. It doesn't matter. Either way. The problem isn't what exact
> numbering base we're using, but how exactly we put them in
> subdirectories.
>
>> tlivaj/angle4-tlivajim-kickstart.xml
>> tlivaj/angle4-tlivajin-kickstart.xml
>> tlivaj/angle4-tlivajio-kickstart.xml
>> tlivaj/angle4-tlivajip-kickstart.xml
>> tlivaj/angle4-tlivajiq-kickstart.xml
>>
>> but easier on my eyes would be:
>> 000000/angle4-00000001-kickstart.xml
>
> Well, log10(37^9) =~ 14, so you need about 14 digits to cover the same
> range of values:
>
> 00000000000000/angle4-00000000000001-kickstart.xml
>
>
>> 000000/angle4-00000002-kickstart.xml
>> ...
>> 000000/angle4-00000099-kickstart.xml
>> ...
>> 000020/angle4-00002076-kickstart.xml
>> etc.
>>
>> This makes splitting based on powers of 10 (or 26 or 36) trivial. Other
>> splits can be done with mod() functions.
>>
>> Can we start heading in this or some similar direction?
>>
>> We need to coordinate a plan for this, I suspect, to make Andrew's
>> workflows perform acceptably.
>>
>> - Mike
>>
>>
>>
>> On 10/27/07 2:08 PM, Ben Clifford wrote:
>>> On Sat, 27 Oct 2007, Mihael Hategan wrote:
>>>
>>>> Quickly before I leave the house:
>>>> Perhaps we could try copying to local FS instead of linking from shared
>>>> dir and hence running the jobs on the local FS.
>>> Maybe. I'd be suspicious that it doesn't reduce access to the directory
>>> very much.
>>>
>>> I think the directories where there are lots of files being read/written
>>> by lots of hosts are:
>>>
>>> the top directory (one job directory per job)
>>> the info directory
>>> the kickstart directory
>>> the file cache
>>>
>>> In the case where directories get too many files in them because of
>>> directory size constraints, it's common to split that directory into many
>>> smaller directories (e.g. how squid caching or git object storage works):
>>> given a file fubar.txt, store it in fu/fubar.txt, with 'fu' being some
>>> short hash of the filename (the hash here being 'extract the first
>>> two characters').
>>>
>>> Pretty much I think Andrew wanted to do that for his data files anyway,
>>> which would then be reflected in the layout of the data cache directory
>>> structure.
>>>
>>> For job directories, it may not be too hard to split the big directories
>>> into smaller ones. There will still be write-lock conflicts, but this
>>> might mean the contention for each directory's write-lock is lower.
>>>
>
>
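For reference, the two-character-prefix split described above would
look roughly like this; the 'hash' is literally the first two
characters of the filename, as in the fu/fubar.txt example, and a real
implementation might substitute an actual hash function:

    // Sketch of the squid/git-style layout: bucket each file under a
    // short prefix of its own name so no single directory grows huge.
    import java.nio.file.*;

    public class PrefixLayout {
        static Path place(Path base, String fileName) {
            String prefix = fileName.length() >= 2
                    ? fileName.substring(0, 2)
                    : fileName;
            return base.resolve(prefix).resolve(fileName);
        }

        public static void main(String[] args) {
            // prints cache/fu/fubar.txt
            System.out.println(place(Paths.get("cache"), "fubar.txt"));
        }
    }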