[Swift-devel] Clustering and Temp Dirs with Swift

Michael Wilde wilde at mcs.anl.gov
Mon Oct 29 08:47:19 CDT 2007


I was suggesting that workflow IDs get their global uniqueness via a 
composite name, not a single globally unique GUID.

As we collect data in a central place, I envision a hierarchy of 
$SWIFT_LOGS/project/user/submithost/workflow-run/intermediate-dirs/objects

(or something similar)

This hierarchy doesn't have to be consistent or constant, as long as 
there is a well-defined notion of a workflow's "run directory" and the 
path to each run dir is unique. Then the log processor will find everything.
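
To make the composite-name idea concrete, here is a minimal sketch in Python. The function name, the root path and the example component values are all made up for illustration; this is not how Swift lays out its logs today. The point is only that no single component has to be globally unique as long as the full path is.

```python
from pathlib import PurePosixPath


def run_dir(root, project, user, submit_host, run_name):
    """Build a run directory path that is unique as a composite of its parts.

    Hypothetical helper: the component order follows the
    project/user/submithost/workflow-run hierarchy suggested above.
    """
    parts = [project, user, submit_host, run_name]
    for part in parts:
        # Reject empty components and embedded separators so that two
        # different tuples can never collapse into the same path.
        if not part or "/" in part:
            raise ValueError(f"invalid path component: {part!r}")
    return PurePosixPath(root).joinpath(*parts)


# Two users can both have a "run-0001"; the paths still differ.
a = run_dir("/data/swift-logs", "projA", "alice", "login1.example.org", "run-0001")
b = run_dir("/data/swift-logs", "projA", "bob", "login1.example.org", "run-0001")
```

A log processor could then treat any directory at that depth as a run directory, without caring how the short run names were chosen.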

As a user, having to constantly work in a space of "dense" unique names 
is hard - it's a source of cognitive dissonance.

If the system would give me a choice of using simpler names, nicely 
balance my files over directories for performance, and accept my log 
data for analysis, that would be great. But most important is that it 
work well and fast.

Given a choice, I'd much rather work using the current "dissonant" names 
than not work. So my comments on naming are a minor issue and we can put 
them aside for now. (I will try harder to stop talking about this ;)

We're currently focusing on solving the performance problems and 
continually enhancing the log processing for analysis (related). We 
should keep doing that, and can review our file-naming issues a few 
months from now, unless naming changes are needed for directory balancing.

- Mike


On 10/29/07 7:35 AM, Ian Foster wrote:
> If they are not globally unique, don't we have problems when we combine logs from multiple sources?
> 
> Sent via BlackBerry from T-Mobile
> 
> -----Original Message-----
> From: Ben Clifford <benc at hawaga.org.uk>
> 
> Date: Mon, 29 Oct 2007 08:47:04 
> To:Michael Wilde <wilde at mcs.anl.gov>
> Cc:swiftdevel <swift-devel at ci.uchicago.edu>
> Subject: Re: [Swift-devel] Clustering and Temp Dirs with Swift
> 
> 
> 
> On Sun, 28 Oct 2007, Michael Wilde wrote:
> 
>> Workflow IDs don't need to be unique outside of a user or group.
> 
> The way I've been thinking things would work with log file names (which to 
> an extent overlaps with workflow IDs) is this:
> 
>   * Swift generates a log file name by default that is very unlikely
>     to collide (i.e. its present format is workflow name + timestamp +
>     random)
> 
>   * The log file name can be overridden with the -log command line option
>     (which was broken but I fixed it in r1357)
> 
>   * To get domain-specific log file naming with your own
>     uniqueness rules (e.g. a sequence number), use -log
>     to specify that.
> 
> I think the present log naming is a good way to name things in the absence 
> of any domain-specific naming strategy; and I think -log is a good way for 
> a domain specific naming strategy to be plugged in.
> 
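
The naming policy in the quoted message above (a default built from workflow name + timestamp + random, replaced entirely when the user supplies a name) could be sketched roughly as follows. This is illustrative Python, not Swift's implementation; the function name, the timestamp layout, the suffix length and the ".log" extension are all assumptions.

```python
import random
import string
import time


def log_file_name(workflow_name, override=None, now=None, rng=None):
    """Return a log file name: the user's override if given, else a default.

    Hypothetical sketch. The default mimics the "workflow name + timestamp
    + random" shape described above; the exact format is invented here.
    """
    if override is not None:
        # Stands in for a name passed via -log: the caller's own
        # uniqueness rules (e.g. a sequence number) apply unchanged.
        return override
    now = time.time() if now is None else now
    rng = rng or random.Random()
    stamp = time.strftime("%Y%m%d-%H%M%S", time.gmtime(now))
    alphabet = string.ascii_lowercase + string.digits
    suffix = "".join(rng.choice(alphabet) for _ in range(8))
    return f"{workflow_name}-{stamp}-{suffix}.log"


default_name = log_file_name("myworkflow")
custom_name = log_file_name("myworkflow", override="run-0042.log")
```

The design choice is that the default never needs coordination between users or hosts, while the override leaves room for whatever domain-specific scheme a group prefers.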


