[Swift-devel] Clustering and Temp Dirs with Swift
Michael Wilde
wilde at mcs.anl.gov
Mon Oct 29 08:47:19 CDT 2007
I was suggesting that workflow IDs get their global uniqueness via a
composite name, not a single globally unique GUID.
As we collect data in a central place, I envision a hierarchy of
$SWIFT_LOGS/project/user/submithost/workflow-run/intermediate-dirs/objects
(or something similar)
This hierarchy doesn't have to be consistent or constant, as long as
there is a well-defined notion of a workflow's "run directory" and the
path to each run dir is unique. Then the log processor will find everything.
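As a rough sketch of the composite-name idea (the function name, argument
names, and example values are illustrative assumptions, not anything that
exists in Swift), the run directory could be built from its components, so
that uniqueness comes from the whole path rather than from any single ID:

```python
from pathlib import PurePosixPath


def run_dir(logs_root, project, user, submit_host, workflow_run):
    """Compose a run directory path from the hierarchy components.

    Hypothetical helper: the component order follows the hierarchy
    sketched above ($SWIFT_LOGS/project/user/submithost/workflow-run).
    """
    parts = [project, user, submit_host, workflow_run]
    for part in parts:
        # Reject empty or slash-containing components so that each one
        # maps to exactly one directory level.
        if not part or "/" in part:
            raise ValueError(f"invalid path component: {part!r}")
    return PurePosixPath(logs_root, *parts)


# Two users can reuse the same short run name without colliding,
# because the user name is part of the path.
a = run_dir("/swift/logs", "projA", "wilde", "host1", "run-0001")
b = run_dir("/swift/logs", "projA", "benc", "host1", "run-0001")
```

Here "run-0001" is not unique on its own, but the two paths differ, which is
all the log processor would need.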
As a user, having to constantly work in a space of "dense" unique names
is hard - it's a source of cognitive dissonance.
If the system would give me a choice of using simpler names, nicely
balance my files over directories for performance, and accept my log
data for analysis, that would be great. But most important is that it
work well and fast.
Given a choice, I'd much rather work using the current "dissonant" names
than not work. So my comments on naming are a minor issue and we can put
them aside for now. (I will try harder to stop talking about this ;)
We're currently focusing on solving the performance problems and
continually enhancing the log processing for analysis (related). We
should keep doing that, and can review our file-naming issues a few
months from now, unless naming changes are needed for directory balancing.
- Mike
On 10/29/07 7:35 AM, Ian Foster wrote:
> If they are not globally unique, don't we have problems when we combine logs from multiple sources?
>
> -----Original Message-----
> From: Ben Clifford <benc at hawaga.org.uk>
>
> Date: Mon, 29 Oct 2007 08:47:04
> To:Michael Wilde <wilde at mcs.anl.gov>
> Cc:swiftdevel <swift-devel at ci.uchicago.edu>
> Subject: Re: [Swift-devel] Clustering and Temp Dirs with Swift
>
>
>
> On Sun, 28 Oct 2007, Michael Wilde wrote:
>
>> Workflow IDs don't need to be unique outside of a user or group.
>
> The way I've been thinking things would work with log file names (which to
> an extent overlaps with workflow IDs) is this:
>
> * Swift generates a log file name by default that is very unique
> (i.e. its present format is workflow name + timestamp + random)
>
> * The log file name can be overridden with the -log command line option
> (which was broken but I fixed it in r1357)
>
> * To get domain-specific log file naming with your own
>   uniqueness rules (e.g. a sequence number), use -log
> to specify that.
>
> I think the present log naming is a good way to name things in the absence
> of any domain-specific naming strategy; and I think -log is a good way for
> a domain specific naming strategy to be plugged in.
>
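To illustrate the naming scheme Ben describes above, here is a minimal Python
sketch. It is not Swift's actual code: the function name, the separator, the
timestamp layout, the ".log" suffix, and the length of the random part are all
assumptions. The thread only states that the default is workflow name +
timestamp + random, and that -log overrides it.

```python
import secrets
from datetime import datetime, timezone


def log_file_name(workflow_name, override=None, now=None):
    """Return a log file name.

    Hypothetical helper. If `override` is given (standing in for the
    -log command line option), it is used as-is. Otherwise the name is
    workflow name + timestamp + random suffix.
    """
    if override is not None:
        return override
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y%m%d-%H%M%S")  # assumed timestamp layout
    suffix = secrets.token_hex(4)          # 8 random hex characters
    return f"{workflow_name}-{stamp}-{suffix}.log"


# Default: a dense but collision-resistant name.
default_name = log_file_name(
    "wf", now=datetime(2007, 10, 29, 8, 47, 4, tzinfo=timezone.utc)
)

# Domain-specific naming, e.g. a sequence number chosen by the user.
custom_name = log_file_name("wf", override="run-0042.log")
```

With an override, uniqueness becomes the caller's responsibility, which matches
the split proposed in the thread.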