[Swift-devel] Use case and examples needed to avoid large directories
Michael Wilde
wilde at mcs.anl.gov
Sun Sep 30 14:44:05 CDT 2007
This is a very intriguing idea, but a complex one that I'd pursue
further out in the future.
The approach, however, is exactly what we need to manually code in some
cases. This was true, for example, in the "softmean" level of many of
the fmri examples, where softmean needs to average a large set of
inputs, and its behavior is, we think, associative.
I think we should work on this in three steps:
1) work on the mapper behavior that Ben described and make sure that we
can represent a large dataset, with too many members to perform well in
one directory, as a tree of files. This step would involve making sure
we can maintain a consistent tree representation regardless of whether
the files are being produced or consumed by individual programs or by a
single program. (E.g., in Andrew's workflow, they are produced by a
sequence of individual invocations and then consumed later by a single
invocation.)
2) Make it possible, through some combination of swift and external
scripting, to pass a long list of files to a program through a single
file. Since the target program is often coded to expect its input on the
command line, this requires a wrapper. The question is whether that
wrapper can or should be in swift. Seems simplest if it's not in swift,
and we focus instead on making sure that swift, through mappers and
built-in functions, can handle the various dataset representations needed
for this (a tree of files, and a file containing the list of names of
the files in that tree).
3) At a later date, explore Mihael's idea, if we see that this is a
common enough pattern to automate in this manner.
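
A rough sketch of what steps 1 and 2 could look like outside of swift,
written in Python purely for illustration. Everything named here
(tree_path, write_manifest, run_with_manifest, the fanout of 100, the
".img" extension) is a hypothetical assumption and not an existing swift
mapper or function. It shows one possible tree layout, the file that
lists the names in that tree, and a wrapper that hands those names to a
program expecting them on its command line.

```python
import os
import subprocess
import sys


def tree_path(root, index, fanout=100, ext=".img"):
    """Map member `index` of a large dataset to a path in a two-level tree.

    With fanout=100, members 0..99 land in root/0000/, members 100..199
    in root/0001/, and so on, so no leaf directory holds more than
    `fanout` files. The layout is an assumption made for this sketch.
    """
    subdir = "%04d" % (index // fanout)
    return os.path.join(root, subdir, "%08d%s" % (index, ext))


def write_manifest(paths, manifest):
    """Write one file name per line: the 'file containing the list of
    names of the files in that tree'."""
    with open(manifest, "w") as f:
        for p in paths:
            f.write(p + "\n")


def run_with_manifest(program_args, manifest):
    """Wrapper: read the names from `manifest` and append them to the
    target program's command line, returning the program's exit code.

    Note: the final command line is still subject to the operating
    system's argument-length limit; only the caller of this wrapper is
    spared the long command line.
    """
    with open(manifest) as f:
        names = [line.rstrip("\n") for line in f if line.strip()]
    return subprocess.call(list(program_args) + names)
```

The same tree_path function can be used by whatever produces the files
one at a time and by whatever builds the list for the single consumer,
which is the consistency that step 1 asks for.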
I suspect from Ben's various clarifications that we can do 1 and 2 above
with little or no language change. Ben, Andrew, I'm eager to see how you
solve this and would love to see it in the language docs. Seems like we
could use a chapter called "Swift Cookbook" for handling common
application situations.
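
For reference, the decomposition in Mihael's message quoted below can be
sketched as a small planning function. This is Python for illustration
only; plan_associative, max_args and the tmpN naming are hypothetical,
and nothing here reflects how swift would actually implement an
@associative marker.

```python
def plan_associative(inputs, result, max_args, tmp_prefix="tmp"):
    """Return a list of (input_list, output_name) invocations.

    Each invocation takes at most `max_args` inputs. Intermediate
    outputs are fed into later rounds until a single invocation
    produces `result`. This is only valid if the application really is
    associative in its inputs.
    """
    if max_args < 2:
        raise ValueError("max_args must be at least 2")
    plan = []
    counter = 0
    current = list(inputs)
    while len(current) > max_args:
        next_round = []
        for i in range(0, len(current), max_args):
            chunk = current[i:i + max_args]
            if len(chunk) == 1:
                # Nothing to combine; carry the lone item forward.
                next_round.append(chunk[0])
                continue
            counter += 1
            out = "%s%d" % (tmp_prefix, counter)
            plan.append((chunk, out))
            next_round.append(out)
        current = next_round
    plan.append((current, result))
    return plan
```

With inputs 0...1000 and max_args=501 this yields the same three
invocations as in the quoted example: 0...500 into tmp1, 501...1000 into
tmp2, then tmp1 and tmp2 into out. One caveat for something like
softmean: a plain average of partial averages equals the overall average
only when the chunks are equal in size or the partial results are
weighted, so the chunking would need to account for that.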
- Mike
On 9/29/07 12:22 PM, Mihael Hategan wrote:
> On Sat, 2007-09-29 at 12:14 -0500, Michael Wilde wrote:
>> - cmd lines too big for linux
>>
>
> Maybe we can somehow mark applications that are associative in one of
> its vector parameters.
>
> combine "-type" "x" "-files" @associative(@f) @result(@out);
>
> This would mean that Swift has the liberty of, say for f = 0...1000, to
> break things into:
>
> combine -type x -files 0...500 tmp1
> combine -type x -files 501...1000 tmp2
> combine -type x -files tmp1 tmp2 out
>
> Or something like that.
>
>> - Mike
>>
>> On 9/29/07 3:53 AM, Ben Clifford wrote:
>>> On Fri, 28 Sep 2007, Mihael Hategan wrote:
>>>
>>>> Getting mappers to do this in the first place is another matter, which
>>>> eludes me at the moment.
>>> Likely a custom mapper if you want a whole tree mapped into a structure.
>>> Mapping pieces of any one (sub)directory should be possible, at least in
>>> basic form, with the present mappers.
>>>
>>> Mapping a whole tree would not be hugely different from the simple_mapper
>>> (although it would be some modification). But I'd be interested on working
>>> with Andrew to get something done there that isn't a hack.
>>>
>
>