[Swift-devel] Interesting observation when running Swift
Ian Foster
itf at mcs.anl.gov
Tue Apr 10 12:46:08 CDT 2007
I thought we could already do that. We can, I think, in the case of the first stage--we can take all files in a directory as a dataset, say? But we can't do that for later stages?
-----Original Message-----
From: "Veronika V. Nefedova" <nefedova at mcs.anl.gov>
Date: Tue, 10 Apr 2007 12:12:12
To: "Tiberiu Stef-Praun" <tiberius at ci.uchicago.edu>, "Mihael Hategan" <hategan at mcs.anl.gov>
Cc: swift-devel at ci.uchicago.edu
Subject: Re: [Swift-devel] Interesting observation when running Swift
I think that something like that would be useful:
outputsStage1[]=stage1()
outputsStage2[]=stage2(outputsStage1[])
if you didn't have to specify the number or specific filenames for the
outputs. Basically it would be good for the Workflow engine to understand
this: "get all the produced files from stage 1 and use them as an input for
Stage 2"
(;
Nika
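
To make the explicit form above concrete, here is a minimal sketch in Python rather than SwiftScript. It only illustrates the idea of handing stage 1's whole output collection to stage 2 without naming the files or their count in advance. The function names, the file names, and the use of a thread pool are all assumptions made for illustration; this is not how Swift or Karajan is implemented.

```python
from concurrent.futures import ThreadPoolExecutor


def stage1(n):
    """Hypothetical stage 1: produces n outputs; the caller never names them."""
    return [f"part-{i}.out" for i in range(n)]


def stage2(inputs):
    """Hypothetical stage 2: merges whatever stage 1 produced."""
    return "merged(" + ",".join(inputs) + ")"


def run_workflow(n):
    with ThreadPoolExecutor() as pool:
        # outputsStage1[] = stage1()
        outputs_stage1 = pool.submit(stage1, n)
        # outputsStage2[] = stage2(outputsStage1[])
        # .result() blocks until stage 1 has finished, so stage 2 cannot
        # start before its inputs exist. The dependency is stated in the
        # code, not discovered by watching the file system.
        outputs_stage2 = pool.submit(stage2, outputs_stage1.result())
        return outputs_stage2.result()


if __name__ == "__main__":
    print(run_workflow(3))
```

The point of the sketch is that stage 2 receives the result of stage 1 as a value, so the ordering follows from the data dependency, and stage 2 never has to know how many files stage 1 wrote.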
At 11:57 AM 4/10/2007, Tiberiu Stef-Praun wrote:
>Interesting.
>
>Does anyone else think that monitoring the filesystem could be a useful idea?
>
>For instance it could help with file-driven dependencies, in scenarios
>where we want to have continuous workflows, or compose independent
>workflows. The filesystem would act as the publish-subscribe mechanism
>for some workflow cases.
>
>Tibi
>
>On 4/10/07, Mihael Hategan <hategan at mcs.anl.gov> wrote:
>>Swift doesn't monitor the file system.
>>Data-driven doesn't mean that it does magic in the background. It means
>>that you have to express data dependencies in the code.
>>
>>On Tue, 2007-04-10 at 11:47 -0500, Tiberiu Stef-Praun wrote:
>> > I have a workflow along these lines:
>> >
>> > // this one generates outputsStage1[]
>> > stage1()
>> > // this one merges the stage1 outputs
>> > stage2(outputsStage1[])
>> >
>> > note that it is not outputsStage1=stage1()
>> >
>> > Since the outputsStage1 files had not been generated yet, I expected
>> > Karajan to wait for them to be created before running stage2, but that
>> > was not the case: stage2 was executed when the workflow started, failed,
>> > and caused the whole workflow to fail.
>> >
>> > I know how to fix the workflow, that is not the issue. The issue is
>> > that I expected the workflow to be data-driven, but it seems to be
>> > code-driven. Explanation: it attempted to execute a section even though
>> > its input files were not available.
>> >
>> > Correct me if I am wrong.
>> > Tibi
>> >
>>
>
>
>--
>Tiberiu (Tibi) Stef-Praun, PhD
>Research Staff, Computation Institute
>5640 S. Ellis Ave, #405
>University of Chicago
>http://www-unix.mcs.anl.gov/~tiberius/
>_______________________________________________
>Swift-devel mailing list
>Swift-devel at ci.uchicago.edu
>http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel