[Swift-devel] Interesting observation when running Swift

Mihael Hategan hategan at mcs.anl.gov
Tue Apr 10 12:23:06 CDT 2007


On Tue, 2007-04-10 at 12:12 -0500, Veronika V. Nefedova wrote:
> I think that something like that would be useful:
> 
> outputsStage1[]=stage1()
> outputsStage2[]=stage2(outputsStage1[])
> 
> if you didn't have to specify the number or specific filenames for the 
> outputs. Basically it would be good for the Workflow engine to understand 
> this: "get all the produced files from stage 1 and use them as an input for 
> Stage 2"

Some applications are known to produce extra temporary files. Mappers
are supposed to be able to extract arrays from a cluttered file system
(assuming that there are no ambiguities in naming patterns). So yes, it
would be useful, and I think it is one of the planned features. I see no
obvious problems besides possible naming conflicts, which would apply
to a local file system anyway, so that is not a new problem.
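
To illustrate the idea of pulling an array of outputs out of a cluttered
directory by naming pattern, here is a minimal Python sketch. It is not
Swift's mapper implementation; map_outputs, the "stage1.*.out" pattern
and the file names are all hypothetical, chosen only to show the
selection step.

```python
import fnmatch
import os
import tempfile


def map_outputs(directory, pattern):
    """Return the sorted paths in `directory` whose names match `pattern`.

    Hypothetical helper: it stands in for a mapper that picks one stage's
    outputs out of a directory that also holds unrelated files.
    """
    names = sorted(os.listdir(directory))
    return [
        os.path.join(directory, name)
        for name in names
        if fnmatch.fnmatchcase(name, pattern)
    ]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        # Simulate a cluttered work directory: real outputs plus stray files.
        for name in ["stage1.0.out", "stage1.1.out", "stage1.tmp", "core.1234"]:
            open(os.path.join(workdir, name), "w").close()

        outputs = map_outputs(workdir, "stage1.*.out")
        # Prints ['stage1.0.out', 'stage1.1.out']
        print([os.path.basename(p) for p in outputs])
```

The temporary file and the core dump are ignored because they do not
match the pattern. A file from another application that happened to
match would be picked up as well, which is the naming conflict mentioned
above.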

> 
> (;
> 
> Nika
> 
> 
> At 11:57 AM 4/10/2007, Tiberiu Stef-Praun wrote:
> >Interesting.
> >
> >Does anyone else think that monitoring the filesystem could be a useful idea?
> >
> >For instance it could help with file-driven dependencies, in scenarios
> >where we want to have continuous workflows, or compose independent
> >workflows. The filesystem would act as the publish-subscribe mechanism
> >for some workflow cases.
> >
> >Tibi
> >
> >On 4/10/07, Mihael Hategan <hategan at mcs.anl.gov> wrote:
> >>Swift doesn't monitor the file system.
> >>Data driven doesn't mean that it does magic in the background. It means
> >>that you have to express data dependencies in the code.
> >>
> >>On Tue, 2007-04-10 at 11:47 -0500, Tiberiu Stef-Praun wrote:
> >> > I have a workflow along these lines:
> >> >
> >> > // this one generates outputsStage1[]
> >> > stage1()
> >> > // this one merges the stage1 outputs
> >> > stage2(outputsStage1[])
> >> >
> >> > note that it is not outputsStage1=stage1()
> >> >
> >> > Since the outputsStage1 files were not generated yet, I expected that
> >> > Karajan would wait for them to be created before running stage2, but
> >> > that was not the case: stage2 was executed when the workflow started,
> >> > failed, and caused the whole workflow to fail.
> >> >
> >> > I know how to fix the workflow, that is not the issue. The issue is
> >> > that I expected the workflow to be data-driven, but it seems to be
> >> > code driven. Explanation: it attempted to execute a section even if
> >> > its input files were not available.
> >> >
> >> > Correct me if I am wrong.
> >> > Tibi
> >> >
> >>
> >
> >
> >--
> >Tiberiu (Tibi) Stef-Praun, PhD
> >Research Staff, Computation Institute
> >5640 S. Ellis Ave, #405
> >University of Chicago
> >http://www-unix.mcs.anl.gov/~tiberius/
> 
> 
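
The point quoted above, that data dependencies have to be expressed in
the code, can be sketched in Python with futures. This is only an
analogy under assumed names (stage1, stage2 and run_workflow are
hypothetical), not how Swift or Karajan schedule work: stage2 waits for
stage1 because it is handed stage1's result, not because anything
watches the file system.

```python
from concurrent.futures import ThreadPoolExecutor


def stage1():
    # Hypothetical stage: produces a list of output names.
    return ["part0", "part1", "part2"]


def stage2(inputs):
    # Hypothetical stage: merges the stage1 outputs into one value.
    return "+".join(inputs)


def run_workflow():
    with ThreadPoolExecutor(max_workers=2) as pool:
        outputs_stage1 = pool.submit(stage1)
        # The dependency is stated in code: stage2 consumes the future
        # returned for stage1 and blocks until that result exists.
        merged = pool.submit(lambda: stage2(outputs_stage1.result()))
        return merged.result()


if __name__ == "__main__":
    print(run_workflow())
```

If stage2 were submitted without a reference to outputs_stage1, nothing
would order the two calls, which mirrors the failure described in the
original message.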



