[Swift-devel] Interesting observation when running Swift

Mihael Hategan hategan at mcs.anl.gov
Tue Apr 10 13:24:41 CDT 2007


On Tue, 2007-04-10 at 18:13 +0000, Ian Foster wrote:
> Ok. I hope that can all come with a concise statement of how to do what is wanted here, as I can't follow these emails.

These things seem to have a tendency to become very fuzzy as you get
into the details, which is probably why there's no actual implementation
yet.
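
For concreteness, here is a rough sketch of the stage-out idea described in the quoted message below. It is Python rather than SwiftScript, it is not an existing Swift implementation, and every name in it (PatternMapper, select, stage_out, the file names) is hypothetical. The assumption is that the engine gets the full list of files an application generated, hands it to the mapper for the return value, and the mapper picks what is relevant.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatch


@dataclass
class PatternMapper:
    """Hypothetical mapper: claims the generated files matching a glob pattern."""
    pattern: str
    mapped: list = field(default_factory=list)

    def select(self, generated):
        # The mapper, not the engine, decides which files are relevant.
        return sorted(name for name in generated if fnmatch(name, self.pattern))


def stage_out(mapper, generated, transfer):
    """Ask the mapper which generated files matter, transfer them, record them."""
    relevant = mapper.select(generated)
    for name in relevant:
        transfer(name)        # stand-in for the real stage-out step
    mapper.mapped = relevant  # stand-in for populating the Swift data structures
    return relevant


# Files an application left behind; only the part-*.out files are real outputs.
generated = ["part-2.out", "part-1.out", "stderr.txt", "core.1234"]
staged = []
mapper = PatternMapper("part-*.out")
result = stage_out(mapper, generated, staged.append)
```

Here the number and names of the outputs are never declared up front; they are discovered after the application runs, which is the fuzzy part in practice (who produces the list of generated files, and when).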

> 
> Sounds like a good example for the doc.
> 
> 
> -----Original Message-----
> From: Mihael Hategan <hategan at mcs.anl.gov>
> Date: Tue, 10 Apr 2007 12:56:55 
> To:itf at mcs.anl.gov
> Cc:Veronika Nefedova <nefedova at mcs.anl.gov>,  swift-devel-bounces at ci.uchicago.edu,  Tibi Stef-Praun <tiberius at ci.uchicago.edu>, swift-devel at ci.uchicago.edu
> Subject: Re: [Swift-devel] Interesting observation when running Swift
> 
> On Tue, 2007-04-10 at 17:46 +0000, Ian Foster wrote:
> > I thought we could already do that. We can, I think, in the case of the first stage: we can take all files in a directory as a dataset, say? But we can't do that for later stages?
> 
> Not much to do with when the stages happen. It's more about knowing what
> to stage out.
> 
> Right now mappers can tell what files are relevant in a given collection
> (directory) that they manage, and the details are left to the
> implementations of the mappers. What's needed is to give each mapper
> that handles return values from an atomic proc the list of files
> generated by the application, let it select the relevant files, then do
> the stage-out and populate the Swift data structures based on that.
> 
> > 
> > 
> > -----Original Message-----
> > From: "Veronika  V. Nefedova" <nefedova at mcs.anl.gov>
> > Date: Tue, 10 Apr 2007 12:12:12 
> > To:"Tiberiu Stef-Praun" <tiberius at ci.uchicago.edu>,  "Mihael Hategan" <hategan at mcs.anl.gov>
> > Cc:swift-devel at ci.uchicago.edu
> > Subject: Re: [Swift-devel] Interesting observation when running Swift
> > 
> > I think that something like that would be useful:
> > 
> > outputsStage1[]=stage1()
> > outputsStage2[]=stage2(outputsStage1[])
> > 
> > if you didn't have to specify the number or specific filenames for the 
> > outputs. Basically it would be good for the Workflow engine to understand 
> > this: "get all the produced files from stage 1 and use them as an input for 
> > Stage 2"
> > 
> > (;
> > 
> > Nika
> > 
> > 
> > At 11:57 AM 4/10/2007, Tiberiu Stef-Praun wrote:
> > >Interesting.
> > >
> > >Does anyone else think that monitoring the filesystem could be a useful idea?
> > >
> > >For instance it could help with file-driven dependencies, in scenarios
> > >where we want to have continuous workflows, or compose independent
> > >workflows. The filesystem would act as the publish-subscribe mechanism
> > >for some workflow cases.
> > >
> > >Tibi
> > >
> > >On 4/10/07, Mihael Hategan <hategan at mcs.anl.gov> wrote:
> > >>Swift doesn't monitor the file system.
> > >>Data driven doesn't mean that it does magic in the background. It means
> > >>that you have to express data dependencies in the code.
> > >>
> > >>On Tue, 2007-04-10 at 11:47 -0500, Tiberiu Stef-Praun wrote:
> > >> > I have a workflow along these lines:
> > >> >
> > >> > // this one generates outputsStage1[]
> > >> > stage1()
> > >> > // this one merges the stage1 outputs
> > >> > stage2(outputsStage1[])
> > >> >
> > >> > note that it is not outputsStage1=stage1()
> > >> >
> > >> > Since the outputsStage1 files were not generated yet, I expected that
> > >> > Karajan would wait for them to be created before running stage2, but that
> > >> > was not the case: stage2 was executed when the workflow started (and
> > >> > it failed) and caused the workflow to fail.
> > >> >
> > >> > I know how to fix the workflow, that is not the issue. The issue is
> > >> > that I expected the workflow to be data-driven, but it seems to be
> > >> > code driven. Explanation: it attempted to execute a section even if
> > >> > its input files were not available.
> > >> >
> > >> > Correct me if I am wrong.
> > >> > Tibi
> > >> >
> > >>
> > >
> > >
> > >--
> > >Tiberiu (Tibi) Stef-Praun, PhD
> > >Research Staff, Computation Institute
> > >5640 S. Ellis Ave, #405
> > >University of Chicago
> > >http://www-unix.mcs.anl.gov/~tiberius/
> > >_______________________________________________
> > >Swift-devel mailing list
> > >Swift-devel at ci.uchicago.edu
> > >http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel
> > 
> > 
> 
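
The point made earlier in the thread, that data-driven means the dependency has to be expressed in the code rather than inferred from the file system, can be illustrated with a small sketch. This is Python using futures, not SwiftScript or Karajan, and stage1, stage2, and the file names are stand-ins chosen for illustration. The assumption is that stage2 should run only once stage1 has produced its outputs.

```python
from concurrent.futures import ThreadPoolExecutor


def stage1():
    # Stand-in for an application that produces several output files.
    return [f"stage1-{i}.out" for i in range(3)]


def stage2(inputs):
    # Stand-in for the merge step; it only needs the names of its inputs.
    return "merged(" + ",".join(inputs) + ")"


with ThreadPoolExecutor(max_workers=2) as pool:
    # A future: the outputs are promised, not yet available.
    outputs_stage1 = pool.submit(stage1)
    # stage2 consumes the future's result, so it cannot proceed until
    # stage1 has finished. The dependency is stated in the code itself.
    merged = pool.submit(lambda: stage2(outputs_stage1.result()))
    final = merged.result()
```

Calling stage1() and stage2(...) as two unrelated statements, as in the original workflow, gives the engine nothing to wait on; assigning the result of stage1 and passing it to stage2 is what creates the ordering.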



