[Swift-devel] data staging process/ documents?

Michael Wilde wilde at mcs.anl.gov
Thu Feb 12 10:25:15 CST 2009



On 2/12/09 10:21 AM, Michael Wilde wrote:
> 
> 
> On 2/11/09 7:07 PM, Allan Espinosa wrote:
>> Hi,
>>
>> I am attempting to actualize how collective operations on workflows
>> (loosely-coupled) work in general.  My initial idea is that this goes
>> in the staging of data before executing a task in a workflow.
>>
>> Do we have documents describing these?
> 
> I think the email below from Ben is relevant, and refers to a prior
> post on swift-devel. I'm not sure if that text has made it to the
> userguide yet.

Prior post was: 
http://mail.ci.uchicago.edu/pipermail/swift-devel/2008-December/004070.html

I think that info has been added to the userguide.

- Mike

> 
> - Mike
> 
> 
>> I have a small idea on how it
>> works by monitoring my swift job's <workdir/> as a workflow executes.
>>
>> My initial ideas are posted in
>> http://www.ci.uchicago.edu/wiki/bin/view/VDS/DslCS/CollectiveIO
>>
>> -Allan
>> _______________________________________________
>> Swift-devel mailing list
>> Swift-devel at ci.uchicago.edu
>> http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel
> 
> 
> 
> 
> -------- Original Message --------
> Subject: [Swift-devel] notes on how swift implements file input and output
> Date: Mon, 1 Dec 2008 22:00:16 +0000 (GMT)
> From: Ben Clifford <benc at hawaga.org.uk>
> To: swift-devel at ci.uchicago.edu
> References: <Pine.LNX.4.64.0812011854470.2448 at dildano.hawaga.org.uk>
> 
> 
> Read this in conjunction with the previous note, "Subject: User perspective
> on how an app procedure call maps into an application executable call"
> 
> 
> This note details the implementation of Swift file input and output in
> application blocks; it is intended to be read in conjunction with a
> previous note 'How an app procedure call maps into an application call,
> from a Swift user perspective, attempting to avoid the mechanics inside
> Swift.'
> 
> 
> Swift executes application procedures on one or more //sites//.
> 
> Each site consists of:
> 
> * worker nodes. There is some //execution mechanism// through which the
> Swift client side executable can execute its //wrapper script// on those
> worker nodes. This is commonly GRAM or Falkon or coasters.
> 
> * a site-shared file system. This site shared filesystem is accessible
> through some //file transfer mechanism// from the Swift client side
> executable. This is commonly GridFTP or coasters. This site shared
> filesystem is also accessible through the posix file system on all worker
> nodes, mounted at the same location as seen through the file transfer
> mechanism. Swift is configured with the location of some //site working
> directory// on that site-shared file system.
> 
> There is no assumption that the site shared file system for one site is
> accessible from another site.
> 
> For each workflow run, on each site that is used by that run, a //run
> directory// is created in the site working directory, by the Swift client
> side.
> 
> In that run directory are placed several subdirectories:
> 
> * shared/ - site shared files cache
> 
> * kickstart/ - when kickstart is used, kickstart record files
> for each job that has generated one
> 
> * info/ - wrapper script log files
> 
> * status/ - job status files
> 
> * jobs/ - //application workspace directories// (optionally placed here -
> see below)
> 
> Application execution looks like this:
> 
> For each application procedure call:
> 
> The Swift client side selects a site; copies the input files for that
> procedure call to the site shared file cache if they are not already in
> the cache, using the file transfer mechanism; and then invokes the wrapper
> script on that site using the execution mechanism.
> 
> The wrapper script creates the application workspace directory; places the
> input files for that job into the application workspace directory using
> either cp or ln -s (depending on a configuration option); executes the
> application unix executable; copies output files from the application
> workspace directory to the site shared directory using cp; creates a
> status file under the status/ directory; and exits, returning control to
> the Swift client side. Logs created during the execution of the wrapper
> script are stored under the info/ directory.
> 
> The Swift client side then checks for the presence of a status file
> indicating success and deletes it; and then copies the output files from
> the site shared directory to the appropriate client side location.
> 
> The job directory is created (in the default mode) under the jobs/
> directory. However, it can be created under an arbitrary other path, which
> allows it to be created on a different file system (such as a worker node
> local file system in the case that the worker node has a local file
> system).
> 
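
The wrapper-script steps described in the note above can be sketched in
Python roughly as follows. This is only an illustration of the sequence
(create workspace, stage in with cp or ln -s, run the executable, copy
outputs back to shared/, write a status file, log under info/); it is not
Swift's actual wrapper script. The function name run_wrapper, its
parameters, and the file names "<job>-info" and "<job>-success" are
assumptions made for the example, as is the choice to write a status file
only on success.

```python
import os
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path


def run_wrapper(run_dir, job_id, inputs, outputs, command,
                use_symlinks=False, jobs_root=None):
    """Hypothetical stand-in for the wrapper script run on a worker node.

    run_dir   -- the per-run directory on the site-shared file system
    jobs_root -- optional alternative parent for the workspace directory
                 (e.g. a worker-node-local file system); defaults to jobs/
    """
    run_dir = Path(run_dir)
    shared = run_dir / "shared"
    workspace = Path(jobs_root or run_dir / "jobs") / job_id
    for sub in ("shared", "info", "status"):
        (run_dir / sub).mkdir(parents=True, exist_ok=True)
    workspace.mkdir(parents=True)

    # Wrapper log goes under info/ (file name is an assumption).
    with open(run_dir / "info" / f"{job_id}-info", "w") as log:
        # Stage in: copy or symlink each input from the shared cache.
        for name in inputs:
            dest = workspace / name
            dest.parent.mkdir(parents=True, exist_ok=True)
            if use_symlinks:
                os.symlink((shared / name).resolve(), dest)
            else:
                shutil.copy(shared / name, dest)
            log.write(f"staged in {name}\n")

        # Execute the application inside its workspace directory.
        result = subprocess.run(command, cwd=workspace)
        log.write(f"exit code {result.returncode}\n")
        if result.returncode != 0:
            return False  # no success status file is written

        # Stage out: copy outputs back to the site shared directory.
        for name in outputs:
            (shared / name).parent.mkdir(parents=True, exist_ok=True)
            shutil.copy(workspace / name, shared / name)
            log.write(f"staged out {name}\n")

        # Status file that the client side would later check and delete.
        (run_dir / "status" / f"{job_id}-success").touch()
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        run_dir = Path(tmp) / "run-0001"
        (run_dir / "shared").mkdir(parents=True)
        (run_dir / "shared" / "in.txt").write_text("hello\n")
        # Toy "application": upper-case in.txt into out.txt.
        app = [sys.executable, "-c",
               "open('out.txt','w').write(open('in.txt').read().upper())"]
        ok = run_wrapper(run_dir, "job-1", ["in.txt"], ["out.txt"], app)
        print(ok, sorted(p.name for p in run_dir.iterdir()))
```

Passing jobs_root corresponds to the last paragraph of the note: the
workspace can live outside jobs/, while inputs and outputs still pass
through shared/.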


