[Swift-devel] data staging process/ documents?

Michael Wilde wilde at mcs.anl.gov
Thu Feb 12 10:21:26 CST 2009



On 2/11/09 7:07 PM, Allan Espinosa wrote:
> Hi,
> 
> I am attempting to actualize how collective operations on workflows
> (loosely-coupled) work in general.  My initial idea is that this goes
> in the staging of data before executing a task in a workflow.
> 
> Do we have documents describing these?

I think the email below from Ben is relevant; it refers to a prior
post on swift-devel. I'm not sure if that text has made it into the
user guide yet.

- Mike


> I have a small idea on how it
> works by monitoring my swift job's <workdir/> as a workflow executes.
> 
> My initial ideas are posted in
> http://www.ci.uchicago.edu/wiki/bin/view/VDS/DslCS/CollectiveIO
> 
> -Allan
> _______________________________________________
> Swift-devel mailing list
> Swift-devel at ci.uchicago.edu
> http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel




-------- Original Message --------
Subject: [Swift-devel] notes on how swift implements file input and output
Date: Mon, 1 Dec 2008 22:00:16 +0000 (GMT)
From: Ben Clifford <benc at hawaga.org.uk>
To: swift-devel at ci.uchicago.edu
References: <Pine.LNX.4.64.0812011854470.2448 at dildano.hawaga.org.uk>


Read this in conjunction with the previous note, "Subject: User perspective on
how an app procedure call maps into an application executable call".


This note details the implementation of Swift file input and output in
application blocks; it is intended to be read in conjunction with a
previous note 'How an app procedure call maps into an application call,
from a Swift user perspective, attempting to avoid the mechanics inside
Swift.'


Swift executes application procedures on one or more //sites//.

Each site consists of:

* worker nodes. There is some //execution mechanism// through which the
Swift client side executable can execute its //wrapper script// on those
worker nodes. This is commonly GRAM or Falkon or coasters.

* a site-shared file system. This site-shared file system is accessible
through some //file transfer mechanism// from the Swift client side
executable. This is commonly GridFTP or coasters. The site-shared
file system is also accessible through the POSIX file system on all worker
nodes, mounted at the same location as seen through the file transfer
mechanism. Swift is configured with the location of some //site working
directory// on that site-shared file system.

There is no assumption that the site shared file system for one site is
accessible from another site.

For each workflow run, on each site that is used by that run, a //run
directory// is created in the site working directory, by the Swift client
side.

In that run directory are placed several subdirectories:

* shared/ - site shared files cache

* kickstart/ - when kickstart is used, kickstart record files
for each job that has generated a kickstart

* info/ - wrapper script log files

* status/ - job status files

* jobs/ - //application workspace directories// (optionally placed here -
see below)
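
As a rough illustration of the layout above, the following Python sketch
creates a run directory with the listed subdirectories. The function name
`make_run_dir` and the run identifier are hypothetical; Swift itself creates
these directories remotely from the client side, not through local calls like
these.

```python
import tempfile
from pathlib import Path

# Subdirectory names taken from the list above.
RUN_SUBDIRS = ("shared", "kickstart", "info", "status", "jobs")


def make_run_dir(site_workdir: Path, run_id: str) -> Path:
    """Create <site_workdir>/<run_id>/ with the subdirectories listed above."""
    run_dir = site_workdir / run_id
    for name in RUN_SUBDIRS:
        (run_dir / name).mkdir(parents=True, exist_ok=True)
    return run_dir


if __name__ == "__main__":
    # Use a temporary directory as a stand-in for the site working directory.
    with tempfile.TemporaryDirectory() as tmp:
        run_dir = make_run_dir(Path(tmp), "example-run")
        print(sorted(p.name for p in run_dir.iterdir()))
```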

Application execution looks like this:

For each application procedure call:

The Swift client side selects a site; copies the input files for that
procedure call to the site shared file cache if they are not already in
the cache, using the file transfer mechanism; and then invokes the wrapper
script on that site using the execution mechanism.
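
The stage-in step described above can be sketched as follows. This is only an
illustration under stated assumptions: `stage_in` is a hypothetical name, a
local copy stands in for the real file transfer mechanism, and the cache is
keyed by file name alone, which is a simplification.

```python
import shutil
import tempfile
from pathlib import Path


def stage_in(inputs, shared_dir: Path) -> list:
    """Copy each input into the shared cache unless it is already there.

    Returns the names of the files that were actually transferred.
    """
    transferred = []
    for src in inputs:
        dest = shared_dir / src.name
        if not dest.exists():        # already cached: skip the transfer
            shutil.copy(src, dest)   # stand-in for GridFTP or coasters
            transferred.append(src.name)
    return transferred


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        client = Path(tmp) / "client"
        shared = Path(tmp) / "shared"
        client.mkdir()
        shared.mkdir()
        data = client / "input.dat"
        data.write_text("example")
        print(stage_in([data], shared))  # first call transfers the file
        print(stage_in([data], shared))  # second call finds it in the cache
```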

The wrapper script creates the application workspace directory; places the
input files for that job into the application workspace directory using
either cp or ln -s (depending on a configuration option); executes the
application unix executable; copies output files from the application
workspace directory to the site shared directory using cp; creates a
status file under the status/ directory; and exits, returning control to
the Swift client side. Logs created during the execution of the wrapper
script are stored under the info/ directory.
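
The wrapper steps above can be sketched in Python as below. This is not the
actual wrapper script, only a sketch of the sequence: `run_wrapper`, the
`<job_id>-success` status file name and the `<job_id>-info` log file name are
assumptions made for illustration, as is the choice to write the status file
only when the application exits with code 0.

```python
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path


def run_wrapper(run_dir, job_id, inputs, outputs, command, link_inputs=False):
    """Run one job inside run_dir, following the steps described above."""
    shared = run_dir / "shared"
    workspace = run_dir / "jobs" / job_id
    workspace.mkdir(parents=True)            # application workspace directory
    log = [f"created workspace {workspace.name}"]

    for name in inputs:                      # place inputs via ln -s or cp
        if link_inputs:
            (workspace / name).symlink_to(shared / name)
        else:
            shutil.copy(shared / name, workspace / name)
        log.append(f"staged input {name}")

    result = subprocess.run(command, cwd=workspace)  # run the application
    log.append(f"exit code {result.returncode}")

    if result.returncode == 0:
        for name in outputs:                 # copy outputs to the shared dir
            shutil.copy(workspace / name, shared / name)
        # Hypothetical status file name; signals success to the client side.
        (run_dir / "status" / f"{job_id}-success").touch()

    # Hypothetical log file name under info/.
    (run_dir / "info" / f"{job_id}-info").write_text("\n".join(log) + "\n")
    return result.returncode
```

The sketch expects shared/, status/, info/ and jobs/ to exist already under
`run_dir`, matching the description that the run directory is created by the
Swift client side before any job runs.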

The Swift client side then checks for the presence of a status file
indicating success, deletes it, and copies the output files from the site
shared directory to the appropriate client-side location.
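
A minimal sketch of this completion step, assuming a hypothetical status file
named `<job_id>-success` and using a local copy in place of the file transfer
mechanism; `finish_job` is an illustrative name, not a Swift function.

```python
import shutil
import tempfile
from pathlib import Path


def finish_job(run_dir: Path, job_id: str, outputs, client_dir: Path) -> bool:
    """Check for the success status file, delete it, and stage outputs out.

    Returns False, without copying anything, when no success file is found.
    """
    status = run_dir / "status" / f"{job_id}-success"  # hypothetical name
    if not status.exists():
        return False
    status.unlink()                                    # delete the status file
    client_dir.mkdir(parents=True, exist_ok=True)
    for name in outputs:
        # Stand-in for a transfer from the site shared directory to the client.
        shutil.copy(run_dir / "shared" / name, client_dir / name)
    return True
```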

The job directory (the application workspace directory) is created, in the
default mode, under the jobs/ directory. However, it can be created under an
arbitrary other path, which allows it to be placed on a different file
system, such as a worker node's local file system where one exists.
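
The choice of location can be pictured as below. The parameter `scratch_root`
is a hypothetical stand-in for whatever configuration option selects the
alternative path; the note does not name that option, so nothing here reflects
Swift's real configuration syntax.

```python
from pathlib import Path
from typing import Optional


def workspace_path(run_dir: Path, job_id: str,
                   scratch_root: Optional[Path] = None) -> Path:
    """Return where the job directory would be created.

    Defaults to <run_dir>/jobs/<job_id>; if scratch_root (a hypothetical
    setting) is given, the job directory goes under that path instead.
    """
    base = scratch_root if scratch_root is not None else run_dir / "jobs"
    return base / job_id
```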
