[Swift-devel] notes on how swift implements file input and output

Mon Dec 1 16:00:16 CST 2008

This note details the implementation of Swift file input and output in 
application blocks. Read it in conjunction with the previous note, 
"Subject: User perspective on how an app procedure call maps into an 
application executable call", which describes that mapping from a Swift 
user perspective and attempts to avoid the mechanics inside Swift.

Swift executes application procedures on one or more //sites//.

Each site consists of:

* worker nodes. There is some //execution mechanism// through which the 
Swift client side executable can execute its //wrapper script// on those 
worker nodes. This is commonly GRAM, Falkon, or coasters.

* a site-shared file system. This file system is accessible from the 
Swift client side executable through some //file transfer mechanism//, 
commonly GridFTP or coasters. It is also accessible as a POSIX file 
system on all worker nodes, mounted at the same location as seen through 
the file transfer mechanism. Swift is configured with the location of 
some //site working directory// on that site-shared file system.

There is no assumption that the site shared file system for one site is 
accessible from another site.

For each workflow run, on each site that is used by that run, a //run 
directory// is created in the site working directory, by the Swift client 
side.

In that run directory are placed several subdirectories:

* shared/ - site shared files cache

* kickstart/ - when kickstart is used, kickstart record files 
for each job that has generated one

* info/ - wrapper script log files

* status/ - job status files

* jobs/ - //application workspace directories// (optionally placed here - 
see below)
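
As a rough illustration of the layout above, this Python sketch creates a 
run directory with the listed subdirectories. The function name 
create_run_directory and the run identifier "example-run" are hypothetical; 
the note does not say how Swift names run directories, and Swift itself 
creates them from the client side rather than with local calls like these.

```python
import os
import tempfile

# Subdirectory names taken from the list above.
RUN_SUBDIRS = ("shared", "kickstart", "info", "status", "jobs")


def create_run_directory(site_working_dir, run_id):
    """Create <site_working_dir>/<run_id>/ with its subdirectories.

    run_id is a hypothetical identifier chosen for this sketch.
    """
    run_dir = os.path.join(site_working_dir, run_id)
    for name in RUN_SUBDIRS:
        os.makedirs(os.path.join(run_dir, name), exist_ok=True)
    return run_dir


if __name__ == "__main__":
    # A temporary directory stands in for the site working directory.
    with tempfile.TemporaryDirectory() as site_working_dir:
        run_dir = create_run_directory(site_working_dir, "example-run")
        print(sorted(os.listdir(run_dir)))
```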

Application execution looks like this:

For each application procedure call:

The Swift client side selects a site; copies the input files for that 
procedure call to the site shared file cache if they are not already in 
the cache, using the file transfer mechanism; and then invokes the wrapper 
script on that site using the execution mechanism.
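
The staging part of that step can be sketched as follows. This is only an 
illustration under stated assumptions: shutil.copy stands in for the file 
transfer mechanism (GridFTP or coasters in practice), the function name 
stage_in is hypothetical, and keying the cache on the file's base name is 
a simplification made for this example, not something the note specifies.

```python
import os
import shutil
import tempfile


def stage_in(input_paths, shared_dir):
    """Copy inputs into the shared cache unless they are already there.

    Returns the list of cache paths that were actually copied.
    """
    copied = []
    for src in input_paths:
        # Assumption: cache entries are keyed by base name.
        dest = os.path.join(shared_dir, os.path.basename(src))
        if not os.path.exists(dest):
            shutil.copy(src, dest)  # stand-in for the file transfer mechanism
            copied.append(dest)
    return copied


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as client, \
            tempfile.TemporaryDirectory() as shared:
        path = os.path.join(client, "input.txt")
        with open(path, "w") as f:
            f.write("data")
        print(len(stage_in([path], shared)))  # first call copies the file
        print(len(stage_in([path], shared)))  # second call finds it cached
```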

The wrapper script creates the application workspace directory; places the 
input files for that job into the application workspace directory using 
either cp or ln -s (depending on a configuration option); executes the 
application unix executable; copies output files from the application 
workspace directory to the site shared directory using cp; creates a 
status file under the status/ directory; and exits, returning control to
the Swift client side. Logs created during the execution of the wrapper 
script are stored under the info/ directory.
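
The wrapper steps above can be sketched in Python. This is not the actual 
wrapper script, only a minimal model of the sequence described. The 
function name run_wrapper, the "-info", "-success" and "-error" file names, 
and the failure handling are all assumptions made for this example; the 
note only describes the success path.

```python
import os
import shutil
import subprocess
import sys
import tempfile


def run_wrapper(run_dir, job_id, inputs, outputs, command, use_symlinks=False):
    """Model the wrapper steps for one job; return the status file path."""
    shared = os.path.join(run_dir, "shared")
    workspace = os.path.join(run_dir, "jobs", job_id)
    os.makedirs(workspace)

    # Wrapper log under info/ (the file name is a hypothetical choice).
    with open(os.path.join(run_dir, "info", job_id + "-info"), "w") as log:
        # Place inputs in the workspace using a copy or a symlink.
        for name in inputs:
            src = os.path.join(shared, name)
            dest = os.path.join(workspace, name)
            if use_symlinks:
                os.symlink(src, dest)
            else:
                shutil.copy(src, dest)
            log.write("staged in " + name + "\n")

        # Run the application executable inside the workspace.
        result = subprocess.run(command, cwd=workspace)
        log.write("exit code " + str(result.returncode) + "\n")

        # Assumption: outputs are copied back only when the job succeeded.
        if result.returncode == 0:
            for name in outputs:
                shutil.copy(os.path.join(workspace, name),
                            os.path.join(shared, name))
                log.write("staged out " + name + "\n")

    # Record the outcome as an empty file under status/.
    suffix = "-success" if result.returncode == 0 else "-error"
    status_path = os.path.join(run_dir, "status", job_id + suffix)
    open(status_path, "w").close()
    return status_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as run_dir:
        for name in ("shared", "info", "status", "jobs"):
            os.makedirs(os.path.join(run_dir, name))
        with open(os.path.join(run_dir, "shared", "in.txt"), "w") as f:
            f.write("hello")
        # Stand-in application: upper-case in.txt into out.txt.
        app = [sys.executable, "-c",
               "open('out.txt', 'w').write(open('in.txt').read().upper())"]
        status = run_wrapper(run_dir, "job1", ["in.txt"], ["out.txt"], app)
        print(os.path.basename(status))
```

Passing use_symlinks=True models the ln -s configuration; the default 
models cp.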

The Swift client side then checks for the presence of a status file 
indicating success, and deletes it; it then copies the output files from 
the site shared directory to the appropriate client side location.
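
A minimal sketch of this client-side step, under the same kind of 
assumptions as before: the function name collect_outputs and the 
"-success" status file name are hypothetical, and shutil.copy stands in 
for the file transfer mechanism.

```python
import os
import shutil
import tempfile


def collect_outputs(run_dir, job_id, outputs, client_dir):
    """Check for and delete the success status file, then fetch outputs.

    Returns True if the job was marked successful, False otherwise.
    """
    # Hypothetical naming: one empty "<job_id>-success" file per job.
    status_path = os.path.join(run_dir, "status", job_id + "-success")
    if not os.path.exists(status_path):
        return False
    os.remove(status_path)
    for name in outputs:
        shutil.copy(os.path.join(run_dir, "shared", name),
                    os.path.join(client_dir, name))
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as run_dir, \
            tempfile.TemporaryDirectory() as client_dir:
        for name in ("shared", "status"):
            os.makedirs(os.path.join(run_dir, name))
        # Simulate what a finished wrapper would have left behind.
        with open(os.path.join(run_dir, "shared", "out.txt"), "w") as f:
            f.write("result")
        open(os.path.join(run_dir, "status", "job1-success"), "w").close()
        print(collect_outputs(run_dir, "job1", ["out.txt"], client_dir))
```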

The application workspace directory (the job directory) is created, by 
default, under the jobs/ directory. However, it can be created under an 
arbitrary other path, which allows it to be placed on a different file 
system, such as a worker node local file system where one exists.
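
The choice of location can be pictured as a simple fallback. In this 
sketch the function name job_directory and the alternate_root parameter 
are hypothetical; the note does not name the actual configuration option.

```python
import os


def job_directory(run_dir, job_id, alternate_root=None):
    """Return the workspace path for a job.

    alternate_root is a hypothetical setting: when given, the workspace is
    placed under it (for example on a worker node local file system)
    instead of under <run_dir>/jobs/.
    """
    if alternate_root is not None:
        root = alternate_root
    else:
        root = os.path.join(run_dir, "jobs")
    return os.path.join(root, job_id)


if __name__ == "__main__":
    print(job_directory("run-dir", "job1"))
    print(job_directory("run-dir", "job1", alternate_root="local-scratch"))
```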
