[Swift-devel] notes on how swift implements file input and output
Ben Clifford
benc at hawaga.org.uk
Mon Dec 1 16:00:16 CST 2008
Read this in conjunction with the previous note, "Subject: User perspective on
how an app procedure call maps into an application executable call".
This note details the implementation of Swift file input and output in
application blocks; it is intended to be read in conjunction with a
previous note 'How an app procedure call maps into an application call,
from a Swift user perspective, attempting to avoid the mechanics inside
Swift.'
Swift executes application procedures on one or more //sites//.
Each site consists of:
* worker nodes. There is some //execution mechanism// through which the
Swift client side executable can execute its //wrapper script// on those
worker nodes. This is commonly GRAM, Falkon or coasters.
* a site-shared file system. This site-shared file system is accessible
through some //file transfer mechanism// from the Swift client side
executable. This is commonly GridFTP or coasters. The site-shared
file system is also accessible through the POSIX file system on all worker
nodes, mounted at the same location as seen through the file transfer
mechanism. Swift is configured with the location of some //site working
directory// on that site-shared file system.
There is no assumption that the site-shared file system for one site is
accessible from another site.
For each workflow run, on each site that is used by that run, a //run
directory// is created in the site working directory, by the Swift client
side.
In that run directory are placed several subdirectories:
* shared/ - site-shared file cache
* kickstart/ - when kickstart is used, kickstart record files
for each job that has generated a kickstart
* info/ - wrapper script log files
* status/ - job status files
* jobs/ - //application workspace directories// (optionally placed here -
see below)
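
The layout above can be sketched in a few lines of Python. This is only an illustration of the directory structure, not the code Swift itself uses; the function name create_run_directory and the run_id argument are made up for this example, and only the five subdirectory names come from the list above.

```python
import os

# Subdirectory names taken from the list above.
RUN_SUBDIRS = ("shared", "kickstart", "info", "status", "jobs")

def create_run_directory(site_working_dir, run_id):
    """Create <site_working_dir>/<run_id>/ and its subdirectories.

    Hypothetical helper: how Swift actually names the run directory
    is not covered in this note.
    """
    run_dir = os.path.join(site_working_dir, run_id)
    for name in RUN_SUBDIRS:
        os.makedirs(os.path.join(run_dir, name), exist_ok=True)
    return run_dir
```

In a real run this happens on the remote site, driven from the Swift client side; the local os.makedirs call just stands in for that.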
Application execution looks like this:
For each application procedure call:
The Swift client side selects a site; copies the input files for that
procedure call to the site-shared file cache if they are not already in
the cache, using the file transfer mechanism; and then invokes the wrapper
script on that site using the execution mechanism.
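
A rough sketch of the "copy only if not already in the cache" step follows. It assumes a local copy as a stand-in for the file transfer mechanism (GridFTP or coasters in practice), and it assumes the cache is keyed by file name alone; the function name stage_in is invented for this example.

```python
import os
import shutil

def stage_in(input_paths, shared_cache_dir):
    """Copy each input into the shared cache unless it is already there.

    Returns the list of cache paths that were actually copied.
    Assumption: files are identified in the cache by base name only.
    """
    copied = []
    for src in input_paths:
        dest = os.path.join(shared_cache_dir, os.path.basename(src))
        if not os.path.exists(dest):
            # Stand-in for the real file transfer mechanism.
            shutil.copy(src, dest)
            copied.append(dest)
    return copied
```

Calling stage_in a second time with the same inputs copies nothing, which is the point of the cache: an input used by many procedure calls on one site is transferred once.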
The wrapper script creates the application workspace directory; places the
input files for that job into the application workspace directory using
either cp or ln -s (depending on a configuration option); executes the
application Unix executable; copies output files from the application
workspace directory to the site-shared directory (shared/) using cp; creates a
status file under the status/ directory; and exits, returning control to
the Swift client side. Logs created during the execution of the wrapper
script are stored under the info/ directory.
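
The wrapper steps above might be sketched as follows. This is not the real wrapper script, only a Python stand-in for the sequence of steps. The function name run_wrapper, the "-success"/"-error" status file names, the "-info" log file name, and the choice to skip stage-out on failure are all assumptions made for the example.

```python
import os
import shutil
import subprocess

def run_wrapper(run_dir, job_id, inputs, outputs, command, use_symlinks=False):
    """Illustrative stand-in for the wrapper script; returns the exit code."""
    shared = os.path.join(run_dir, "shared")
    workspace = os.path.join(run_dir, "jobs", job_id)
    os.makedirs(workspace)

    # Hypothetical log file name under info/.
    with open(os.path.join(run_dir, "info", job_id + "-info"), "w") as log:
        # Place inputs in the workspace: the equivalent of cp or ln -s.
        for name in inputs:
            src = os.path.abspath(os.path.join(shared, name))
            dest = os.path.join(workspace, name)
            if use_symlinks:
                os.symlink(src, dest)
            else:
                shutil.copy(src, dest)
            log.write("staged in %s\n" % name)

        # Run the application with the workspace as its working directory.
        returncode = subprocess.run(command, cwd=workspace).returncode
        log.write("application exited with code %d\n" % returncode)

        # Copy outputs back to shared/ (assumed: only on success).
        if returncode == 0:
            for name in outputs:
                shutil.copy(os.path.join(workspace, name),
                            os.path.join(shared, name))
                log.write("staged out %s\n" % name)

    # Hypothetical status file naming convention.
    status_name = job_id + ("-success" if returncode == 0 else "-error")
    with open(os.path.join(run_dir, "status", status_name), "w"):
        pass
    return returncode
```

The use_symlinks flag mirrors the cp versus ln -s configuration option: linking avoids a copy when the workspace is on the same file system as shared/, while copying is needed if the application must not touch the cached input.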
The Swift client side then checks for the presence of a status file
indicating success and deletes it, and then copies the output files from the
site-shared directory (shared/) to the appropriate client side location.
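
The client side check might look roughly like this. As before, local file operations stand in for the file transfer mechanism, and the "-success" status file name and the function name collect_results are assumptions for the example, not the names Swift uses.

```python
import os
import shutil

def collect_results(run_dir, job_id, outputs, client_dir):
    """Check for and delete the success status file, then fetch outputs.

    Returns False (and copies nothing) if no success file is present.
    """
    # Hypothetical status file naming convention.
    success = os.path.join(run_dir, "status", job_id + "-success")
    if not os.path.exists(success):
        return False
    os.remove(success)
    for name in outputs:
        # Stand-in for the real file transfer mechanism.
        shutil.copy(os.path.join(run_dir, "shared", name),
                    os.path.join(client_dir, name))
    return True
```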
The application workspace directory (the job directory) is created, by
default, under the jobs/ directory. However, it can be created under an
arbitrary other path, which allows it to be placed on a different file
system (such as a worker node local file system, where the worker node has
one).
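
The choice of workspace location can be pictured as a single path decision. The sketch below is hypothetical: the function name workspace_path and the alternate_root parameter are invented here, and how the alternative path is actually configured in Swift is not covered in this note.

```python
import os

def workspace_path(run_dir, job_id, alternate_root=None):
    """Return where the application workspace directory would be created.

    Default: <run_dir>/jobs/<job_id>. If alternate_root is given (for
    example a worker node local scratch directory), the workspace goes
    under that path instead.
    """
    if alternate_root is None:
        root = os.path.join(run_dir, "jobs")
    else:
        root = alternate_root
    return os.path.join(root, job_id)
```

Since the alternative path may be on a file system that only the worker node can see, inputs have to be placed into the workspace and outputs copied back to the shared directory by the wrapper script running on that node.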