[Swift-devel] notes on how swift implements file input and output
Ian Foster
foster at anl.gov
Thu Dec 4 20:39:21 CST 2008
Ben:
I don't know if these questions can easily be answered via email;
maybe we need to talk on the phone (are you in Australia next week?),
or they might be suitable for a next rev of these notes.
a) Am I correct in assuming that Swift currently will not run on a
site that does not support a shared file system?
b) Can we build on this document to introduce means by which we could
make use of methods such as bulk transfer of many input files,
collective I/O as on BG/P, etc.?
c) What are the pros and cons of copying all input and output files
twice, once to the site and once to the node? Is this ever a source
of overhead?
Ian.
On Dec 1, 2008, at 4:00 PM, Ben Clifford wrote:
>
> read this in conjunction with previous note, "Subject: User perspective
> on how an app procedure call maps into an application executable call"
>
>
> This note details the implementation of Swift file input and output in
> application blocks; it is intended to be read in conjunction with a
> previous note, 'How an app procedure call maps into an application call,
> from a Swift user perspective, attempting to avoid the mechanics inside
> Swift.'
>
>
> Swift executes application procedures on one or more //sites//.
>
> Each site consists of:
>
> * worker nodes. There is some //execution mechanism// through which the
> Swift client side executable can execute its //wrapper script// on those
> worker nodes. This is commonly GRAM or Falkon or coasters.
>
> * a site-shared file system. This site-shared file system is accessible
> through some //file transfer mechanism// from the Swift client side
> executable. This is commonly GridFTP or coasters. This site-shared
> file system is also accessible through the POSIX file system on all
> worker nodes, mounted at the same location as seen through the file
> transfer mechanism. Swift is configured with the location of some
> //site working directory// on that site-shared file system.
>
> There is no assumption that the site-shared file system for one site is
> accessible from another site.
>
> For each workflow run, on each site that is used by that run, a //run
> directory// is created in the site working directory, by the Swift client
> side.
>
> In that run directory are placed several subdirectories:
>
> * shared/ - site shared files cache
>
> * kickstart/ - when kickstart is used, kickstart record files
> for each job that has generated a kickstart record
>
> * info/ - wrapper script log files
>
> * status/ - job status files
>
> * jobs/ - //application workspace directories// (optionally placed
> here - see below)
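
A rough sketch of that layout in Python, for illustration only. The five subdirectory names come from the note; the function name make_run_dir and the run_id argument are hypothetical and are not taken from Swift's own code.

```python
from pathlib import Path

# Subdirectory names as listed in the note; everything else is illustrative.
RUN_SUBDIRS = ("shared", "kickstart", "info", "status", "jobs")


def make_run_dir(site_working_dir: str, run_id: str) -> Path:
    """Create a run directory and its subdirectories under the site working directory."""
    run_dir = Path(site_working_dir) / run_id
    for name in RUN_SUBDIRS:
        (run_dir / name).mkdir(parents=True, exist_ok=True)
    return run_dir
```

Calling make_run_dir("/some/site/workdir", "run-0001") would create run-0001/ with the five subdirectories beneath it.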
>
> Application execution looks like this:
>
> For each application procedure call:
>
> The Swift client side selects a site; copies the input files for that
> procedure call to the site shared file cache if they are not already in
> the cache, using the file transfer mechanism; and then invokes the wrapper
> script on that site using the execution mechanism.
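
The "copy only if not already in the cache" step might look roughly like the sketch below. It makes two loud assumptions: a local shutil copy stands in for the real file transfer mechanism (GridFTP or coasters), and the cache is keyed by file name alone. The name stage_in is hypothetical.

```python
import shutil
from pathlib import Path


def stage_in(inputs, shared_dir):
    """Copy each input into the shared cache unless it is already present.

    Returns the names of the files that were actually transferred.
    """
    shared = Path(shared_dir)
    shared.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in map(Path, inputs):
        dest = shared / src.name
        if not dest.exists():  # already cached: skip the transfer
            shutil.copy2(src, dest)  # local stand-in for the file transfer mechanism
            copied.append(src.name)
    return copied
```

A second call with the same inputs returns an empty list, since everything is already in the cache.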
>
> The wrapper script creates the application workspace directory; places
> the input files for that job into the application workspace directory
> using either cp or ln -s (depending on a configuration option); executes
> the application Unix executable; copies output files from the application
> workspace directory to the site shared directory using cp; creates a
> status file under the status/ directory; and exits, returning control to
> the Swift client side. Logs created during the execution of the wrapper
> script are stored under the info/ directory.
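
The wrapper steps can be sketched in Python as follows. This is not the real wrapper script. The status and log file names (<job_id>-success, <job_id>-error, <job_id>-info) are invented for illustration, and copying outputs only after a zero exit code is an assumption the note does not state.

```python
import os
import shutil
import subprocess
from pathlib import Path


def run_wrapper(run_dir, job_id, argv, inputs, outputs, link_inputs=False):
    """Run one job in its own workspace; return True if the application exited 0."""
    run = Path(run_dir).resolve()
    shared = run / "shared"
    workspace = run / "jobs" / job_id
    for d in (workspace, run / "status", run / "info"):
        d.mkdir(parents=True, exist_ok=True)
    log = [f"workspace: {workspace}"]

    # Place inputs in the workspace: symlink (ln -s) or copy (cp).
    for name in inputs:
        if link_inputs:
            os.symlink(shared / name, workspace / name)
        else:
            shutil.copy2(shared / name, workspace / name)
        log.append(f"staged in: {name}")

    # Execute the application with the workspace as its working directory.
    result = subprocess.run(argv, cwd=workspace)
    ok = result.returncode == 0
    log.append(f"exit code: {result.returncode}")

    # Copy outputs back to the site-shared directory (assumed: on success only).
    if ok:
        for name in outputs:
            shutil.copy2(workspace / name, shared / name)
            log.append(f"staged out: {name}")

    # Record the outcome and the log; both file names are hypothetical.
    suffix = "success" if ok else "error"
    (run / "status" / f"{job_id}-{suffix}").touch()
    (run / "info" / f"{job_id}-info").write_text("\n".join(log) + "\n")
    return ok
```

The link_inputs flag mirrors the cp versus ln -s configuration option: symlinks avoid the second copy, but require the workspace to sit where the shared file system is visible.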
>
> The Swift client side then checks for the presence of, and deletes, a
> status file indicating success, and then copies the output files from the
> site shared directory to the appropriate client side location.
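
A sketch of that client-side check, again with local file operations standing in for the file transfer mechanism. The status file name <job_id>-success and the function name collect_outputs are hypothetical.

```python
import shutil
from pathlib import Path


def collect_outputs(run_dir, job_id, outputs, client_dir):
    """If the job's success marker exists, delete it and stage the outputs out.

    Returns False, copying nothing, when no success marker is present.
    """
    run = Path(run_dir)
    success = run / "status" / f"{job_id}-success"  # hypothetical name
    if not success.exists():
        return False
    success.unlink()  # the marker is consumed once it has been seen
    dest = Path(client_dir)
    dest.mkdir(parents=True, exist_ok=True)
    for name in outputs:
        shutil.copy2(run / "shared" / name, dest / name)
    return True
```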
>
> The job directory is created (in the default mode) under the jobs/
> directory. However, it can be created under an arbitrary other path, which
> allows it to be created on a different file system (such as a worker node
> local file system, in the case that the worker node has one).
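
The placement choice amounts to something like this sketch, where alt_root is a hypothetical stand-in for whatever configuration setting supplies the other path:

```python
from pathlib import Path
from typing import Optional


def workspace_path(run_dir: str, job_id: str, alt_root: Optional[str] = None) -> Path:
    """Return the job's workspace: under jobs/ by default, else under alt_root."""
    base = Path(alt_root) if alt_root else Path(run_dir) / "jobs"
    return base / job_id
```

With a node-local alt_root, copying inputs (cp) rather than linking them is what actually moves the data onto the worker node.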
>