[Swift-devel] notes on how swift implements file input and output

Ian Foster foster at anl.gov
Thu Dec 4 20:39:21 CST 2008


Ben:

I don't know if these questions can easily be answered via email;
maybe we need to talk on the phone (are you in Australia next week?),
or they might be suitable for the next rev of these notes.

a) Am I correct in assuming that Swift currently will not run on a  
site that does not support a shared file system?

b) Can we build on this document to introduce means by which we could  
make use of methods such as bulk transfer of many input files,  
collective I/O as on BG/P, etc.?

c) What are the pros and cons of copying all input and output files
twice, once to the site and once to the node? Is this ever a source
of overhead?

Ian.


On Dec 1, 2008, at 4:00 PM, Ben Clifford wrote:

>
> Read this in conjunction with the previous note, "Subject: User
> perspective on how an app procedure call maps into an application
> executable call".
>
>
> This note details the implementation of Swift file input and output in
> application blocks; it is intended to be read in conjunction with a
> previous note, 'How an app procedure call maps into an application
> call, from a Swift user perspective, attempting to avoid the mechanics
> inside Swift.'
>
>
> Swift executes application procedures on one or more //sites//.
>
> Each site consists of:
>
> * worker nodes. There is some //execution mechanism// through which the
> Swift client side executable can execute its //wrapper script// on those
> worker nodes. This is commonly GRAM, Falkon, or coasters.
>
> * a site-shared file system. This site-shared file system is accessible
> through some //file transfer mechanism// from the Swift client side
> executable. This is commonly GridFTP or coasters. The site-shared
> file system is also accessible through the POSIX file system on all
> worker nodes, mounted at the same location as seen through the file
> transfer mechanism. Swift is configured with the location of some
> //site working directory// on that site-shared file system.
>
> There is no assumption that the site-shared file system for one site is
> accessible from another site.
>
> For each workflow run, on each site that is used by that run, the Swift
> client side creates a //run directory// in the site working directory.
>
> In that run directory are placed several subdirectories:
>
> * shared/ - site-shared file cache
>
> * kickstart/ - when kickstart is used, kickstart record files for each
> job that has generated a kickstart record
>
> * info/ - wrapper script log files
>
> * status/ - job status files
>
> * jobs/ - //application workspace directories// (optionally placed
> here; see below)
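
The run directory layout listed above can be sketched in a few lines of
Python. This is only an illustration, not Swift's actual code: the
subdirectory names come from the list, while the function name and the
run identifier format are hypothetical.

```python
import tempfile
from pathlib import Path

# Subdirectory names are taken from the list above; everything else
# (function name, run_id format) is a hypothetical illustration.
RUN_SUBDIRS = ("shared", "kickstart", "info", "status", "jobs")


def create_run_directory(site_working_dir, run_id):
    """Create <site_working_dir>/<run_id>/ and its subdirectories."""
    run_dir = Path(site_working_dir) / run_id
    for name in RUN_SUBDIRS:
        (run_dir / name).mkdir(parents=True, exist_ok=True)
    return run_dir


if __name__ == "__main__":
    # A temporary directory stands in for the site working directory.
    with tempfile.TemporaryDirectory() as site_working_dir:
        run_dir = create_run_directory(site_working_dir, "run-0001")
        print(sorted(p.name for p in run_dir.iterdir()))
```

In a real deployment the directories would be created on the remote
site-shared file system rather than locally.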
>
> Application execution looks like this:
>
> For each application procedure call:
>
> The Swift client side selects a site; copies the input files for that
> procedure call to the site-shared file cache, using the file transfer
> mechanism, if they are not already in the cache; and then invokes the
> wrapper script on that site using the execution mechanism.
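
The "copy only if not already cached" step described above might look
roughly like the following sketch. It assumes the cache is keyed by
file name, and it uses a local copy as a stand-in for the file transfer
mechanism (GridFTP or coasters); the function and parameter names are
hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def stage_in(input_files, shared_dir, transfer=shutil.copy2):
    """Copy inputs into the site shared cache, skipping cached files.

    `transfer` stands in for the file transfer mechanism; a local copy
    is used here only so that the sketch runs.
    Returns the names of the files that were actually transferred.
    """
    shared_dir = Path(shared_dir)
    transferred = []
    for src in map(Path, input_files):
        dest = shared_dir / src.name  # assumption: cache keyed by name
        if dest.exists():
            continue  # already in the cache, no transfer needed
        transfer(src, dest)
        transferred.append(src.name)
    return transferred


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        client = Path(tmp) / "client"
        shared = Path(tmp) / "shared"
        client.mkdir()
        shared.mkdir()
        (client / "a.txt").write_text("data\n")
        print(stage_in([client / "a.txt"], shared))  # first call copies
        print(stage_in([client / "a.txt"], shared))  # second call skips
```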
>
> The wrapper script creates the application workspace directory; places
> the input files for that job into the application workspace directory
> using either cp or ln -s (depending on a configuration option); executes
> the application Unix executable; copies output files from the application
> workspace directory to the site-shared directory using cp; creates a
> status file under the status/ directory; and exits, returning control to
> the Swift client side. Logs created during the execution of the wrapper
> script are stored under the info/ directory.
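
The wrapper script steps above can be sketched as follows. The real
wrapper is a script run on the worker node; this Python version only
mirrors the sequence of steps. The log and status file names, the
function signature, and the choice to write a success file only on exit
code 0 are all assumptions made for illustration.

```python
import os
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path


def run_wrapper(run_dir, job_id, command, inputs, outputs,
                use_symlinks=False):
    """Mirror the wrapper steps: workspace, stage in, run, stage out."""
    run_dir = Path(run_dir).resolve()
    shared = run_dir / "shared"
    workspace = run_dir / "jobs" / job_id
    workspace.mkdir(parents=True)

    # Place inputs in the workspace: the "ln -s" or "cp" option.
    for name in inputs:
        if use_symlinks:
            os.symlink(shared / name, workspace / name)
        else:
            shutil.copy2(shared / name, workspace / name)

    # Wrapper log goes under info/ (file name is hypothetical).
    with open(run_dir / "info" / f"{job_id}-info", "w") as log:
        log.write(f"running {command}\n")
        log.flush()
        result = subprocess.run(command, cwd=workspace,
                                stdout=log, stderr=subprocess.STDOUT)
    if result.returncode != 0:
        return result.returncode

    # Copy outputs back to the site shared directory (the "cp" step).
    for name in outputs:
        shutil.copy2(workspace / name, shared / name)

    # Status file indicating success (file name is hypothetical).
    (run_dir / "status" / f"{job_id}-success").touch()
    return 0


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        run_dir = Path(tmp)
        for sub in ("shared", "info", "status", "jobs"):
            (run_dir / sub).mkdir()
        (run_dir / "shared" / "in.txt").write_text("hello\n")
        # Stand-in application: copies in.txt to out.txt.
        app = [sys.executable, "-c",
               "import shutil; shutil.copy('in.txt', 'out.txt')"]
        print(run_wrapper(run_dir, "job-0001", app,
                          ["in.txt"], ["out.txt"]))
```

Passing use_symlinks=True corresponds to the ln -s option, which avoids
a second copy of each input when the workspace can see the shared cache.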
>
> The Swift client side then checks for the presence of a status file
> indicating success and deletes it, and then copies the output files from
> the site-shared directory to the appropriate client side location.
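
The client-side completion step above could be sketched like this. As
before, a local copy stands in for the file transfer mechanism, and the
status file name and function signature are hypothetical rather than
taken from Swift itself.

```python
import shutil
import tempfile
from pathlib import Path


def collect_outputs(run_dir, job_id, outputs, client_dir,
                    transfer=shutil.copy2):
    """Check for and delete the success file, then stage outputs out.

    Returns False (and copies nothing) when no success file is found.
    """
    run_dir = Path(run_dir)
    # Hypothetical status file name; the source only says that a status
    # file indicating success lives under status/.
    status_file = run_dir / "status" / f"{job_id}-success"
    if not status_file.exists():
        return False
    status_file.unlink()
    for name in outputs:
        transfer(run_dir / "shared" / name, Path(client_dir) / name)
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        run_dir = Path(tmp) / "run"
        client = Path(tmp) / "client"
        for d in (run_dir / "shared", run_dir / "status", client):
            d.mkdir(parents=True)
        (run_dir / "shared" / "out.txt").write_text("result\n")
        (run_dir / "status" / "job-0001-success").touch()
        print(collect_outputs(run_dir, "job-0001", ["out.txt"], client))
```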
>
> The job directory (the application workspace directory) is created, in
> the default mode, under the jobs/ directory. However, it can be created
> under an arbitrary other path, which allows it to be placed on a
> different file system (such as a worker node local file system, where
> the worker node has one).
>
> -- 
>
> _______________________________________________
> Swift-devel mailing list
> Swift-devel at ci.uchicago.edu
> http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel



