[Swift-user] Setting up Swift at Stanford
Michael Wilde
wilde at mcs.anl.gov
Tue Jun 4 08:26:00 CDT 2013
Hi Robert,
> One thing I couldn't get working is <filesystem provider="local"/>.
> When I have that in my sites.xml, I get
> ...
> Could not initialize shared directory on vsp-compute
> Caused by:
> org.globus.cog.abstraction.impl.file.FileResourceException: Failed
> to create directory:
> /home/rmcgibbo/.swiftwork/uname-20130603-2056-7octkf3a/shared
Is /home/rmcgibbo accessible to the local host on which you are running the "swift" command?
This error suggests that it is not. That is the likely problem. The swift command is trying to initialize it locally, and can't get to it.
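A quick way to test this from the host where you run "swift" is sketched below. The function name check_workdir is made up for illustration. It tries to create the directory, as Swift does, so expect it to leave an empty directory behind when it succeeds.

```shell
# Hypothetical helper: can this host create and write the given directory?
# Note: like Swift, it actually creates the directory if it is missing.
check_workdir() {
    dir="$1"
    if mkdir -p "$dir" 2>/dev/null && [ -w "$dir" ]; then
        echo "ok: $dir is writable from $(uname -n)"
    else
        echo "FAIL: cannot create or write $dir from $(uname -n)"
        return 1
    fi
}
```

For example, run `check_workdir /home/rmcgibbo/.swiftwork` on the submit host. A FAIL there would match the error quoted above.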
As I mentioned in the commented sites file, "filesystem provider=local" assumes "... That /scratch/rmcgibbo is mounted on both the Swift run host and on the remote PBS system."
Can you tell us what the filesystem configuration and sharing arrangement is among your multiple clusters, and the typical input and output file sizes and counts for the app() functions you expect to run? Then we can suggest various data management configuration strategies for you.
For modest file sizes, say under 20MB, coaster provider staging is a good choice. To use provider staging, do this:
In a local Swift properties file that you pass with -config (called "cf" here), set:
use.provider.staging=true
provider.staging.pin.swiftfiles=true
status.mode=provider
Also, for debugging set:
wrapperlog.always.transfer=true
sitedir.keep=true
execution.retries=0
lazy.errors=false
Then in your sites file, omit the filesystem tag, and specify a workdirectory on a fast compute node filesystem capable of handling the transient data volume for the number of jobs you expect the node to process concurrently. A typical choice, if /tmp or /scratch has sufficient space, is:
<workdirectory>/tmp/{env.USER}/swiftwork</workdirectory>
Then specify on your swift command:
swift -config cf -tc.file apps -sites.file sites.xml myscript.swift -myarg=etc ...
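As a sketch, both groups of settings above can go into the one "cf" file. The helper name write_cf is made up for illustration, and the comment lines assume the file is read as an ordinary Java-style properties file, where "#" starts a comment.

```shell
# Hypothetical helper: write the provider-staging and debugging
# properties from this message into one properties file.
write_cf() {
    cat > "$1" <<'EOF'
# Provider staging
use.provider.staging=true
provider.staging.pin.swiftfiles=true
status.mode=provider

# Debugging settings
wrapperlog.always.transfer=true
sitedir.keep=true
execution.retries=0
lazy.errors=false
EOF
}

# Create the file named "cf" in the current directory.
write_cf cf
```

The resulting file is what the -config cf argument on the swift command above refers to.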
With this configuration, Swift will stage your data from local filesystems on the submit host to the compute node hosts, and back. After each job runs successfully on a compute node, its locally staged data will be removed. If the job fails, its data will be left there (e.g. for debugging) if you specify sitedir.keep=true.
Also, your github blog page on Swift looks great! One thing to note there: it's almost always unnecessary, and often causes trouble, to explicitly set SWIFT_HOME. All you typically need to do is put the Swift release's bin/ dir in your PATH.
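In shell terms, that amounts to something like the following. The install location $HOME/swift is only a placeholder, so substitute wherever you unpacked the release.

```shell
# Placeholder location of the unpacked Swift release; adjust to your install.
SWIFT_INSTALL="${SWIFT_INSTALL:-$HOME/swift}"

# Clear any SWIFT_HOME left over from an earlier setup.
unset SWIFT_HOME

# Put the release's bin/ directory first in PATH.
export PATH="$SWIFT_INSTALL/bin:$PATH"
```

Putting these lines in your shell startup file would make the setting persistent.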
Regards,
- Mike