[Swift-user] Setting up Swift at Stanford
Michael Wilde
wilde at mcs.anl.gov
Mon Jun 3 22:38:55 CDT 2013
I forgot to also mention: the example below with the "ssh-cl" ("ssh command line") provider also assumes that you can run a passwordless ssh command from your workstation to your PBS head node, i.e., that you have ssh keys in place on the head node and that you're using an ssh agent.
The standard Swift ssh provider (e.g., using provider=coaster jobmanager=ssh:pbs) uses a file called $HOME/.ssh/auth.defaults to specify ssh passwords or passphrases; for better security, Swift will prompt for these instead.
We tend to use and recommend the newer ssh-cl for both security and convenience.
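If you need to set the keys up, a minimal sketch (assuming a standard OpenSSH installation on your workstation; the host name is just the one from the example below) is:
  ssh-keygen -t rsa                        # create a key pair if you don't already have one
  ssh-copy-id vsp-compute-01.stanford.edu  # copy the public key to the head node
  eval $(ssh-agent); ssh-add               # start an agent and load your key
  ssh vsp-compute-01.stanford.edu true     # should now run without a password prompt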
- Mike
----- Original Message -----
> From: "Michael Wilde" <wilde at mcs.anl.gov>
> To: "Robert McGibbon" <rmcgibbo at gmail.com>
> Cc: swift-user at ci.uchicago.edu
> Sent: Monday, June 3, 2013 10:27:45 PM
> Subject: Re: [Swift-user] Setting up Swift at Stanford
>
> Hi Robert,
>
> To run swift from a workstation that can ssh to one or more cluster
> head nodes, use a sites file like this:
>
> <pool handle="vsp-compute">
>   <execution provider="coaster" jobmanager="ssh-cl:pbs"
>              url="vsp-compute-01.stanford.edu"/>
>   <filesystem provider="local"/>
>   <profile namespace="globus" key="jobsPerNode">1</profile>
>   <profile namespace="globus" key="lowOverAllocation">100</profile>
>   <profile namespace="globus" key="highOverAllocation">100</profile>
>   <profile namespace="globus" key="maxtime">3600</profile>
>   <profile namespace="globus" key="maxWalltime">00:05:00</profile>
>   <profile namespace="globus" key="queue">default</profile>
>   <profile namespace="globus" key="slots">5</profile>
>   <profile namespace="globus" key="maxnodes">1</profile>
>   <profile namespace="globus" key="nodeGranularity">1</profile>
>   <profile namespace="karajan" key="jobThrottle">1.00</profile>
>   <profile namespace="karajan" key="initialScore">10000</profile>
>   <workdirectory>/scratch/rmcgibbo/swiftwork</workdirectory>
> </pool>
>
> This specifies that Swift should:
>
> - use the "coaster" provider, which enables Swift to ssh to another
> system and qsub from there:
>
> <execution provider="coaster" jobmanager="ssh-cl:pbs"
> url="vsp-compute-01.stanford.edu"/>
>
> - run up to 100 Swift app() tasks in parallel on the remote system:
>
> <profile namespace="karajan" key="jobThrottle">1.00</profile>
> <profile namespace="karajan" key="initialScore">10000</profile>
>
> - app() tasks should be limited to 5 minutes walltime:
>
> <profile namespace="globus" key="maxWalltime">00:05:00</profile>
>
> - app() tasks will be run within PBS coaster "pilot" jobs. Each PBS
> job should have a walltime of 3600 seconds (the maxtime value, which
> must be at least as long as maxWalltime):
> 
> <profile namespace="globus" key="lowOverAllocation">100</profile>
> <profile namespace="globus" key="highOverAllocation">100</profile>
> <profile namespace="globus" key="maxtime">3600</profile>
>
> - Up to 5 concurrent PBS coaster jobs each asking for 1 node will be
> submitted to the default queue:
>
> <profile namespace="globus" key="queue">default</profile>
> <profile namespace="globus" key="slots">5</profile>
> <profile namespace="globus" key="maxnodes">1</profile>
> <profile namespace="globus" key="nodeGranularity">1</profile>
>
> - Swift should run only one app() task at a time within each PBS job
> slot:
>
> <profile namespace="globus" key="jobsPerNode">1</profile>
>
> - On the remote PBS cluster, create per-run directories under this
> work directory:
>
> <workdirectory>/scratch/rmcgibbo/swiftwork</workdirectory>
>
> - And stage data to the site by using local copy operations:
>
> <filesystem provider="local"/>
>
> You can make the sites.xml entry more user-independent using, e.g.:
>
> <workdirectory>/scratch/{env.USER}/swiftwork</workdirectory>
>
> The overall sites entry above assumes:
>
> - That /scratch/rmcgibbo is mounted on both the Swift run host and on
> the remote PBS system.
>
> If there is no common shared filesystem, Swift can use a data
> transport technique called "coaster provider staging" to move the
> data for you. This is specified in the swift.properties file.
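>
> For example, enabling it amounts to a single line in swift.properties
> (this is the same "use.provider.staging" property mentioned further
> below):
>
>   use.provider.staging=true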
>
> In many cases, with a shared filesystem between the Swift client host
> and the execution cluster, it's desirable to turn off staging
> altogether. This is done using a mode called "direct" data
> management (see
> http://www.ci.uchicago.edu/swift/guides/trunk/userguide/userguide.html#_collective_data_management).
> This is being simplified in future releases.
>
> - That each PBS job is given one CPU core, not one full node.
>
> The PBS ppn attribute can be specified to request a specific number
> of cores (processors) per node:
>
> <profile namespace="globus" key="ppn">16</profile>
>
> ...and jobsPerNode can then be raised so that each coaster pilot job
> runs up to 16 Swift app() tasks at once:
>
> <profile namespace="globus" key="jobsPerNode">16</profile>
>
> For more info on coasters, see:
> http://www.ci.uchicago.edu/swift/guides/trunk/userguide/userguide.html#_coasters
> and: http://www.ci.uchicago.edu/swift/papers/UCC-coasters.pdf
>
> For more examples on site configurations, see:
>
> http://www.ci.uchicago.edu/swift/guides/trunk/siteguide/siteguide.html
>
> And lastly, note that in your initial sites.xml below:
>
> - Omitting the filesystem provider tag is typically only done when
> "use.provider.staging" is specified in the swift.properties config
> file
>
> - The stagingMethod tag only applies to provider staging.
>
> We're working hard to document all this better and provide a better
> set of illustrated examples and templates for common site
> configurations. In the meantime, we'll help you create a set of
> useful configurations for your site(s).
>
> Regards,
>
> - Mike
>
> > We just heard about the swift project from some colleagues at U
> > Chicago, and we're interested in trying it out with some of our
> > compute resources at Stanford to run parallel molecular dynamics
> > and
> > x-ray scattering simulations. Currently, I'm most interested in
> > setting up the environment such that I can submit my swift script
> > on
> > a local workstation, with execution on a few different clusters.
> > The
> > head nodes of our local clusters are accessible via ssh, and then
> > job execution is scheduled with pbs.
> >
> > When I run swift, it can't seem to find qsub on the cluster.
> >
> > rmcgibbo at Roberts-MacBook-Pro-2 ~/projects/swift
> > $ swift -sites.file sites.xml hello.swift -tc.file tc.data
> > Swift 0.94 swift-r6492 cog-r3658
> >
> > RunID: 20130603-1704-5xii8svc
> > Progress: time: Mon, 03 Jun 2013 17:04:10 -0700
> > 2013-06-03 17:04:10.735 java[77051:1f07] Loading Maximizer into
> > bundle: com.apple.javajdk16.cmd
> > 2013-06-03 17:04:11.410 java[77051:1f07] Maximizer: Unsupported
> > window created of class: CocoaAppWindow
> > Progress: time: Mon, 03 Jun 2013 17:04:13 -0700 Stage in:1
> > Execution failed:
> > Exception in uname:
> > Arguments: [-a]
> > Host: vsp-compute
> > Directory: hello-20130603-1704-5xii8svc/jobs/y/uname-ydyn5fal
> > Caused by:
> > Cannot submit job: Cannot run program "qsub": error=2, No such file
> > or directory
> > uname, hello.swift, line 8
> >
> > When I switch the execution provider from pbs to ssh, the job runs
> > successfully, but only on the head node of the vsp-compute cluster.
> > I'd like to run instead using the cluster's pbs queue. Any help
> > would be greatly appreciated.
> >
> > -Robert
> > Graduate Student, Pande Lab
> > Stanford University, Department of Chemistry
> >
> > p.s.
> >
> > My sites.xml file is
> > ```
> > <config>
> > <pool handle="vsp-compute">
> > <filesystem provider="ssh" url=" vsp-compute-01.stanford.edu "/>
> > <execution provider="pbs" jobmanager="ssh:pbs" url="
> > vsp-compute-01.stanford.edu "/>
> >
> > <profile namespace="globus" key="maxtime">750</profile>
> > <profile namespace="globus" key="jobsPerNode">1</profile>
> > <profile namespace="globus" key="queue">default</profile>
> > <profile namespace="swift" key="stagingMethod">file</profile>
> >
> > <workdirectory>/scratch/rmcgibbo/swiftwork</workdirectory>
> > </pool>
> >
> > <!-- End -->
> > </config>
> > ```
> >
> > My SwiftScript is
> > ```
> > #hello.swift
> > type file;
> >
> > app (file o) uname() {
> > uname "-a" stdout=@o;
> > }
> > file outfile <"uname.txt">;
> >
> > outfile = uname();
> > ```
> > _______________________________________________
> > Swift-user mailing list
> > Swift-user at ci.uchicago.edu
> > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user
> _______________________________________________
> Swift-user mailing list
> Swift-user at ci.uchicago.edu
> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user
>