[Swift-devel] Revised Quickstart
Eric Skogen
eskogen at g.clemson.edu
Fri May 10 09:03:27 CDT 2013
Attached is a revised Quickstart guide. Building it requires that the
UofC_2013-04-09 folder be moved to the examples folder and renamed
"quickstart".
We hope this will help new users become comfortable with Swift more quickly.
-------------- next part --------------
Swift Quickstart
================
[abstract]
Abstract
--------
This guide describes the steps needed to download, install, configure,
and run the basic examples for Swift. If you are using a pre-installed
version of Swift, you can skip directly to the configuration section.
Stable Releases vs. Development Releases
----------------------------------------
Stable releases of Swift have undergone more extensive testing than development releases.
In general, they are more stable, have fewer bugs, and have been tested on a variety of
systems.
The development version of Swift is aimed at developers and testers. Development
code is the most likely to contain bugs and untested features. If you need stability,
please use the latest stable release.
Downloading a Swift Distribution
--------------------------------
There are two main ways of getting the Swift implementation: binary
releases and the source repository.
Binary Releases
~~~~~~~~~~~~~~~
For the majority of users, downloading and installing binary releases is recommended.
Since Swift is written in Java, the binary packages will run on all supported platforms with
Java Runtime Environment 1.5 or greater. Binary releases can be obtained from the
http://www.ci.uchicago.edu/swift/downloads/index.php[Swift downloads page].
Once downloaded, simply unpack the downloaded package (swift-<version>.tar.gz) into a
directory of your choice:
-----
tar -xzvf swift-<version>.tar.gz
-----
This will create a swift-<version> directory containing the build.
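Before unpacking, it can help to list an archive's contents to see which top-level directory it will create. The sketch below is only an illustration: it builds a small stand-in archive in a temporary directory (the name swift-demo is hypothetical, not a real release), then lists and extracts it with the same kind of tar options used above.

```shell
#!/bin/sh
# Sketch: inspect a .tar.gz before unpacking it.
# "swift-demo" is a hypothetical stand-in for swift-<version>.

workdir=$(mktemp -d)            # scratch area so nothing real is touched
mkdir -p "$workdir/src/swift-demo/bin"
echo "placeholder" > "$workdir/src/swift-demo/bin/swift"

# Build the stand-in archive.
tar -czf "$workdir/swift-demo.tar.gz" -C "$workdir/src" swift-demo

# -t lists the contents without extracting anything.
listing=$(tar -tzf "$workdir/swift-demo.tar.gz")
echo "$listing"

# Extract into a directory of your choice, as in the command above.
mkdir -p "$workdir/install"
tar -xzf "$workdir/swift-demo.tar.gz" -C "$workdir/install"
ls "$workdir/install/swift-demo/bin"
```

The temporary directory in $workdir can be removed once you have looked at the result.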
Source Repository
~~~~~~~~~~~~~~~~~
The source code for Swift is available to developers who have an interest in contributing
new features. To build Swift from source code, you will need http://ant.apache.org/[Apache Ant]
and http://www.oracle.com/technetwork/java/javase/downloads/index.html[Java JDK]. Once
built, the dist/swift-svn directory will contain your build.
To download and build Swift 0.93, follow these instructions:
-----
$ mkdir swift-0.93
$ cd swift-0.93
$ svn co https://cogkit.svn.sourceforge.net/svnroot/cogkit/branches/4.1.9/src/cog
$ cd cog/modules
$ svn co https://svn.ci.uchicago.edu/svn/vdl2/branches/release-0.93 swift
$ cd swift
$ ant redist
-----
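The build above relies on svn, ant, and a Java compiler being available. A minimal sketch of a pre-flight check follows. The helper name have_cmd is hypothetical, and the check only tests that each command is on the PATH, not that its version is suitable.

```shell
#!/bin/sh
# Sketch: report which build prerequisites are on the PATH.
# have_cmd is a hypothetical helper, not part of Swift.

have_cmd() {
    # command -v returns non-zero when the command cannot be found
    command -v "$1" >/dev/null 2>&1
}

for tool in svn ant javac; do
    if have_cmd "$tool"; then
        echo "found:   $tool"
    else
        echo "missing: $tool"
    fi
done
```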
Setting your PATH
-----------------
Once Swift is installed, it is useful to add the swift binary to your PATH
environment variable. To do this, first determine where the Swift bin
directory is located. If you installed Swift from a binary release, it will
be in the swift-0.93/bin directory where you installed it. If you followed
the instructions above for installing Swift from a source repository, it
will be located in swift-0.93/cog/modules/swift/dist/swift-svn/bin.
Add the following line to the bottom of ~/.bashrc, replacing /full/path/to/swift with the full path to that bin directory:
-----
export PATH=$PATH:/full/path/to/swift
-----
After you log in again (or open a new shell), test this by typing the command:
-----
$ which swift
-----
This should print the full path of the swift binary.
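To see why the PATH change matters, here is a small sketch of how appending a directory to PATH makes a program inside it discoverable. It assumes nothing about a real installation: the temporary directory and the swift-demo program are hypothetical placeholders.

```shell
#!/bin/sh
# Sketch: how appending a bin directory to PATH makes its programs findable.
# The directory and the "swift-demo" program are hypothetical placeholders.

demo_bin=$(mktemp -d)
printf '#!/bin/sh\necho "hello from swift-demo"\n' > "$demo_bin/swift-demo"
chmod +x "$demo_bin/swift-demo"

# Same form as the line added to ~/.bashrc, with the placeholder filled in.
export PATH=$PATH:$demo_bin

# command -v plays the same role as "which": it prints the resolved path.
resolved=$(command -v swift-demo)
echo "$resolved"
```

Because the directory is appended rather than prepended, any program of the same name found earlier in PATH would still take precedence.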
Running Swift Examples
----------------------
This section is intended to help you get started with Swift. You will use Swift to generate data, both as named files on disk and as intermediate data, and then pass it on to analytical tools. All the programs are deliberately simple because the intent is to show Swift usage, not the usage of the analytical tools. Work through the examples with the shell scripts, then replace them with whatever analytical tool you want to use.
Change into the newly created directory called "tutorial".
This directory contains:
bin: script tools for the tutorial
scripts: Swift scripts for the tutorial
Start:
-----
$ source setup.sh   # adds the tutorial programs to your PATH
$ random.sh         # verify: should now be found in your PATH
-----
The tutorial is arranged in parts. Begin:
-----
$ cd part01
$ cat README
-----
When finished with a part, change to the next one with "cd ../part02", and so on.
In each part, you can type "cleanup" after running a few Swift scripts to remove the old logs that build up and (usually) the output files.
This tutorial takes you through a six-step process that demonstrates how to use Swift effectively, and then demonstrates running Swift on a cluster. Naturally, configuration details will vary more for the cluster demonstration than for the six steps.
.p1.swift
****************
----
include::../../examples/quickstart/part01/p1.swift[]
----
****************
The first part is a basic Swift program that calls a shell script, random.sh. Notice that no arguments are passed to the app, but its return value is used: the return value is assigned to the variable on the left-hand side of the = where the app is called.
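The contents of random.sh are not reproduced in this guide. As a rough idea of the kind of script an app can wrap, here is a hypothetical stand-in that takes no arguments and prints one random integer. The real tutorial script may well differ.

```shell
#!/bin/sh
# Hypothetical stand-in for a script like random.sh (the real one may differ).
# It takes no arguments and writes one integer in [0, 100) to standard output.

random_number() {
    # awk's srand() seeds from the time of day; rand() returns a value in [0, 1)
    awk 'BEGIN { srand(); print int(rand() * 100) }'
}

value=$(random_number)
echo "$value"
```

Anything that reads no input and writes its result to standard output or to a file can play the same role.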
Notice that although the output file is stored, no filename is given. The file is stored on disk and can be retrieved (it is in the _concurrent folder), but leaving a file anonymous, as we did here by not giving it a name, is appropriate only for intermediate data that will not be needed later. The next script shows how to name files. Named files are easier to find, and naming is the appropriate style for files that are intended to persist outside the script.
Now that we can choose our greeting text, we can call the same procedure
with different parameters to generate several output files with
different greetings. The code is in manyparam.swift and can be run as
before using the swift command.
.p2b.swift
****************
----
include::../../examples/quickstart/part02/p2b.swift[]
----
****************
When you look at this file you can see that the output file is specified. In this case it will be named sim.out in the folder output. Technically, it doesn't matter whether you name files, since anonymous files can still be found and named files can be ignored, but searching for the files you want and filtering out the ones you don't can hurt productivity more than may be initially apparent.
.p3.swift
****************
----
include::../../examples/quickstart/part03/p3.swift[]
----
****************
Here we illustrate the use of a loop to run things concurrently. Notice that we do not specify anything about how many instances are running or where they are running. This isn't terribly important while we are running everything locally, but it may become extremely important as we begin to run on more resources.
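For comparison, here is a rough shell analogy of a concurrent loop. It is not how Swift schedules work, and simulate is a hypothetical placeholder for one unit of work. It is meant only to show the bookkeeping you would otherwise do by hand.

```shell
#!/bin/sh
# Rough shell analogy for a concurrent loop; this is not how Swift schedules work.
# simulate is a hypothetical placeholder for one unit of work.

outdir=$(mktemp -d)

simulate() {
    # pretend to compute something and write it to the given file
    echo "result for task $1" > "$2"
}

for i in 1 2 3 4 5; do
    simulate "$i" "$outdir/task_$i.out" &   # & starts each task without waiting
done

wait    # block until every background task has finished
ls "$outdir"
```

In the shell you must start the background jobs and wait for them yourself. In the Swift script, as noted above, nothing about the number or placement of instances is specified.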
.p4.swift
****************
----
include::../../examples/quickstart/part04/p4.swift[]
----
****************
The noteworthy part here is that each output file is named. This is done by inserting a variable (i in this case) into each file name to make sure each has a unique name. This pattern can be modified, as long as something is still done to make each name unique.
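The same idea, deriving a unique name from a loop index, can be sketched in the shell. The output/sim_NNNN.out pattern and the name_for helper are hypothetical and are not taken from p4.swift.

```shell
#!/bin/sh
# Sketch: derive a unique output file name from a loop index.
# The "output/sim_NNNN.out" pattern is a hypothetical naming scheme.

name_for() {
    # zero-pad the index so the names also sort in numeric order
    printf 'output/sim_%04d.out\n' "$1"
}

for i in 0 1 2 10; do
    name_for "$i"
done
```

Zero-padding is one design choice among many. Any scheme works as long as two different indices can never produce the same name.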
.p5.swift
****************
----
include::../../examples/quickstart/part05/p5.swift[]
----
****************
This illustrates passing the output data on for additional analysis. This technique of specifying the next step can be repeated until the entire workflow is specified.
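As an illustration of a follow-on analysis step, the sketch below averages the numbers found in a set of output files. The average function and the file names are hypothetical. The analysis program used in the tutorial may compute something different.

```shell
#!/bin/sh
# Sketch: a second step that consumes the files produced by a first step.
# average is a hypothetical analysis tool.

datadir=$(mktemp -d)

# Stand-in for the first step: three files, one number each.
echo 10 > "$datadir/sim_0.out"
echo 20 > "$datadir/sim_1.out"
echo 60 > "$datadir/sim_2.out"

average() {
    # read every number from the given files and print their mean
    cat "$@" | awk '{ sum += $1; n++ } END { if (n > 0) print sum / n }'
}

average "$datadir"/sim_*.out > "$datadir/average.out"
cat "$datadir/average.out"
```

The output of this step is itself a file, so it could in turn be handed to a further step in the same way.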
.p6.swift
****************
----
include::../../examples/quickstart/part06/p6.swift[]
----
****************
This demonstrates a more complicated analysis program. Obviously, "complicated" is a relative term here, but the technique is the same whether we're passing one parameter to a simple program or several to a heavyweight analytical tool.
.p7.swift
****************
----
include::../../examples/quickstart/part07/p7.swift[]
----
****************
.tc
****************
----
include::../../examples/quickstart/part07/tc[]
----
****************
.sites.xml
****************
----
include::../../examples/quickstart/part07/sites.xml[]
----
****************
This shows the changes necessary to run this script on a cluster. The specific configuration will probably not be directly usable unless you have access to this particular cluster, but it should give you an idea of what information is needed and what format it is expected in.
Notice that none of the changes in the .swift file actually have anything to do with running on a cluster. The additional parameters are there to let us see performance changes more clearly, not to enable running on a cluster. The changes that allow us to run on the cluster are in the tc and sites.xml files.
We've now created a simple workflow with Swift. You can add programs to a workflow and manage their output. To make workflows more complicated, you need only call more advanced utilities and repeat the techniques you have learned. There is much more you can learn to make Swift work better for you, and you will find it in the user manual. Go ahead and experiment a little.
More documentation on how to run Swift can be found at
http://www.ci.uchicago.edu/swift/docs/index.php.