Programming Quickstart
----------------------

This guide is intended to help you get started with Swift. You will use Swift to generate data, both as named files on disk and as intermediate data, and then pass it on to analytical tools. All the programs are deliberately simple because the guide is intended to show Swift usage, not analytical tool usage. Work the examples with the shell scripts, then replace them with whatever analytical tool you want to use.

Setup: Obtain the tutorial package. It is in the SVN repository; directions are in the Quickstart (www.ci.uchicago.edu/swift/guides/release-0.94/quickstart/quickstart.html). Change directory into the created directory called "tutorial".

This directory contains:

 bin:     script tools for the tutorial
 scripts: Swift scripts for the tutorial

Start:

 $ source setup.sh   # to add tutorial programs to your PATH

...and then verify:

 $ random.sh   # should be found in your PATH now

The tutorial is arranged in parts. Begin:

 $ cd part01
 $ cat README

When finished:

 $ cd ../part02   # etc

In each part, you can type "cleanup" after running a few Swift scripts to remove the old logs that build up and (usually) the output files.

This tutorial takes you through a six-step process that demonstrates how to use Swift effectively. It also demonstrates running Swift on a cluster. Naturally, configuration details will vary more in the cluster demonstration than in the six-step process.

.p1.swift
****************
----
include::../../examples/quickstart/part01/p1.swift[]
----
****************

The first part is a basic Swift program that calls a shell script, random.sh. Notice that no arguments were passed to the app, but we did use the return value. The return value is captured by placing a variable on the left-hand side of the = when the app is called. Notice that although the output file is stored, no filename is given.
The file is stored on disk and could be retrieved (it is in the _concurrent folder). Making the file anonymous (by not giving it a name, as we did here) is appropriate when the file is intended as intermediate data that will not be needed later.

This next script shows how to name files. Not only are named files easier to find, naming is also the appropriate style for files that are intended to persist outside the script.

Now that we can choose our greeting text, we can call the same procedure with different parameters to generate several output files with different greetings. The code is in manyparam.swift and can be run as before using the swift command.

.p2b.swift
****************
----
include::../../examples/quickstart/part02/p2b.swift[]
----
****************

When you look at this file, you can see that the output file is specified. In this case it will be named sim.out in the folder output. Technically, it doesn’t matter whether you name files or not, since you can find anonymous ones and ignore named ones. However, searching for the files you want and filtering out the files you don’t can affect productivity more than may be initially apparent.

.p3.swift
****************
----
include::../../examples/quickstart/part03/p3.swift[]
----
****************

Here we illustrate the use of a loop to run things concurrently. Notice that we do not specify anything about how many instances are running or where they are running. This isn’t terribly important while we are running everything locally, but it may become extremely important as we begin to run on more resources.

.p4.swift
****************
----
include::../../examples/quickstart/part04/p4.swift[]
----
****************

The exciting part here is that each output file is named. Notice the pattern that allows each file to have a unique name. This pattern can be modified, but some variation mechanism must be provided to prevent file name conflicts.
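
The need for a variation mechanism is not specific to Swift. As a rough shell analogy (this is not the Swift syntax used in p4.swift), the sketch below derives one distinct path per loop index. The helper name `sim_output_name` and the `output/sim_NNNN.out` pattern are hypothetical, chosen only to echo the sim.out example above.

```shell
#!/bin/sh
# Hypothetical helper, for illustration only: build a unique output path
# from a numeric loop index. The "output/sim_NNNN.out" pattern is an
# assumption, not the pattern used by the tutorial scripts.
sim_output_name() {
    printf 'output/sim_%04d.out\n' "$1"
}

# Print the path for indices 0..4; the index is the "variation mechanism"
# that keeps the names from colliding.
i=0
while [ "$i" -lt 5 ]; do
    sim_output_name "$i"
    i=$((i + 1))
done
```

If the index were dropped from the pattern, every iteration would produce the same path and later outputs would collide with earlier ones.
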

.p5.swift
****************
----
include::../../examples/quickstart/part05/p5.swift[]
----
****************

This illustrates passing the output data on for additional analysis. This technique of specifying the next step can be repeated until the entire workflow is specified.

.p6.swift
****************
----
include::../../examples/quickstart/part06/p6.swift[]
----
****************

This demonstrates a more complicated analysis program. Obviously, “complicated” is a relative term here, but the technique is the same whether we’re passing one parameter to a simple program or several to a heavyweight analytical tool.

.p7.swift
****************
----
include::../../examples/quickstart/part07/p7.swift[]
----
****************

.tc
****************
----
include::../../examples/quickstart/part07/tc[]
----
****************

.sites.xml
****************
----
include::../../examples/quickstart/part07/sites.xml[]
----
****************

This shows the changes necessary to run this script on a cluster. The specific configuration will probably not be particularly useful to you unless you have access to this particular cluster, but it should give you an idea of what information is needed and what format it is expected in. Notice that none of the changes in the .swift file actually have anything to do with running on a cluster. The additional parameters are there to let us see performance changes more clearly; they are not needed for running on the cluster. The changes that allow us to run on the cluster are in the tc and sites files.

We've now created a simple workflow with Swift. You can add programs to the workflow and manage their output. To make workflows more complicated, you need only call more advanced utilities and repeat the techniques you have learned. There are many more things you can learn to make Swift work better, and you will find those in the user manual. Go ahead and experiment with it a little.
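
One way to start experimenting is to swap in your own analysis step. The sketch below is a hypothetical shell stand-in for such a step, not one of the tutorial's scripts. It assumes each input line begins with one integer, and the function name `average` is invented for illustration.

```shell
#!/bin/sh
# Hypothetical stand-in for an analysis app (not one of the tutorial's scripts).
# Assumption: each input line starts with one integer, as a simple
# random-number generator script might print.

# average: print the mean of the first field of every input line,
# truncated to an integer. Reads the files named as arguments, or
# standard input if no files are given. Prints 0 for empty input.
average() {
    awk '{ sum += $1; n++ }
         END { if (n > 0) printf "%d\n", sum / n; else print 0 }' "$@"
}

# Example call with literal data; replace the pipe with real output files.
printf '10\n20\n30\n' | average   # prints 20
```

Because the function only reads files and writes to standard output, it can be tried from the command line on its own before it is wired into a workflow.
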