[Swift-commit] r6959 - SwiftTutorials/CIC_2013-08-09/doc
davidk at ci.uchicago.edu
Thu Aug 22 14:55:28 CDT 2013
Author: davidk
Date: 2013-08-22 14:55:28 -0500 (Thu, 22 Aug 2013)
New Revision: 6959
Modified:
SwiftTutorials/CIC_2013-08-09/doc/README
SwiftTutorials/CIC_2013-08-09/doc/build_docs.sh
Log:
Updated build_docs for line numbers of sourced files
Include swift, tc, and app files directly
Updated descriptions for part numbers
Modified: SwiftTutorials/CIC_2013-08-09/doc/README
===================================================================
--- SwiftTutorials/CIC_2013-08-09/doc/README 2013-08-22 18:01:36 UTC (rev 6958)
+++ SwiftTutorials/CIC_2013-08-09/doc/README 2013-08-22 19:55:28 UTC (rev 6959)
@@ -9,22 +9,16 @@
p1 - Run an application under Swift
-p2 - Mapping (naming) output files
+p2 - Parallel loops with foreach
-p3 - Parallel loops with foreach
+p3 - Merging/reducing the results of a parallel foreach loop
-p4 - Mapping arrays to files
+p4 - Running on the remote site nodes
-p5 - merging/reducing the results of a parallel foreach loop
+p5 - Running the stats summary step on the remote site
-p6 - Sending arguments to applications
+p6 - Add additional apps for generating seeds remotely
-p7 - Running on the remote site nodes
-
-p8 - Running the stats summary step on the remote site
-
-p9 - A more complex workflow pattern: multiple parallel pipelines
-
////
@@ -35,7 +29,7 @@
Check out scripts from SVN
~~~~~~~~~~~~~~~~~~~~~~~~~~
-To checkout the most recent ATPESC tutorial scripts from SVN, run the following
+To checkout the most recent CIC tutorial scripts from SVN, run the following
command:
-----
@@ -52,21 +46,21 @@
-----
$ cd tutorial # change to the newly created tutorial directory
-$ source setup.sh # sets swift config files in $HOME/.swift
+$ source setup.sh <SITE> # sets swift config files in $HOME/.swift
$ java -version # verify that you have Oracle JAVA (preferred; 1.6 or later)
$ swift -version # verify that Swift 0.94 is in your $PATH and functional
-----
-NOTE: If you re-login or create additional terminal sessions, you must re-run `source setup.sh` in each one.
+NOTE: If you re-login, you will need to re-run source setup.sh.
Simple "science applications" for the workflow tutorial
-------------------------------------------------------
-We use two shell scripts in this tutorial to serve as very simple stand-ins for a real science application:
-`simulation.sh` and `stats.sh`. These are located in the `tutorial/app` directory.
+Two shell scripts are included that serve as very simple stand-ins for a science application:
+simulation.sh and stats.sh
simulation.sh
-~~~~~~~~~~~~~
-The simulation.sh script serves as a trivial substitute for a complex scientific simulation application. It generates and prints a set of one or more random integers in the range [0-2^62) as controlled by its optional arguments, which are:
+~~~~~~~~~~~~
+The simulation.sh script serves as a trivial substitute for a complex scientific simulation application. It generates and prints a set of one or more random integers in the range [0-2^32) as controlled by its optional arguments, which are:
-----
$ ./app/simulate.sh --help
@@ -137,24 +131,30 @@
336000
20000
320000
-real 0m3.012s
-user 0m0.005s
-sys 0m0.006s
+real 0m3.012s
+user 0m0.005s
+sys 0m0.006s
$
-----
+
stats.sh
-~~~~~~~~
+~~~~~~~
The stats.sh script reads a file containing n numbers and prints the average
of those numbers to stdout.
Introductory exercises
----------------------
-Parts 1-6 (p1.swift - p6.swift) run locally and serve as examples of the Swift language.
-Parts 7-9 (p7.swift - p9.swift) submit jobs to the site specified the setup stage
+Parts 1-3 (p1.swift - p3.swift) run locally and serve as examples of the Swift language.
+Parts 4-6 (p4.swift - p6.swift) submit jobs to the site specified in the setup stage.
+Introductory exercises
+----------------------
+Parts 1-3 (p1.swift - p3.swift) run locally and serve as examples of the Swift language.
+Parts 4-6 (p4.swift - p6.swift) submit jobs to the site specified in the setup stage.
+
p1 - Run an application under Swift
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The first swift script, p1.swift, runs simulate.sh to generate a single random
@@ -163,17 +163,29 @@
image:p1.png[]
.p1.swift
+[source, txt]
-----
-type file;
+include::../part01/p1.swift[]
+-----
-app (file o) mysim ()
-{
- simulate stdout=@filename(o);
-}
+The sites.xml file included in each part directory gives Swift information about the machines we will be running on.
+It defines things like the work directory, the scheduler to use, and how to control parallelism. The sites.xml file
+below will tell Swift to run on the local machine only, and run just 1 task at a time.
-file f = mysim();
+.sites.xml
+[source, txt]
-----
+include::../part01/sites.xml[]
+-----
+The app file translates from a Swift app function to the path of an executable on the file system.
+In this case, it translates from "simulate" to simulate.sh and assumes that simulate.sh will
+be available in your $PATH.
+[source, txt]
+-----
+include::../part01/apps[]
+-----
+
To run this script, run the following command:
-----
$ cd part01
@@ -191,24 +203,18 @@
$ ./clean.sh
------
-p2 - Mapping (naming) output files
+p2 - Parallel loops with foreach
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-The second swift script shows an example of naming the file. The output is now
-in a file called sim.out.
+The p2.swift script introduces the foreach loop. This script runs many
+simulations. The script also shows an example of naming the files. The output files
+are now called sim_N.out.
image:p2.png[]
.p2.swift
+[source, txt]
-----
-type file;
-
-app (file o) mysim ()
-{
- simulate stdout=@filename(o);
-}
-
-file f <"sim.out">;
-f = mysim();
+include::../part02/p2.swift[]
-----
To run the script:
@@ -217,26 +223,17 @@
$ swift p2.swift
-----
-p3 - Parallel loops with foreach
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-The p3.swift script introduces the foreach loop. This script runs many
-simulations. Output files are named here by Swift and will get created
-in the _concurrent directory.
+p3 - Merging/reducing the results of a parallel foreach loop
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+p3.swift introduces a postprocessing step. After all the parallel simulations have completed, the files
+created by simulation.sh will be averaged by stats.sh.
image:p3.png[]
.p3.swift
+[source, txt]
----
-type file;
-
-app (file o) mysim ()
-{
- simulate stdout=@filename(o);
-}
-
-foreach i in [0:9] {
- file f = mysim();
-}
+include::../part03/p3.swift[]
----
To run:
@@ -245,25 +242,19 @@
$ swift p3.swift
----
-p4 - Mapping arrays to files
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-p4.swift gives an example of naming multiple files within a foreach loop.
+p4 - Running on the remote site nodes
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+p4.swift is the first script that will submit jobs to remote site nodes for analysis.
+It is similar to earlier scripts, with a few minor exceptions. To generalize the script
+for other types of remote execution (e.g., when no shared filesystem is available to the compute nodes), the application simulate.sh
+will get transferred to the worker node by Swift, in the same manner as any other input data file.
image:p4.png[]
.p4.swift
+[source, txt]
----
-type file;
-
-app (file o) mysim ()
-{
- simulate stdout=@filename(o);
-}
-
-foreach i in [0:9] {
- file f <single_file_mapper; file=@strcat("output/sim_",i,".out")>;
- f = mysim();
-}
+include::../part04/p4.swift[]
----
To run:
@@ -273,39 +264,18 @@
Output files will be named output/sim_N.out.
-p5 - merging/reducing the results of a parallel foreach loop
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+p5 - Running the stats summary step on the remote site
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
p5.swift introduces a postprocessing step. After all the parallel simulations have completed, the files
-created by simulation.sh will be averaged by stats.sh.
+created by simulation.sh will be averaged by stats.sh. This is similar to p3, but all app invocations
+are done on remote nodes with Swift managing file transfers.
image:p5.png[]
.p5.swift
+[source, txt]
----
-type file;
-
-app (file o) mysim ()
-{
- simulate stdout=@filename(o);
-}
-
-app (file o) analyze (file s[])
-{
- stats @filenames(s) stdout=@filename(o);
-}
-
-file sims[];
-
-int nsim = @toInt(@arg("nsim","10"));
-
-foreach i in [0:nsim-1] {
- file simout <single_file_mapper; file=@strcat("output/sim_",i,".out")>;
- simout = mysim();
- sims[i] = simout;
-}
-
-file stats<"output/average.out">;
-stats = analyze(sims);
+include::../part05/p5.swift[]
----
To run:
@@ -313,227 +283,41 @@
$ swift p5.swift
----
-p6 - Sending arguments to applications
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-p6.swift introduces command line arguments. The script sets a variable called
-"steps" here, which determines the length of time that the simulation.sh
-will run for. It also defines a variable called nsim, which determines the
-number of simulations to run.
+p6 - Add additional apps and randomness
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+p6.swift builds on p5.swift, but adds new apps for generating a random
+seed and a random bias value.
image:p6.png[]
.p6.swift
+[source, txt]
----
-type file;
-
-app (file o) mysim (int timesteps)
-{
- simulate timesteps stdout=@filename(o);
-}
-
-app (file o) analyze (file s[])
-{
- stats @filenames(s) stdout=@filename(o);
-}
-
-file sims[];
-int nsim = @toInt(@arg("nsim","10"));
-int steps = @toInt(@arg("steps","1"));
-
-foreach i in [0:nsim-1] {
- file simout <single_file_mapper; file=@strcat("output/sim_",i,".out")>;
- simout = mysim(steps);
- sims[i] = simout;
-}
-
-file stats<"output/average.out">;
-stats = analyze(sims);
+include::../part06/p6.swift[]
----
-Use the command below to specify the time for each simulation.
-----
-$ cd ../part06
-$ swift p6.swift -steps=3 # each simulation takes 3 seconds
-----
+To run on the cluster, sites.xml had to be modified. Here is
+the new sites.xml used for this example. Note the differences between the sites.xml file
+in this example, which uses Condor, and the sites.xml file in part 1, which runs locally.
-p7 - Running on the remote site nodes
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-p7.swift is the first script that will submit jobs to remote site nodes for analysis.
-It is similar to earlier scripts, with a few minor exceptions. To generalize the script
-for other types of remote execution (e.g., when no shared filesystem is available to the compute nodes), the application simulate.sh
-will get transferred to the worker node by Swift, in the same manner as any other input data file.
-
-image:p7.png[]
-
-.p7.swift
+[source, txt]
-----
-type file;
-
-# Application to be called by this script
-
-file simulation_script <"simulate.sh">;
-
-# app() functions for application programs to be called:
-
-app (file out) simulation (file script, int timesteps, int sim_range)
-{
- sh @filename(script) timesteps sim_range stdout=@filename(out);
-}
-
-# Command line params to this script:
-
-int nsim = @toInt(@arg("nsim", "10")); # number of simulation programs to run
-int range = @toInt(@arg("range", "100")); # range of the generated random numbers
-
-# Main script and data
-
-int steps=3;
-
-tracef("\n*** Script parameters: nsim=%i steps=%i range=%i \n\n", nsim, steps, range);
-
-foreach i in [0:nsim-1] {
- file simout <single_file_mapper; file=@strcat("output/sim_",i,".out")>;
- simout = simulation(simulation_script, steps, range);
-}
+include::../part06/sites.xml[]
-----
-To run:
-----
-$ cd ../part07
-$ swift p7.swift
-----
-
-p8 - Running the stats summary step on the remote site
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-p8.swift will also stage in and run stats.sh to calculate averages. It adds a
-trace statement so you can see the order in which things execute.
-
-image:p8.png[]
-
-.p8.swift
+Below is the updated apps file. Since Swift is staging shell scripts remotely to nodes on the cluster,
+the only application it needs defined here is the shell.
+[source, txt]
-----
-type file;
-
-# Applications to be called by this script
-
-file simulation_script <"simulate.sh">;
-file analysis_script <"stats.sh">;
-
-# app() functions for application programs to be called:
-
-app (file out) simulation (file script, int timesteps, int sim_range, file bias_file, int scale, int sim_count)
-{
- sh @filename(script) timesteps sim_range @filename(bias_file) scale sim_count stdout=@filename(out);
-}
-
-app (file out) analyze (file script, file s[])
-{
- sh @script @filenames(s) stdout=@filename(out);
-}
-
-# Command line params to this script:
-
-int nsim = @toInt(@arg("nsim", "10")); # number of simulation programs to run
-int steps = @toInt(@arg("steps", "1")); # number of "steps" each simulation (==seconds of runtime)
-int range = @toInt(@arg("range", "100")); # range of the generated random numbers
-int count = @toInt(@arg("count", "10")); # number of random numbers generated per simulation
-
-# Main script and data
-
-tracef("\n*** Script parameters: nsim=%i steps=%i range=%i count=%i\n\n", nsim, steps, range, count);
-
-file sims[]; # Array of files to hold each simulation output
-file bias<"bias.dat">; # Input data file to "bias" the numbers:
- # 1 line: scale offset ( N = n*scale + offset)
-foreach i in [0:nsim-1] {
- file simout <single_file_mapper; file=@strcat("output/sim_",i,".out")>;
- simout = simulation(simulation_script, steps, range, bias, 100000, count);
- sims[i] = simout;
-}
-
-file stats<"output/stats.out">; # Final output file: average of all "simulations"
-stats = analyze(analysis_script,sims);
+include::../part06/apps[]
-----
-To run:
+Use the command below to specify the time for each simulation.
----
-$ cd ../part08
-$ swift p8.swift
+$ cd ../part06
+$ swift p6.swift -steps=3 # each simulation takes 3 seconds
----
-p9 - A more complex workflow pattern: multiple parallel pipelines
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-p9.swift adds another app function called genrand. Genrand will produce a random
-number that will be used to determine how long each simulation app will run.
-
-image:p9.png[]
-
-.p9.swift
------
-type file;
-
-# Applications to be called by this script
-
-file simulation_script <"simulate.sh">;
-file analysis_script <"stats.sh">;
-
-# app() functions for application programs to be called:
-
-app (file out) genrand (file script, int timesteps, int sim_range)
-{
- sh @filename(script) timesteps sim_range stdout=@filename(out);
-}
-
-app (file out) simulation (file script, int timesteps, int sim_range, file bias_file, int scale, int sim_count)
-{
- sh @filename(script) timesteps sim_range @filename(bias_file) scale sim_count stdout=@filename(out);
-}
-
-app (file out) analyze (file script, file s[])
-{
- sh @script @filenames(s) stdout=@filename(out);
-}
-
-# Command line params to this script:
-int nsim = @toInt(@arg("nsim", "10")); # number of simulation programs to run
-int range = @toInt(@arg("range", "100")); # range of the generated random numbers
-int count = @toInt(@arg("count", "10")); # number of random numbers generated per simulation
-
-# Main script and data
-
-tracef("\n*** Script parameters: nsim=%i range=%i count=%i\n\n", nsim, range, count);
-
-file bias<"dynamic_bias.dat">; # Dynamically generated bias for simulation ensemble
-
-bias = genrand(simulation_script, 1, 1000);
-
-file sims[]; # Array of files to hold each simulation output
-
-foreach i in [0:nsim-1] {
-
- int steps = readData(genrand(simulation_script, 1, 5));
- tracef(" for simulation[%i] steps=%i\n", i, steps+1);
-
- file simout <single_file_mapper; file=@strcat("output/sim_",i,".out")>;
- simout = simulation(simulation_script, steps+1, range, bias, 100000, count);
- sims[i] = simout;
-}
-
-file stats<"output/stats.out">; # Final output file: average of all "simulations"
-stats = analyze(analysis_script,sims);
------
-
-To run:
-----
-$ cd ../part09
-$ swift p9.swift
-----
-
-
-
-
-
-
Running Swift scripts on Cloud resources
----------------------------------------
Modified: SwiftTutorials/CIC_2013-08-09/doc/build_docs.sh
===================================================================
--- SwiftTutorials/CIC_2013-08-09/doc/build_docs.sh 2013-08-22 18:01:36 UTC (rev 6958)
+++ SwiftTutorials/CIC_2013-08-09/doc/build_docs.sh 2013-08-22 19:55:28 UTC (rev 6959)
@@ -10,5 +10,4 @@
popd >& /dev/null
fi
-asciidoc -a toc -a toplevels=2 -a stylesheet=$PWD/asciidoc.css -a max-width=800px -o cic-tutorial.html README
-
+asciidoc -a "src_tab=3 --line-number=' '" -a toc -a toplevels=2 -a stylesheet=$PWD/asciidoc.css -a max-width=800px -o cic-tutorial.html README
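Note on the build_docs.sh change above: the new first -a argument is quoted, so the shell hands asciidoc the whole string src_tab=3 --line-number=' ' as a single attribute definition. This appears to rely on the source-highlight filter expanding src_tab into its own command line, so that the extra --line-number option rides along with the tab setting; that reading is an assumption and is not stated in the log. The quoting part can be checked with a small sketch (count_args is a hypothetical helper, not part of build_docs.sh):

```shell
# Hypothetical helper: print how many arguments the shell passed in.
count_args() {
    echo "$#"
}

# Quoted: the attribute definition stays one argument after -a.
count_args -a "src_tab=3 --line-number=' '" -a toc

# Unquoted: the same text is split on unquoted spaces into separate words.
count_args -a src_tab=3 --line-number=' ' -a toc
```

If the quotes were dropped, asciidoc would see --line-number as an option of its own rather than as part of the src_tab attribute value.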