[Swift-commit] r5664 - trunk/docs/siteguide
wozniak at ci.uchicago.edu
Wed Feb 22 12:16:32 CST 2012
Author: wozniak
Date: 2012-02-22 12:16:32 -0600 (Wed, 22 Feb 2012)
New Revision: 5664
Added:
trunk/docs/siteguide/overview
Modified:
trunk/docs/siteguide/beagle
trunk/docs/siteguide/fusion
trunk/docs/siteguide/futuregrid
trunk/docs/siteguide/grid
trunk/docs/siteguide/intrepid
trunk/docs/siteguide/mcs
trunk/docs/siteguide/pads
trunk/docs/siteguide/siteguide.txt
Log:
Site Configuration Guide updates: System types in headers; extra BG/P notes
Modified: trunk/docs/siteguide/beagle
===================================================================
--- trunk/docs/siteguide/beagle 2012-02-22 15:33:40 UTC (rev 5663)
+++ trunk/docs/siteguide/beagle 2012-02-22 18:16:32 UTC (rev 5664)
@@ -1,15 +1,16 @@
-Beagle
-------
-Beagle is a Cray XE6 supercomputer at UChicago. It employs a batch-oriented
-computational model where-in a PBS schedular accepts user's jobs and queues
-them in the queueing system for execution. The computational model requires
-a user to prepare the submit files, track job submissions, chackpointing,
-managing input/output data and handling exceptional conditions manually.
-Running Swift under Beagle can accomplish the above tasks with least manual
-user intervention and maximal oppurtunistic computation time on Beagle
-queues. In the following sections, we discuss more about specifics of
-running Swift on Beagle. A more detailed information about Swift and its
-workings can be found on Swift documentation page here:
+Cray XE6: Beagle
+----------------
+
+Beagle is a Cray XE6 supercomputer at UChicago. It employs a batch-oriented
+computational model wherein a PBS scheduler accepts users' jobs and
+queues them for execution. This model requires users to prepare submit
+files, track job submissions, checkpoint, manage input/output data, and
+handle exceptional conditions manually. Running Swift on Beagle
+accomplishes these tasks with minimal manual intervention and maximal
+opportunistic computation time in the Beagle queues. The following
+sections discuss the specifics of running Swift on Beagle. More detailed
+information about Swift and its workings can be found on the Swift
+documentation page:
http://www.ci.uchicago.edu/swift/wwwdev/docs/index.php
More information on Beagle can be found on UChicago Beagle website here:
http://beagle.ci.uchicago.edu
@@ -18,7 +19,7 @@
~~~~~~~~~~~~~~~~~
If you do not already have a Computation Institute account, you can request
one at https://www.ci.uchicago.edu/accounts/. This page will give you a list
-of resources you can request access to.
+of resources you can request access to.
If you already have an existing CI account but do not have access to Beagle,
send an email to support at ci.uchicago.edu to request access.
@@ -138,7 +139,7 @@
* *maxTime* : The expected walltime for completion of your run. This parameter is accepted in seconds.
* *slots* : This parameter specifies the maximum number of pbs jobs/blocks that the coaster scheduler will have running at any given time. On Beagle, this number will determine how many qsubs swift will submit for your run. Typical values range between 40 and 60 for large runs.
- * *nodeGranularity* : Determines the number of nodes per job. It restricts the number of nodes in a job to a multiple of this value. The total number of workers will then be a multiple of jobsPerNode * nodeGranularity. For Beagle, jobsPerNode value is 24 corresponding to its 24 cores per node.
+ * *nodeGranularity* : Determines the number of nodes per job. It restricts the number of nodes in a job to a multiple of this value. The total number of workers will then be a multiple of jobsPerNode * nodeGranularity. For Beagle, jobsPerNode value is 24 corresponding to its 24 cores per node.
* *maxNodes* : Determines the maximum number of nodes a job must pack into its qsub. This parameter determines the largest single job that your run will submit.
 * *jobThrottle* : A factor that determines the number of tasks dispatched simultaneously. The intended number of simultaneous tasks must match the number of cores targeted. The number of tasks is calculated from the jobThrottle factor as follows:
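As a hedged sketch of that calculation (Swift's coaster scheduler conventionally allows jobThrottle × 100 + 1 concurrent tasks; treat the constant 100 as an assumption to verify against your Swift release):

```shell
# Hedged sketch: Swift's conventional throttle formula,
#   tasks = jobThrottle * 100 + 1
# The constant 100 is an assumption to verify against your Swift version.
jobThrottle=4.00
tasks=$(awk "BEGIN { print $jobThrottle * 100 + 1 }")
echo "$tasks"   # prints 401
```

With jobThrottle=4.00 this permits about 400 concurrent tasks, roughly 16 Beagle nodes at 24 cores per node.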
@@ -155,7 +156,7 @@
<profile namespace="globus" key="project">CI-CCR000013</profile>
<profile namespace="globus" key="ppn">24:cray:pack</profile>
-
+
<!-- For swift 0.93
<profile namespace="globus" key="ppn">pbs.aprun;pbs.mpp;depth=24</profile>
-->
@@ -196,7 +197,7 @@
-----
* Failed to transfer wrapperlog for job cat-nmobtbkk and/or Job failed with an exit code of 254. Check the <workdirectory> element on the sites.xml file.
-
+
-----
<workdirectory >/home/ketan/swift.workdir</workdirectory>
-----
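Before re-running after this error, a quick sanity check along these lines can help (a minimal sketch; substitute the path from your own <workdirectory> element):

```shell
# Hedged sketch: verify the sites.xml <workdirectory> path exists and is
# writable from the login node. The path below is an example, not a
# requirement -- use whatever your <workdirectory> element points at.
WORKDIR="$HOME/swift.workdir"
mkdir -p "$WORKDIR"
touch "$WORKDIR/.swift_write_test" && rm "$WORKDIR/.swift_write_test" \
  && echo "workdir OK"
```

On Beagle the work directory must also be visible to the compute nodes, so a shared filesystem location is typically required rather than node-local disk.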
Modified: trunk/docs/siteguide/fusion
===================================================================
--- trunk/docs/siteguide/fusion 2012-02-22 15:33:40 UTC (rev 5663)
+++ trunk/docs/siteguide/fusion 2012-02-22 18:16:32 UTC (rev 5664)
@@ -1,37 +1,38 @@
-Fusion
-------
-Fusion is a 320-node computing cluster for the Argonne
-National Laboratory community. The primary goal of the LCRC is to
-facilitate mid-range computing in all of the scientific programs of
+x86 Cluster: Fusion
+-------------------
+
+Fusion is a 320-node computing cluster for the Argonne National Laboratory
+community, operated by the Laboratory Computing Resource Center (LCRC). The
+LCRC's primary goal is to facilitate mid-range computing in all of the scientific programs of
Argonne and the University of Chicago.
This section will walk you through running a simple Swift script
-on Fusion.
+on Fusion.
Requesting Access
~~~~~~~~~~~~~~~~~
-If you do not already have a Fusion account, you can request one at
+If you do not already have a Fusion account, you can request one at
https://accounts.lcrc.anl.gov/request.php. Email support at lcrc.anl.gov
for additional help.
Projects
~~~~~~~~
-In order to run a job on a Fusion compute node, you must first be associated
+In order to run a job on a Fusion compute node, you must first be associated
with a project.
Each project has one or more Primary Investigators, or PIs. These PIs are
responsible for adding and removing users to a project. Contact the PI of
your project to be added.
-More information on this process can be found at
+More information on this process can be found at
http://www.lcrc.anl.gov/info/Projects.
SSH Keys
~~~~~~~~
Before accessing Fusion, be sure to have your SSH keys configured correctly.
-SSH keys are required to access fusion. You should see information about
+SSH keys are required to access fusion. You should see information about
this when you request your account. Email support at lcrc.anl.gov for
-additional help.
+additional help.
Connecting to a login node
~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -48,7 +49,7 @@
run. This section will provide a working configuration file which
you can copy and paste to get running quickly. The sites.xml file
tells Swift how to submit jobs, where working directories are
-located, and various other configuration information. More
+located, and various other configuration information. More
information on sites.xml can be found in the Swift User's Guide.
The first step is to paste the text below into a file named sites.xml.
@@ -57,7 +58,7 @@
include::../../tests/providers/fusion/coasters/sites.template.xml[]
-----
-This file will require one customization. Create a
+This file will require one customization. Create a
directory called swiftwork. Modify \_WORK_ in sites.xml
to point to this new directory. For example
-----
@@ -97,15 +98,15 @@
-----
You should see 10 new text files get created, named catsn*.out. If
-you see these files, then you have succesfully run Swift on Fusion!
+you see these files, then you have successfully run Swift on Fusion!
Queues
~~~~~~
-Fusion has two queues: shared and batch. The shared queue has a maximum 1
-hour walltime and limited to 4 nodes. The batch queue is for all other
+Fusion has two queues: shared and batch. The shared queue has a maximum
+walltime of 1 hour and is limited to 4 nodes. The batch queue is for all other
jobs.
-Edit your sites.xml file and edit the queue option to modify Swift's
+Edit your sites.xml file and edit the queue option to modify Swift's
behavior. For example:
-----
Modified: trunk/docs/siteguide/futuregrid
===================================================================
--- trunk/docs/siteguide/futuregrid 2012-02-22 15:33:40 UTC (rev 5663)
+++ trunk/docs/siteguide/futuregrid 2012-02-22 18:16:32 UTC (rev 5664)
@@ -1,7 +1,8 @@
-Futuregrid Quickstart Guide
----------------------------
-FutureGrid is a distributed, high-performance test-bed that allows
-scientists to collaboratively develop and test innovative approaches
+x86 Cloud: FutureGrid Quickstart Guide
+--------------------------------------
+
+FutureGrid is a distributed, high-performance test-bed that allows
+scientists to collaboratively develop and test innovative approaches
to parallel, grid, and cloud computing.
More information on FutureGrid can be found at https://portal.futuregrid.org/.
@@ -16,7 +17,7 @@
Downloading Swift VM Tools
~~~~~~~~~~~~~~~~~~~~~~~~~~
A set of scripts based around cloudinitd are used to easily start virtual
-machines. To download, change to your home directory and run the
+machines. To download, change to your home directory and run the
following command:
-----
@@ -38,9 +39,9 @@
Configuring coaster-service.conf
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-To run on futuregrid, you will need a file called coaster-service.conf.
+To run on FutureGrid, you will need a file called coaster-service.conf.
This file contains many options to control how things run. Here is
-an example of a working coaster-service.conf on futuregrid.
+an example of a working coaster-service.conf on FutureGrid.
-----
# Where to copy worker.pl on the remote machine for sites.xml
@@ -95,7 +96,7 @@
This command will start the VMs, start the required processes on the worker nodes,
and generate Swift configuration files for you to use. The configuration files
-will be generated in your current directory. These files are sites.xml, tc.data,
+will be generated in your current directory. These files are sites.xml, tc.data,
and cf.
Running Swift
@@ -108,10 +109,10 @@
If you would like to create a custom tc file for repeated use, rename it to something other
than tc.data to prevent it from being overwritten. The sites.xml however will need to be
-regenerated every time you start the coaster service. If you need to repeatedly modify some
+regenerated every time you start the coaster service. If you need to repeatedly modify some
sites.xml options, you may edit the template in Swift's etc/sites/persistent-coasters. You
may also create your own custom tc files with the hostname of persistent-coasters. More
-information about this can be found in the Swift userguide at
+information about this can be found in the Swift userguide at
http://www.ci.uchicago.edu/swift/guides/trunk/userguide/userguide.html.
Stopping the Coaster Service Script
@@ -127,6 +128,6 @@
More Help
~~~~~~~~~
The best place for additional help is the Swift user mailing list. You can subscribe to this list at
-http://mail.ci.uchicago.edu/mailman/listinfo/swift-user. When submitting information, please send
+http://mail.ci.uchicago.edu/mailman/listinfo/swift-user. When submitting information, please send
your sites.xml file, your tc.data, and any error messages you run into.
Modified: trunk/docs/siteguide/grid
===================================================================
--- trunk/docs/siteguide/grid 2012-02-22 15:33:40 UTC (rev 5663)
+++ trunk/docs/siteguide/grid 2012-02-22 18:16:32 UTC (rev 5664)
@@ -1,5 +1,5 @@
-Grids, including OSG and TeraGrid
----------------------------------
+Grids: Open Science Grid and TeraGrid
+-------------------------------------
Overview of running on grid sites
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -54,7 +54,7 @@
*Step2.* When you receive your certificate via a link by mail, download and
install it in your browser; we have tested it for firefox on linux and mac.,
-and for Chrome on mac.
+and for Chrome on mac.
On Firefox, as you click the link that you received in the mail, you will be
prompted by Firefox to install it: passphrase it and click install. Next take a
@@ -77,7 +77,7 @@
.pem files for the key and cert. For this conversion, use the backed-up .p12 file as follows:
----
-$ openssl pkcs12 -in your.p12 -out usercert.pem -nodes -clcerts -nokeys
+$ openssl pkcs12 -in your.p12 -out usercert.pem -nodes -clcerts -nokeys
$ openssl pkcs12 -in your.p12 -out userkey.pem -nodes -nocerts
----
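Grid tools are typically strict about key permissions, so a follow-up along these lines is usually needed (a hedged sketch; ~/.globus as the destination is the common Globus convention, not something this guide mandates):

```shell
# Hedged sketch: move the converted PEM pair into place and lock down the
# private key. ~/.globus is the conventional location assumed by
# Globus-based tools -- verify against your site's instructions.
mkdir -p "$HOME/.globus"
cp usercert.pem userkey.pem "$HOME/.globus/"
chmod 644 "$HOME/.globus/usercert.pem"
chmod 400 "$HOME/.globus/userkey.pem"   # proxy tools refuse readable keys
```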
@@ -143,7 +143,7 @@
source /opt/osg-<version>/setup.csh
-----
-NOTE: This above step is not required on engage-submit3 host.
+NOTE: The above step is not required on the engage-submit3 host.
Create a VOMS Grid proxy
~~~~~~~~~~~~~~~~~~~~~~~~
@@ -180,7 +180,7 @@
-----
$ ./foreachsite -help
./foreachsite [-resource fork|worker ] [-sites alt-sites-file] scriptname
-$
+$
-----
To install your software, create a script similar to "myapp.sh",
@@ -307,7 +307,7 @@
-----
start-ranger-service --nodes 1 --walltime 00:10:00 --project TG-DBS123456N \
--queue development --user tg12345 --startservice no \
- >& start-ranger-service.out
+ >& start-ranger-service.out
-----
NOTE: Change the project and user names to match your TeraGrid
Modified: trunk/docs/siteguide/intrepid
===================================================================
--- trunk/docs/siteguide/intrepid 2012-02-22 15:33:40 UTC (rev 5663)
+++ trunk/docs/siteguide/intrepid 2012-02-22 18:16:32 UTC (rev 5664)
@@ -1,21 +1,23 @@
-Intrepid
---------
-Intrepid is an IBM Blue Gene/P supercomputer located at the Argonne Leadership
-Computing Facility. More information on Intrepid can be found at
-http://www.alcf.anl.gov/.
+Blue Gene/P: Intrepid
+---------------------
+Intrepid is an IBM Blue Gene/P supercomputer located at the Argonne
+Leadership Computing Facility. More information on Intrepid can be
+found at http://www.alcf.anl.gov. Surveyor and Challenger are
+similar, smaller machines.
+
Requesting Access
~~~~~~~~~~~~~~~~~
If you do not already have an account on Intrepid, you can request
-one at https://accounts.alcf.anl.gov/accounts/request.php. More information about
-this process and requesting allocations for your project can be found at
-http://www.alcf.anl.gov/support/gettingstarted/index.php.
+one at https://accounts.alcf.anl.gov/accounts/request.php. More information about
+this process and requesting allocations for your project can be found at
+http://www.alcf.anl.gov/support/gettingstarted/index.php.
SSH Keys
~~~~~~~~
-Accessing the Intrepid via SSH can be done with any SSH software package.
-Before logging in, you will need to generate an SSH public key and send it to
-support at alcf.anl.gov for verification and installation.
+Intrepid can be accessed via SSH with any SSH software package.
+Before logging in, you will need to generate an SSH public key and send it to
+support at alcf.anl.gov for verification and installation.
Cryptocard
~~~~~~~~~~
@@ -39,14 +41,14 @@
Downloading and building Swift
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-The most recent versions of Swift can be found at
+The most recent versions of Swift can be found at
http://www.ci.uchicago.edu/swift/downloads/index.php. Follow the instructions
provided on that site to download and build Swift.
Adding Swift to your PATH
~~~~~~~~~~~~~~~~~~~~~~~~~
Once you have installed Swift, add the Swift binary to your PATH so you can
-easily run it from any directory.
+easily run it from any directory.
In your home directory, edit the file ".bashrc".
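The lines to add can be sketched as follows (the install prefix ~/swift-0.93 is an assumption; use wherever you built Swift):

```shell
# Hedged sketch of the ~/.bashrc addition; adjust SWIFT_HOME to match
# your actual Swift install prefix.
SWIFT_HOME="$HOME/swift-0.93"
export PATH="$SWIFT_HOME/bin:$PATH"
```

After editing, run `source ~/.bashrc` (or log in again) and confirm the setup with `which swift`.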
@@ -92,7 +94,7 @@
-----
If you are not a member of a project, you must first request access
-to a project. More information on this process can be found at
+to a project. More information on this process can be found at
https://wiki.alcf.anl.gov/index.php/Discretionary_Allocations
Determine your Queue
@@ -117,22 +119,40 @@
Generating Configuration Files
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-Now that you know what queue to use, your project, and your work directory, it is time to
-set up Swift. Swift uses a configuration file called sites.xml to determine how it should run.
-There are two methods you can use for creating this file. You can manually edit
-the configuration file, or generate it with a utility called gensites.
+Now that you know what queue to use, your project, and your work
+directory, it is time to set up Swift. Swift uses a configuration file
+called sites.xml to determine how it should run. There are two
+methods you can use for creating this file. You can manually edit the
+configuration file, or generate it with a utility called +gensites+.
+
Manually Editing sites.xml
^^^^^^^^^^^^^^^^^^^^^^^^^^
-Below is the template that is used by Swift's test suite for running on Intrepid.
-TODO: Update the rest below here
+Below is the template that is used by Swift's test suite for running
+on Intrepid.
+
-----
include::../../tests/providers/intrepid/sites.template.xml[]
-----
-The values to note here are the ones that are listed between underscores. In the example above, they are \_QUEUE_, and \_WORK_. Queue is the PADS queue to use and WORK is the swift work directory. These are placeholder values you will need to modify to fit your needs. Copy and paste this template, replace the values, and call it sites.xml.
+Copy and paste this template, replace the values, and call it
++sites.xml+.
+The values to note here are the ones that are listed between
+underscores. In the example above, they are +\_HOST_+, +\_PROJECT_+,
++\_QUEUE_+, and +\_WORK_+.
+
++HOST+:: The IP address on which Swift runs and to which workers must
+connect. To obtain this, run +ifconfig+ and select the IP address
+that starts with +172+.
+
++PROJECT+:: The project to use.
+
++QUEUE+:: The queue to use.
+
++WORK+:: The Swift work directory.
+
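For the +HOST+ value, the lookup described above can be sketched like this (ifconfig output formats vary across systems, so treat the pattern as an assumption to adapt):

```shell
# Hedged sketch: pull the first 172.* address out of ifconfig for _HOST_.
# The 172 prefix follows the guide's advice; interface naming and output
# layout vary, so adjust the pattern for your login node.
ifconfig | grep -o '172\.[0-9.]*' | head -n 1
```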
Manually Editing tc.data
~~~~~~~~~~~~~~~~~~~~~~~~
Below is the tc.data file used by Swift's test suite for running on Intrepid.
Modified: trunk/docs/siteguide/mcs
===================================================================
--- trunk/docs/siteguide/mcs 2012-02-22 15:33:40 UTC (rev 5663)
+++ trunk/docs/siteguide/mcs 2012-02-22 18:16:32 UTC (rev 5664)
@@ -1,11 +1,12 @@
-MCS Workstations
-----------------
-This sections describes how to use the general use compute servers for
+x86 Workstations: MCS Compute Servers
+-------------------------------------
+
+This section describes how to use the general-use compute servers for
the MCS division of Argonne National Laboratory.
Create a coaster-service.conf
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-To begin, copy the text below and paste it into your Swift distribution's etc
+To begin, copy the text below and paste it into your Swift distribution's etc
directory. Name the file coaster-service.conf.
-----
@@ -15,7 +16,7 @@
Starting the Coaster Service
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-Change directories to the location you would like to run a
+Change directories to the location you would like to run a
Swift script and start the coaster service with this
command:
@@ -27,7 +28,7 @@
called sites.xml.
WARNING: Any existing sites.xml files in this directory
-will be overwritten. Be sure to make a copy of any
+will be overwritten. Be sure to make a copy of any
custom configuration files you may have.
Run Swift
Added: trunk/docs/siteguide/overview
===================================================================
--- trunk/docs/siteguide/overview (rev 0)
+++ trunk/docs/siteguide/overview 2012-02-22 18:16:32 UTC (rev 5664)
@@ -0,0 +1,8 @@
+Overview
+--------
+
+This guide explains what is required to run Swift on various system
+types, with details for specific installations on which Swift is
+currently used. For a given system type, most instructions should
+work on any machine of that type; however, details such as queue
+names or file system locations will have to be customized by the user.
Modified: trunk/docs/siteguide/pads
===================================================================
--- trunk/docs/siteguide/pads 2012-02-22 15:33:40 UTC (rev 5663)
+++ trunk/docs/siteguide/pads 2012-02-22 18:16:32 UTC (rev 5664)
@@ -1,7 +1,8 @@
-PADS
-----
-PADS is a petabyte-scale, data intense computing resource located
-at the joint Argonne National Laboratory/University of Chicago
+x86 Cluster: PADS
+-----------------
+
+PADS is a petabyte-scale, data-intensive computing resource located
+at the joint Argonne National Laboratory/University of Chicago
Computation Institute. More information about PADS can be found
at http://pads.ci.uchicago.edu.
@@ -33,7 +34,7 @@
Adding Software Packages
~~~~~~~~~~~~~~~~~~~~~~~~
Softenv is a system used for managing applications. In order to run Swift,
-the softenv environment will have to be modified slightly. Softenv is
+the softenv environment will have to be modified slightly. Softenv is
configured by a file in your home directory called .soft. Edit this file
to look like this:
-----
@@ -82,7 +83,7 @@
run. This section will provide a working configuration file which
you can copy and paste to get running quickly. The sites.xml file
tells Swift how to submit jobs, where working directories are
-located, and various other configuration information. More
+located, and various other configuration information. More
information on sites.xml can be found in the Swift User's Guide.
The first step is to paste the text below into a file named sites.xml.
@@ -91,7 +92,7 @@
include::../../tests/providers/pads/coasters/sites.template.xml[]
-----
-This file will require just a few customizations. First, create a
+This file will require just a few customizations. First, create a
directory called swiftwork. Modify \_WORK_ in sites.xml
to point to this new directory. For example
-----
@@ -121,22 +122,27 @@
$ cp ~/swift-0.93/examples/misc/catsn.swift .
$ cp ~/swift-0.93/examples/misc/data.txt .
-----
-TIP: The location of your swift directory may vary depending on how you installed it. Change this to the examples/misc directory of your installation as needed.
+TIP: The location of your swift directory may vary depending on how
+you installed it. Change this to the examples/misc directory of your
+installation as needed.
+
Run Swift
^^^^^^^^^
-Finally, run the script
+Finally, run the script:
+
-----
-$ swift -sites.file sites.xml -tc.file tc.data catsn.swift
+ swift -sites.file sites.xml -tc.file tc.data catsn.swift
-----
You should see several new files being created, called catsn.0001.out, catsn.0002.out, etc. Each of these
files should contain the contents of what you placed into data.txt. If this happens, your job has run
successfully on PADS!
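One way to confirm that from the shell (a minimal sketch; it assumes the run left the catsn.*.out files next to data.txt in the current directory):

```shell
# Hedged sketch: each catsn.*.out should be byte-identical to data.txt.
for f in catsn.*.out; do
  cmp -s "$f" data.txt && echo "$f OK" || echo "$f MISMATCH"
done
```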
-TIP: Make sure your default project is defined. Read on for more information.
+TIP: Make sure your default project is defined. Read on for more
+information.
Read on for more detailed information about running Swift on PADS.
@@ -145,7 +151,7 @@
^^^^^^
As you run more applications in the future, you will likely need
-to change queues.
+to change queues.
PADS has several different queues you can submit jobs to depending on
the type of work you will be doing. The command "qstat -q" will print
Modified: trunk/docs/siteguide/siteguide.txt
===================================================================
--- trunk/docs/siteguide/siteguide.txt 2012-02-22 15:33:40 UTC (rev 5663)
+++ trunk/docs/siteguide/siteguide.txt 2012-02-22 18:16:32 UTC (rev 5664)
@@ -6,6 +6,8 @@
:website: http://www.ci.uchicago.edu/swift/guides/siteguide.php
:numbered:
+include::overview[]
+
include::prereqs[]
include::pads[]