[Swift-commit] r6551 - SwiftTutorials/OHBM_2013-06-16

wilde at ci.uchicago.edu
Fri Jun 14 08:52:12 CDT 2013


Author: wilde
Date: 2013-06-14 08:52:11 -0500 (Fri, 14 Jun 2013)
New Revision: 6551

Added:
   SwiftTutorials/OHBM_2013-06-16/apps.beagle-scp
   SwiftTutorials/OHBM_2013-06-16/auth.defaults.example
   SwiftTutorials/OHBM_2013-06-16/setup.sh
   SwiftTutorials/OHBM_2013-06-16/swift.properties.ps
Modified:
   SwiftTutorials/OHBM_2013-06-16/README
   SwiftTutorials/OHBM_2013-06-16/TODO
   SwiftTutorials/OHBM_2013-06-16/genatlas.swift
   SwiftTutorials/OHBM_2013-06-16/sites.xml
   SwiftTutorials/OHBM_2013-06-16/swift.properties
Log:
Demo updates; includes provider staging config for midway-to-beagle.

Modified: SwiftTutorials/OHBM_2013-06-16/README
===================================================================
--- SwiftTutorials/OHBM_2013-06-16/README	2013-06-14 04:07:57 UTC (rev 6550)
+++ SwiftTutorials/OHBM_2013-06-16/README	2013-06-14 13:52:11 UTC (rev 6551)
@@ -1,7 +1,7 @@
 
 fMRI Data Processing demo for OHBM 2013
 
-Setup
+* Setup
 
   # Get the scripts from svn
 
@@ -11,37 +11,59 @@
   # Set default swift.properties in $HOME/.swift.  Points to "." for apps (tc) and sites.xml
 
   mkdir -p ~/.swift
-  cp ~/.swift/swift.properties ~/.swift/save.swift.properties # If needed, as needed.
-  cp swift.properties ~/.swift
+  cp ~/.swift/swift.properties ~/.swift/swift.properties.save # Backup yours, if needed
 
+  cp swift.properties    ~/.swift    # for ssh staging
+  cp swift.properties.ps ~/.swift    # for provider staging (see below)
+
+  # for scp staging, set your auth.defaults:
+
+  mv auth.defaults.example auth.defaults  # AND EDIT to set your login and ssh key
+  cp $HOME/.ssh/auth.defaults $HOME/.ssh/auth.defaults.save # Backup as needed
+  cp auth.defaults $HOME/.ssh/auth.defaults
+
   # Get swift
 
-  module load swift
+  module load swift  # or set PATH as below
 
-To generate test data directories:
+  # NOTE: current testing should use this Swift build (with the latest provider-staging fixes):
 
-  ./makedata data_100 100 # create 100 anatomical image volumes in directory data_100
+  PATH=/project/wilde/swift/src/0.94/cog/modules/swift/dist/swift-svn/bin:$PATH
 
-To run:
+  # Run setup: sets env var(s) and ensures java is loaded    # <== DON'T FORGET !!!
 
- # On localhost:
+  source setup.sh
 
- swift genatlas.swift             # processes data/ directory by default
- swift genatlas.swift -d=data_100 # process data_100/ directory
+* Generate test data directories (example):
 
- # With most parallel work on midway westmere parition:
+  ./makedata data_100 100   # creates 100 anatomical image volumes in directory data_100
 
- swift -tc.file apps.midway genatlas.swift
+The generated data consists of links to a single file, for ease of setup and demo.
 
- # Choices for -tc.file are:
 
- apps         # default, runs on localhost
- apps.beagle  # on beagle, 8 nodes
- apps.midway  # on midway westmere, 1 node
- apps.amazon  # on Amazon EC2 - needs start-coaster-service, see below
- apps.UC3     # submits to UC3, needs apps to be sent.
+* To run:
 
+  # On localhost:
 
+  swift genatlas.swift             # processes data/ directory by default
+  swift genatlas.swift -d=data_100 # process data_100/ directory
+
+  # With most parallel work on midway westmere partition:
+
+  swift -tc.file apps.midway genatlas.swift
+
+  # From midway to beagle using provider staging:
+
+  swift -config swift.properties.ps -tc.file apps.beagle genatlas.swift -d=data_100
+
+  # Choices for -tc.file are:
+
+  apps         # default, runs on localhost
+  apps.beagle  # on beagle, 8 nodes
+  apps.midway  # on midway westmere, 1 node
+  apps.amazon  # on Amazon EC2 - needs start-coaster-service, see below
+  apps.UC3     # submits to UC3, needs apps to be sent.
+
 The output files of the workflow are placed under output/,
 intermediate files under work/.
 
@@ -52,18 +74,10 @@
   http://www.ci.uchicago.edu/~wilde/atlas-y.png
   http://www.ci.uchicago.edu/~wilde/atlas-z.png
 
-Notes:
+* Notes:
 
  The "align" initial stage runs 3 AIR tools as a single shell
  script, to reduce staging between these steps.
 
-TODO
 
-  Show the workflow with the 3 AIR tools expanded as separate workflow
-  steps, and as a Swift procedure.
 
-  Run on other sites and Multisite
-
-  Add performance monitoring and plotting
-
-

Modified: SwiftTutorials/OHBM_2013-06-16/TODO
===================================================================
--- SwiftTutorials/OHBM_2013-06-16/TODO	2013-06-14 04:07:57 UTC (rev 6550)
+++ SwiftTutorials/OHBM_2013-06-16/TODO	2013-06-14 13:52:11 UTC (rev 6551)
@@ -2,4 +2,17 @@
 
 /scratch/midway/wilde/ds107/ds107/sub001/model/model001
 
+Show nested studies
 
+
+Show the workflow with the 3 AIR tools (align.sh) expanded as separate workflow
+steps, and as a Swift procedure.
+
+Run on other sites and Multisite
+
+Add performance monitoring and plotting
+
+
+---
+
+

Added: SwiftTutorials/OHBM_2013-06-16/apps.beagle-scp
===================================================================
--- SwiftTutorials/OHBM_2013-06-16/apps.beagle-scp	                        (rev 0)
+++ SwiftTutorials/OHBM_2013-06-16/apps.beagle-scp	2013-06-14 13:52:11 UTC (rev 6551)
@@ -0,0 +1,4 @@
+beagle-scp sh       /bin/sh null null env::AIR5=/lustre/beagle/wilde/software/AIR5.3.0
+localhost  softmean /home/wilde/software/AIR5.3.0/bin/softmean
+localhost  slicer   /project/wilde/software/fsl-5.0.4/fsl/bin/slicer null null env::FSLOUTPUTTYPE=NIFTI
+localhost  convert  convert
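The tc file added above maps each app name to a site and an executable path, with optional trailing `env::` settings. A minimal sketch of how such a line decomposes (field layout assumed from the file above; this is an illustration, not Swift's actual parser):

```python
# Parse a transformation-catalog (tc) line like those in apps.beagle-scp.
# Assumed layout: site  app-name  executable  [null null env::VAR=value ...]
def parse_tc_line(line):
    fields = line.split()
    entry = {"site": fields[0], "app": fields[1], "executable": fields[2]}
    entry["env"] = {}
    for f in fields[3:]:              # skip "null" placeholders
        if f.startswith("env::"):
            var, _, val = f[len("env::"):].partition("=")
            entry["env"][var] = val
    return entry

line = "beagle-scp sh /bin/sh null null env::AIR5=/lustre/beagle/wilde/software/AIR5.3.0"
print(parse_tc_line(line))
```

So the first tc line above runs `sh` on the beagle-scp site with AIR5 pointing at the Beagle install of the AIR tools.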

Added: SwiftTutorials/OHBM_2013-06-16/auth.defaults.example
===================================================================
--- SwiftTutorials/OHBM_2013-06-16/auth.defaults.example	                        (rev 0)
+++ SwiftTutorials/OHBM_2013-06-16/auth.defaults.example	2013-06-14 13:52:11 UTC (rev 6551)
@@ -0,0 +1,4 @@
+login.beagle.ci.uchicago.edu.type=key
+login.beagle.ci.uchicago.edu.username=wilde
+login.beagle.ci.uchicago.edu.key=/home/wilde/.ssh/id_rsa-swift
+login.beagle.ci.uchicago.edu.PROMPT_FOR_passphrase=this is the key
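auth.defaults keys are dotted per-host properties: `<host>.type`, `<host>.username`, `<host>.key`. A small helper sketch that emits such a stanza (hostname, username, and key path here are placeholders, not real accounts):

```python
# Sketch: build an auth.defaults stanza for SSH key authentication,
# following the key-naming scheme shown in auth.defaults.example.
def auth_entry(host, username, keyfile):
    return "\n".join([
        f"{host}.type=key",          # authenticate with an ssh key
        f"{host}.username={username}",
        f"{host}.key={keyfile}",     # private key used for the connection
    ])

print(auth_entry("login.beagle.ci.uchicago.edu", "alice", "/home/alice/.ssh/id_rsa-swift"))
```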

Modified: SwiftTutorials/OHBM_2013-06-16/genatlas.swift
===================================================================
--- SwiftTutorials/OHBM_2013-06-16/genatlas.swift	2013-06-14 04:07:57 UTC (rev 6550)
+++ SwiftTutorials/OHBM_2013-06-16/genatlas.swift	2013-06-14 13:52:11 UTC (rev 6551)
@@ -1,10 +1,14 @@
 type file;
 
+# import fMRIdefs
+
 type Volume {
   file header;
   file image;
 };
 
+# import AIRdefs
+
 app (Volume alignedVol) align (file script, Volume reference, Volume input)
 {
   sh @script @filename(reference.image) @filename(input.image) @filename(alignedVol.image);
@@ -25,6 +29,8 @@
   convert @i @o;
 }
 
+# Start code here
+
 string dataDir = @arg("d","data");
 file   alignScript<"align.sh">;
 

Added: SwiftTutorials/OHBM_2013-06-16/setup.sh
===================================================================
--- SwiftTutorials/OHBM_2013-06-16/setup.sh	                        (rev 0)
+++ SwiftTutorials/OHBM_2013-06-16/setup.sh	2013-06-14 13:52:11 UTC (rev 6551)
@@ -0,0 +1,7 @@
+export GLOBUS_HOSTNAME=swift.rcc.uchicago.edu
+module load java
+
+if [ "$(hostname)" != midway001 ]; then
+  echo "ERROR: this needs to run from swift.rcc.uchicago.edu"
+  return
+fi


Modified: SwiftTutorials/OHBM_2013-06-16/sites.xml
===================================================================
--- SwiftTutorials/OHBM_2013-06-16/sites.xml	2013-06-14 04:07:57 UTC (rev 6550)
+++ SwiftTutorials/OHBM_2013-06-16/sites.xml	2013-06-14 13:52:11 UTC (rev 6551)
@@ -45,13 +45,13 @@
     <profile namespace="globus" key="lowOverAllocation">100</profile>
     <profile namespace="globus" key="highOverAllocation">100</profile>
     <profile namespace="globus" key="providerAttributes">pbs.aprun;pbs.mpp;depth=24</profile>
-    <!-- to use a beage reservation, eg:
-         <profile namespace="globus" key="providerAttributes">pbs.aprun;pbs.mpp;depth=24;pbs.resource_list=advres=wilde.1768</profile>
+    <!-- to use a Beagle reservation, modify the tag above, e.g.:
+    <... key="providerAttributes">pbs.aprun;pbs.mpp;depth=24;pbs.resource_list=advres=wilde.1768</profile>
     -->
     <profile namespace="globus" key="maxtime">3600</profile>
     <profile namespace="globus" key="maxWalltime">00:05:00</profile>
     <profile namespace="globus" key="userHomeOverride">/lustre/beagle/{env.USER}/swiftwork</profile>
-    <profile namespace="globus" key="slots">20</profile>
+    <profile namespace="globus" key="slots">8</profile>
     <profile namespace="globus" key="maxnodes">1</profile>
     <profile namespace="globus" key="nodeGranularity">1</profile>
     <profile namespace="karajan" key="jobThrottle">4.80</profile>
@@ -59,4 +59,36 @@
     <workdirectory>/tmp/{env.USER}/swiftwork</workdirectory>
   </pool>
 
+  <pool handle="beagle-scp">
+    <execution provider="coaster" jobmanager="ssh-cl:pbs" url="login.beagle.ci.uchicago.edu"/>
+    <profile namespace="globus" key="jobsPerNode">24</profile>
+    <profile namespace="globus" key="lowOverAllocation">100</profile>
+    <profile namespace="globus" key="highOverAllocation">100</profile>
+    <profile namespace="globus" key="providerAttributes">pbs.aprun;pbs.mpp;depth=24</profile>
+    <!-- to use a Beagle reservation, modify the tag above, e.g.:
+    <... key="providerAttributes">pbs.aprun;pbs.mpp;depth=24;pbs.resource_list=advres=wilde.1768</profile>
+    -->
+    <profile namespace="globus" key="maxtime">3600</profile>
+    <profile namespace="globus" key="maxWalltime">00:05:00</profile>
+    <profile namespace="globus" key="userHomeOverride">/lustre/beagle/{env.USER}/swiftwork</profile>
+    <profile namespace="globus" key="slots">4</profile>
+    <profile namespace="globus" key="maxnodes">1</profile>
+    <profile namespace="globus" key="nodeGranularity">1</profile>
+    <profile namespace="karajan" key="jobThrottle">1.00</profile>
+    <profile namespace="karajan" key="initialScore">10000</profile>
+
+    <filesystem provider="ssh" url="login.beagle.ci.uchicago.edu"/>
+    <workdirectory>/lustre/beagle/{env.USER}/swiftwork</workdirectory>
+  </pool>
+
+  <pool handle="beagle-coast-temp">
+    <execution provider="coaster" jobmanager="ssh:local" url="login.beagle.ci.uchicago.edu"/>
+    <profile namespace="karajan" key="jobThrottle">0</profile>
+    <profile namespace="karajan" key="initialScore">10000</profile>
+
+    <filesystem provider="ssh" url="login.beagle.ci.uchicago.edu"/>
+    <workdirectory>/lustre/beagle/wilde/swiftwork</workdirectory>
+  </pool>
+
+
 </config>
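The beagle-scp pool above sets jobsPerNode=24, slots=4, maxnodes=1, and jobThrottle=1.00. A rough capacity estimate, assuming (as is conventional for coaster sites, though not stated in this config) that concurrent app tasks are bounded by slots x maxnodes x jobsPerNode and by the karajan throttle of roughly throttle*100 + 1 jobs:

```python
# Rough concurrency estimate for the beagle-scp pool (assumed conventions).
jobs_per_node = 24   # app tasks per worker node
slots = 4            # concurrent coaster blocks (PBS jobs)
maxnodes = 1         # nodes per block
job_throttle = 1.00  # karajan jobThrottle

node_cap = slots * maxnodes * jobs_per_node
throttle_cap = int(job_throttle * 100) + 1   # assumed throttle convention
print(min(node_cap, throttle_cap))           # 96
```

So on these settings the node allocation, not the throttle, is the binding limit.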

Modified: SwiftTutorials/OHBM_2013-06-16/swift.properties
===================================================================
--- SwiftTutorials/OHBM_2013-06-16/swift.properties	2013-06-14 04:07:57 UTC (rev 6550)
+++ SwiftTutorials/OHBM_2013-06-16/swift.properties	2013-06-14 13:52:11 UTC (rev 6551)
@@ -2,13 +2,94 @@
 sites.file=sites.xml
 tc.file=apps
 
+use.provider.staging=false
+provider.staging.pin.swiftfiles=true
+
+use.wrapper.staging=false
 status.mode=provider
-use.provider.staging=true
-use.wrapper.staging=false
 wrapperlog.always.transfer=true
-execution.retries=0
-lazy.errors=false
-provider.staging.pin.swiftfiles=true
+execution.retries=3
+lazy.errors=true
 sitedir.keep=true
 file.gc.enabled=false
 #tcp.port.range=50000,51000
+
+###########################################################################
+#                          Throttling options                             #
+###########################################################################
+#
+# For the throttling parameters, valid values are either a positive integer
+# or "off" (without the quotes).
+#
+
+#
+# Limits the number of concurrent submissions for a workflow instance. This
+# throttle only limits the number of concurrent tasks (jobs) that are being
+# sent to sites, not the total number of concurrent jobs that can be run.
+# The submission stage in GRAM is one of the most CPU expensive stages (due
+# mostly to the mutual authentication and delegation). Having too many
+# concurrent submissions can overload either or both the submit host CPU
+# and the remote host/head node causing degraded performance.
+#
+# Default: 4
+#
+
+throttle.submit=4
+#throttle.submit=off
+
+#
+# Limits the number of concurrent submissions for any of the sites Swift will
+# try to send jobs to. In other words it guarantees that no more than the
+# value of this throttle jobs sent to any site will be concurrently in a state
+# of being submitted.
+#
+# Default: 2
+#
+
+### throttle.host.submit=2
+#throttle.host.submit=off
+
+#
+# The Swift scheduler has the ability to limit the number of concurrent jobs
+# allowed on a site based on the performance history of that site. Each site
+# is assigned a score (initially 1), which can increase or decrease based
+# on whether the site yields successful or faulty job runs. The score for a
+# site can take values in the (0.1, 100) interval. The number of allowed jobs
+# is calculated using the following formula:
+# 	2 + score*throttle.score.job.factor
+# This means a site will always be allowed at least two concurrent jobs and
+# at most 2 + 100*throttle.score.job.factor. With a default of 4 this means
+# at least 2 jobs and at most 402.
+#
+# Default: 4
+#
+
+### throttle.score.job.factor=0.2
+#throttle.score.job.factor=off
+
+
+#
+# Limits the total number of concurrent file transfers that can happen at any
+# given time. File transfers consume bandwidth. Too many concurrent transfers
+# can cause the network to be overloaded preventing various other signalling
+# traffic from flowing properly.
+#
+# Default: 4
+#
+
+throttle.transfers=1
+#throttle.transfers=off
+
+# Limits the total number of concurrent file operations that can happen at any
+# given time. File operations (like transfers) require an exclusive connection
+# to a site. These connections can be expensive to establish. A large number
+# of concurrent file operations may cause Swift to attempt to establish many
+# such expensive connections to various sites. Limiting the number of concurrent
+# file operations causes Swift to use a small number of cached connections and
+# achieve better overall performance.
+#
+# Default: 8
+#
+
+throttle.file.operations=1
+#throttle.file.operations=off

Added: SwiftTutorials/OHBM_2013-06-16/swift.properties.ps
===================================================================
--- SwiftTutorials/OHBM_2013-06-16/swift.properties.ps	                        (rev 0)
+++ SwiftTutorials/OHBM_2013-06-16/swift.properties.ps	2013-06-14 13:52:11 UTC (rev 6551)
@@ -0,0 +1,95 @@
+
+sites.file=sites.xml
+tc.file=apps
+
+use.provider.staging=true
+provider.staging.pin.swiftfiles=true
+
+use.wrapper.staging=false
+status.mode=provider
+wrapperlog.always.transfer=true
+execution.retries=3
+lazy.errors=true
+sitedir.keep=true
+file.gc.enabled=false
+#tcp.port.range=50000,51000
+
+###########################################################################
+#                          Throttling options                             #
+###########################################################################
+#
+# For the throttling parameters, valid values are either a positive integer
+# or "off" (without the quotes).
+#
+
+#
+# Limits the number of concurrent submissions for a workflow instance. This
+# throttle only limits the number of concurrent tasks (jobs) that are being
+# sent to sites, not the total number of concurrent jobs that can be run.
+# The submission stage in GRAM is one of the most CPU expensive stages (due
+# mostly to the mutual authentication and delegation). Having too many
+# concurrent submissions can overload either or both the submit host CPU
+# and the remote host/head node, causing degraded performance.
+#
+# Default: 4
+#
+
+throttle.submit=4
+#throttle.submit=off
+
+#
+# Limits the number of concurrent submissions for any of the sites Swift will
+# try to send jobs to. In other words, it guarantees that no more than this
+# many jobs will be concurrently in the process of being submitted to any
+# single site.
+#
+# Default: 2
+#
+
+### throttle.host.submit=2
+#throttle.host.submit=off
+
+#
+# The Swift scheduler has the ability to limit the number of concurrent jobs
+# allowed on a site based on the performance history of that site. Each site
+# is assigned a score (initially 1), which can increase or decrease based
+# on whether the site yields successful or faulty job runs. The score for a
+# site can take values in the (0.1, 100) interval. The number of allowed jobs
+# is calculated using the following formula:
+# 	2 + score*throttle.score.job.factor
+# This means a site will always be allowed at least two concurrent jobs and
+# at most 2 + 100*throttle.score.job.factor. With a default of 4 this means
+# at least 2 jobs and at most 402.
+#
+# Default: 4
+#
+
+### throttle.score.job.factor=0.2
+#throttle.score.job.factor=off
+
+
+#
+# Limits the total number of concurrent file transfers that can happen at any
+# given time. File transfers consume bandwidth. Too many concurrent transfers
+# can cause the network to be overloaded preventing various other signalling
+# traffic from flowing properly.
+#
+# Default: 4
+#
+
+throttle.transfers=1
+#throttle.transfers=off
+
+# Limits the total number of concurrent file operations that can happen at any
+# given time. File operations (like transfers) require an exclusive connection
+# to a site. These connections can be expensive to establish. A large number
+# of concurrent file operations may cause Swift to attempt to establish many
+# such expensive connections to various sites. Limiting the number of concurrent
+# file operations causes Swift to use a small number of cached connections and
+# achieve better overall performance.
+#
+# Default: 8
+#
+
+throttle.file.operations=1
+#throttle.file.operations=off



