[Swift-commit] r6551 - SwiftTutorials/OHBM_2013-06-16
wilde at ci.uchicago.edu
Fri Jun 14 08:52:12 CDT 2013
Author: wilde
Date: 2013-06-14 08:52:11 -0500 (Fri, 14 Jun 2013)
New Revision: 6551
Added:
SwiftTutorials/OHBM_2013-06-16/apps.beagle-scp
SwiftTutorials/OHBM_2013-06-16/auth.defaults.example
SwiftTutorials/OHBM_2013-06-16/setup.sh
SwiftTutorials/OHBM_2013-06-16/swift.properties.ps
Modified:
SwiftTutorials/OHBM_2013-06-16/README
SwiftTutorials/OHBM_2013-06-16/TODO
SwiftTutorials/OHBM_2013-06-16/genatlas.swift
SwiftTutorials/OHBM_2013-06-16/sites.xml
SwiftTutorials/OHBM_2013-06-16/swift.properties
Log:
Demo updates; includes provider staging config for midway-to-beagle.
Modified: SwiftTutorials/OHBM_2013-06-16/README
===================================================================
--- SwiftTutorials/OHBM_2013-06-16/README 2013-06-14 04:07:57 UTC (rev 6550)
+++ SwiftTutorials/OHBM_2013-06-16/README 2013-06-14 13:52:11 UTC (rev 6551)
@@ -1,7 +1,7 @@
fMRI Data Processing demo for OHBM 2013
-Setup
+* Setup
# Get the scripts from svn
@@ -11,37 +11,59 @@
# Set default swift.properties in $HOME/.swift. Points to "." for apps (tc) and sites.xml
mkdir -p ~/.swift
- cp ~/.swift/swift.properties ~/.swift/save.swift.properties # If needed, as needed.
- cp swift.properties ~/.swift
+ cp ~/.swift/swift.properties ~/.swift/swift.properties.save # Backup yours, if needed
+ cp swift.properties ~/.swift # for ssh staging
+ cp swift.properties.ps ~/.swift # for provider staging (see below)
+
+ # for scp staging, set your auth.defaults:
+
+ mv auth.defaults.example auth.defaults # AND EDIT to set your login and ssh key
+ cp $HOME/.ssh/auth.defaults $HOME/.ssh/auth.defaults.save # Backup as needed
+ cp auth.defaults $HOME/.ssh/auth.defaults
+
# Get swift
- module load swift
+ module load swift # or set PATH as below
-To generate test data directories:
+ # NOTE: CURRENT TESTING SHOULD USE THIS SWIFT: (with the latest provider-staging fixes)
- ./makedata data_100 100 # create 100 anatomical image volumes in directory data_100
+ PATH=/project/wilde/swift/src/0.94/cog/modules/swift/dist/swift-svn/bin:$PATH
-To run:
- # Run setup: sets env var(s) and ensures java is loaded # <== DON'T FORGET !!!
- # On localhost:
+ source setup.sh
- swift genatlas.swift # processes data/ directory by default
- swift genatlas.swift -d=data_100 # process data_100/ directory
+* Generate test data directories (example):
- # With most parallel work on midway westmere parition:
+ ./makedata data_100 100 # creates 100 anatomical image volumes in directory data_100
- swift -tc.file apps.midway genatlas.swift
+The generated data consists of links to a single file, for ease of setup and demo.
- # Choices for -tc.file are:
- apps # default, runs on localhost
- apps.beagle # on beagle, 8 nodes
- apps.midway # on midway westmere, 1 node
- apps.amazon # on Amazon EC2 - needs start-coaster-service, see below
- apps.UC3 # submits to UC3, needs apps to be sent.
+* To run:
+ # On localhost:
+ swift genatlas.swift # processes data/ directory by default
+ swift genatlas.swift -d=data_100 # process data_100/ directory
+
+ # With most parallel work on the midway westmere partition:
+
+ swift -tc.file apps.midway genatlas.swift
+
+ # From midway to beagle using provider staging:
+
+ swift -config swift.properties.ps -tc.file apps.beagle genatlas.swift -d=data_100
+
+ # Choices for -tc.file are:
+
+ apps # default, runs on localhost
+ apps.beagle # on beagle, 8 nodes
+ apps.midway # on midway westmere, 1 node
+ apps.amazon # on Amazon EC2 - needs start-coaster-service, see below
+ apps.UC3 # submits to UC3, needs apps to be sent.
+
The output files of the workflow are placed under output/,
intermediate files under work/.
@@ -52,18 +74,10 @@
http://www.ci.uchicago.edu/~wilde/atlas-y.png
http://www.ci.uchicago.edu/~wilde/atlas-z.png
-Notes:
+* Notes:
The "align" initial stage runs 3 AIR tools as a single shell
script, to reduce staging between these steps.
-TODO
- Show the workflow with the 3 AIR tools expanded as separate workflow
- steps, and as a Swift procedure.
- Run on other sites and Multisite
-
- Add performance monitoring and plotting
-
-
Modified: SwiftTutorials/OHBM_2013-06-16/TODO
===================================================================
--- SwiftTutorials/OHBM_2013-06-16/TODO 2013-06-14 04:07:57 UTC (rev 6550)
+++ SwiftTutorials/OHBM_2013-06-16/TODO 2013-06-14 13:52:11 UTC (rev 6551)
@@ -2,4 +2,17 @@
/scratch/midway/wilde/ds107/ds107/sub001/model/model001
+Show nested studies
+
+Show the workflow with the 3 AIR tools (align.sh) expanded as separate workflow
+steps, and as a Swift procedure.
+
+Run on other sites and Multisite
+
+Add performance monitoring and plotting
+
+
+---
+
+
Added: SwiftTutorials/OHBM_2013-06-16/apps.beagle-scp
===================================================================
--- SwiftTutorials/OHBM_2013-06-16/apps.beagle-scp (rev 0)
+++ SwiftTutorials/OHBM_2013-06-16/apps.beagle-scp 2013-06-14 13:52:11 UTC (rev 6551)
@@ -0,0 +1,4 @@
+beagle-scp sh /bin/sh null null env::AIR5=/lustre/beagle/wilde/software/AIR5.3.0
+localhost softmean /home/wilde/software/AIR5.3.0/bin/softmean
+localhost slicer /project/wilde/software/fsl-5.0.4/fsl/bin/slicer null null env::FSLOUTPUTTYPE=NIFTI
+localhost convert convert
Added: SwiftTutorials/OHBM_2013-06-16/auth.defaults.example
===================================================================
--- SwiftTutorials/OHBM_2013-06-16/auth.defaults.example (rev 0)
+++ SwiftTutorials/OHBM_2013-06-16/auth.defaults.example 2013-06-14 13:52:11 UTC (rev 6551)
@@ -0,0 +1,4 @@
+login.beagle.ci.uchicago.edu.type=key
+login.beagle.ci.uchicago.edu.username=wilde
+login.beagle.ci.uchicago.edu.key=/home/wilde/.ssh/id_rsa-swift
+login.beagle.ci.uchicago.edu.PROMPT_FOR_passphrase=this is the key
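A dedicated keypair like the `id_rsa-swift` referenced in the example above can be created with standard OpenSSH tooling. This is an illustrative sketch, not part of the commit; it writes into a scratch directory, whereas in practice you would use `~/.ssh/id_rsa-swift` and the passphrase configured in auth.defaults:

```shell
# Sketch: generate a dedicated RSA keypair for Swift's ssh/scp staging.
# Writes into a scratch dir here; in practice use ~/.ssh/id_rsa-swift.
keydir=$(mktemp -d)
ssh-keygen -q -t rsa -b 2048 -f "$keydir/id_rsa-swift" -N "this is the key"
echo "private: $keydir/id_rsa-swift"
echo "public:  $keydir/id_rsa-swift.pub"
```

The public half must also be installed on the Beagle login node (e.g. with `ssh-copy-id -i ... wilde@login.beagle.ci.uchicago.edu`) before scp staging can authenticate.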
Modified: SwiftTutorials/OHBM_2013-06-16/genatlas.swift
===================================================================
--- SwiftTutorials/OHBM_2013-06-16/genatlas.swift 2013-06-14 04:07:57 UTC (rev 6550)
+++ SwiftTutorials/OHBM_2013-06-16/genatlas.swift 2013-06-14 13:52:11 UTC (rev 6551)
@@ -1,10 +1,14 @@
type file;
+# import fMRIdefs
+
type Volume {
file header;
file image;
};
+# import AIRdefs
+
app (Volume alignedVol) align (file script, Volume reference, Volume input)
{
sh @script @filename(reference.image) @filename(input.image) @filename(alignedVol.image);
@@ -25,6 +29,8 @@
convert @i @o;
}
+# Start code here
+
string dataDir = @arg("d","data");
file alignScript<"align.sh">;
Added: SwiftTutorials/OHBM_2013-06-16/setup.sh
===================================================================
--- SwiftTutorials/OHBM_2013-06-16/setup.sh (rev 0)
+++ SwiftTutorials/OHBM_2013-06-16/setup.sh 2013-06-14 13:52:11 UTC (rev 6551)
@@ -0,0 +1,7 @@
+export GLOBUS_HOSTNAME=swift.rcc.uchicago.edu
+module load java
+
+if [ $(hostname) != midway001 ]; then
+ echo ERROR: this needs to run from swift.rcc.uchicago.edu
+ return
+fi
Modified: SwiftTutorials/OHBM_2013-06-16/sites.xml
===================================================================
--- SwiftTutorials/OHBM_2013-06-16/sites.xml 2013-06-14 04:07:57 UTC (rev 6550)
+++ SwiftTutorials/OHBM_2013-06-16/sites.xml 2013-06-14 13:52:11 UTC (rev 6551)
@@ -45,13 +45,13 @@
<profile namespace="globus" key="lowOverAllocation">100</profile>
<profile namespace="globus" key="highOverAllocation">100</profile>
<profile namespace="globus" key="providerAttributes">pbs.aprun;pbs.mpp;depth=24</profile>
- <!-- to use a beage reservation, eg:
- <profile namespace="globus" key="providerAttributes">pbs.aprun;pbs.mpp;depth=24;pbs.resource_list=advres=wilde.1768</profile>
+ <!-- to use a beagle reservation, modify above tag, eg:
+ <... key="providerAttributes">pbs.aprun;pbs.mpp;depth=24;pbs.resource_list=advres=wilde.1768</profile>
-->
<profile namespace="globus" key="maxtime">3600</profile>
<profile namespace="globus" key="maxWalltime">00:05:00</profile>
<profile namespace="globus" key="userHomeOverride">/lustre/beagle/{env.USER}/swiftwork</profile>
- <profile namespace="globus" key="slots">20</profile>
+ <profile namespace="globus" key="slots">8</profile>
<profile namespace="globus" key="maxnodes">1</profile>
<profile namespace="globus" key="nodeGranularity">1</profile>
<profile namespace="karajan" key="jobThrottle">4.80</profile>
@@ -59,4 +59,36 @@
<workdirectory>/tmp/{env.USER}/swiftwork</workdirectory>
</pool>
+ <pool handle="beagle-scp">
+ <execution provider="coaster" jobmanager="ssh-cl:pbs" url="login.beagle.ci.uchicago.edu"/>
+ <profile namespace="globus" key="jobsPerNode">24</profile>
+ <profile namespace="globus" key="lowOverAllocation">100</profile>
+ <profile namespace="globus" key="highOverAllocation">100</profile>
+ <profile namespace="globus" key="providerAttributes">pbs.aprun;pbs.mpp;depth=24</profile>
+ <!-- to use a beagle reservation, modify above tag, eg:
+ <... key="providerAttributes">pbs.aprun;pbs.mpp;depth=24;pbs.resource_list=advres=wilde.1768</profile>
+ -->
+ <profile namespace="globus" key="maxtime">3600</profile>
+ <profile namespace="globus" key="maxWalltime">00:05:00</profile>
+ <profile namespace="globus" key="userHomeOverride">/lustre/beagle/{env.USER}/swiftwork</profile>
+ <profile namespace="globus" key="slots">4</profile>
+ <profile namespace="globus" key="maxnodes">1</profile>
+ <profile namespace="globus" key="nodeGranularity">1</profile>
+ <profile namespace="karajan" key="jobThrottle">1.00</profile>
+ <profile namespace="karajan" key="initialScore">10000</profile>
+
+ <filesystem provider="ssh" url="login.beagle.ci.uchicago.edu"/>
+ <workdirectory>/lustre/beagle/{env.USER}/swiftwork</workdirectory>
+ </pool>
+
+ <pool handle="beagle-coast-temp">
+ <execution provider="coaster" jobmanager="ssh:local" url="login.beagle.ci.uchicago.edu"/>
+ <profile namespace="karajan" key="jobThrottle">0</profile>
+ <profile namespace="karajan" key="initialScore">10000</profile>
+
+ <filesystem provider="ssh" url="login.beagle.ci.uchicago.edu"/>
+ <workdirectory>/lustre/beagle/wilde/swiftwork</workdirectory>
+ </pool>
+
+
</config>
Modified: SwiftTutorials/OHBM_2013-06-16/swift.properties
===================================================================
--- SwiftTutorials/OHBM_2013-06-16/swift.properties 2013-06-14 04:07:57 UTC (rev 6550)
+++ SwiftTutorials/OHBM_2013-06-16/swift.properties 2013-06-14 13:52:11 UTC (rev 6551)
@@ -2,13 +2,94 @@
sites.file=sites.xml
tc.file=apps
+use.provider.staging=false
+provider.staging.pin.swiftfiles=true
+
+use.wrapper.staging=false
status.mode=provider
-use.provider.staging=true
-use.wrapper.staging=false
wrapperlog.always.transfer=true
-execution.retries=0
-lazy.errors=false
-provider.staging.pin.swiftfiles=true
+execution.retries=3
+lazy.errors=true
sitedir.keep=true
file.gc.enabled=false
#tcp.port.range=50000,51000
+
+###########################################################################
+# Throttling options #
+###########################################################################
+#
+# For the throttling parameters, valid values are either a positive integer
+# or "off" (without the quotes).
+#
+
+#
+# Limits the number of concurrent submissions for a workflow instance. This
+# throttle only limits the number of concurrent tasks (jobs) that are being
+# sent to sites, not the total number of concurrent jobs that can be run.
+# The submission stage in GRAM is one of the most CPU expensive stages (due
+# mostly to the mutual authentication and delegation). Having too many
+# concurrent submissions can overload either or both the submit host CPU
+# and the remote host/head node causing degraded performance.
+#
+# Default: 4
+#
+
+throttle.submit=4
+#throttle.submit=off
+
+#
+# Limits the number of concurrent submissions for any one of the sites Swift
+# will try to send jobs to. In other words, it guarantees that no more jobs
+# than the value of this throttle will be concurrently in the process of
+# being submitted to any single site.
+#
+# Default: 2
+#
+
+### throttle.host.submit=2
+#throttle.host.submit=off
+
+#
+# The Swift scheduler has the ability to limit the number of concurrent jobs
+# allowed on a site based on the performance history of that site. Each site
+# is assigned a score (initially 1), which can increase or decrease based
+# on whether the site yields successful or faulty job runs. The score for a
+# site can take values in the (0.1, 100) interval. The number of allowed jobs
+# is calculated using the following formula:
+# 2 + score*throttle.score.job.factor
+# This means a site will always be allowed at least two concurrent jobs and
+# at most 2 + 100*throttle.score.job.factor. With a default of 4 this means
+# at least 2 jobs and at most 402.
+#
+# Default: 4
+#
+
+### throttle.score.job.factor=0.2
+#throttle.score.job.factor=off
+
+
+#
+# Limits the total number of concurrent file transfers that can happen at any
+# given time. File transfers consume bandwidth. Too many concurrent transfers
+# can cause the network to be overloaded preventing various other signalling
+# traffic from flowing properly.
+#
+# Default: 4
+#
+
+throttle.transfers=1
+#throttle.transfers=off
+
+# Limits the total number of concurrent file operations that can happen at any
+# given time. File operations (like transfers) require an exclusive connection
+# to a site. These connections can be expensive to establish. A large number
+# of concurrent file operations may cause Swift to attempt to establish many
+# such expensive connections to various sites. Limiting the number of concurrent
+# file operations causes Swift to use a small number of cached connections and
+# achieve better overall performance.
+#
+# Default: 8
+#
+
+throttle.file.operations=1
+#throttle.file.operations=off
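The allowed-jobs formula quoted in the throttling comments above can be sanity-checked with a short calculation. This is a sketch based only on the comment text (the `2 + score*factor` form and the (0.1, 100) score bounds), not on Swift source:

```shell
# Evaluate 2 + score*throttle.score.job.factor, clamping the site score
# to the documented (0.1, 100) interval; result truncated to an integer.
allowed_jobs() {
  awk -v s="$1" -v f="${2:-4}" 'BEGIN {
    if (s < 0.1) s = 0.1; if (s > 100) s = 100;
    printf "%d\n", 2 + s * f
  }'
}

allowed_jobs 1      # fresh site, score 1, default factor 4 -> 6
allowed_jobs 100    # score at the ceiling -> 402, as stated above
```

With the reduced `throttle.score.job.factor=0.2` suggested (commented out) above, a fresh site would be limited to 2 concurrent jobs and a top-scoring site to 22.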
Added: SwiftTutorials/OHBM_2013-06-16/swift.properties.ps
===================================================================
--- SwiftTutorials/OHBM_2013-06-16/swift.properties.ps (rev 0)
+++ SwiftTutorials/OHBM_2013-06-16/swift.properties.ps 2013-06-14 13:52:11 UTC (rev 6551)
@@ -0,0 +1,95 @@
+
+sites.file=sites.xml
+tc.file=apps
+
+use.provider.staging=true
+provider.staging.pin.swiftfiles=true
+
+use.wrapper.staging=false
+status.mode=provider
+wrapperlog.always.transfer=true
+execution.retries=3
+lazy.errors=true
+sitedir.keep=true
+file.gc.enabled=false
+#tcp.port.range=50000,51000
+
+###########################################################################
+# Throttling options #
+###########################################################################
+#
+# For the throttling parameters, valid values are either a positive integer
+# or "off" (without the quotes).
+#
+
+#
+# Limits the number of concurrent submissions for a workflow instance. This
+# throttle only limits the number of concurrent tasks (jobs) that are being
+# sent to sites, not the total number of concurrent jobs that can be run.
+# The submission stage in GRAM is one of the most CPU expensive stages (due
+# mostly to the mutual authentication and delegation). Having too many
+# concurrent submissions can overload either or both the submit host CPU
+# and the remote host/head node causing degraded performance.
+#
+# Default: 4
+#
+
+throttle.submit=4
+#throttle.submit=off
+
+#
+# Limits the number of concurrent submissions for any one of the sites Swift
+# will try to send jobs to. In other words, it guarantees that no more jobs
+# than the value of this throttle will be concurrently in the process of
+# being submitted to any single site.
+#
+# Default: 2
+#
+
+### throttle.host.submit=2
+#throttle.host.submit=off
+
+#
+# The Swift scheduler has the ability to limit the number of concurrent jobs
+# allowed on a site based on the performance history of that site. Each site
+# is assigned a score (initially 1), which can increase or decrease based
+# on whether the site yields successful or faulty job runs. The score for a
+# site can take values in the (0.1, 100) interval. The number of allowed jobs
+# is calculated using the following formula:
+# 2 + score*throttle.score.job.factor
+# This means a site will always be allowed at least two concurrent jobs and
+# at most 2 + 100*throttle.score.job.factor. With a default of 4 this means
+# at least 2 jobs and at most 402.
+#
+# Default: 4
+#
+
+### throttle.score.job.factor=0.2
+#throttle.score.job.factor=off
+
+
+#
+# Limits the total number of concurrent file transfers that can happen at any
+# given time. File transfers consume bandwidth. Too many concurrent transfers
+# can cause the network to be overloaded preventing various other signalling
+# traffic from flowing properly.
+#
+# Default: 4
+#
+
+throttle.transfers=1
+#throttle.transfers=off
+
+# Limits the total number of concurrent file operations that can happen at any
+# given time. File operations (like transfers) require an exclusive connection
+# to a site. These connections can be expensive to establish. A large number
+# of concurrent file operations may cause Swift to attempt to establish many
+# such expensive connections to various sites. Limiting the number of concurrent
+# file operations causes Swift to use a small number of cached connections and
+# achieve better overall performance.
+#
+# Default: 8
+#
+
+throttle.file.operations=1
+#throttle.file.operations=off