[Swift-commit] r6147 - trunk/docs/siteguide
davidk at ci.uchicago.edu
Wed Jan 9 17:16:09 CST 2013
Author: davidk
Date: 2013-01-09 17:16:09 -0600 (Wed, 09 Jan 2013)
New Revision: 6147
Modified:
trunk/docs/siteguide/uc3
Log:
Adding information about HDFS and staging in executable
Modified: trunk/docs/siteguide/uc3
===================================================================
--- trunk/docs/siteguide/uc3 2013-01-08 20:03:30 UTC (rev 6146)
+++ trunk/docs/siteguide/uc3 2013-01-09 23:16:09 UTC (rev 6147)
@@ -134,3 +134,137 @@
-----
<profile namespace="globus" key="condor.Requirements">UidDomain == "osg-gk.mwt2.org"</profile>
-----
+
+Installing Application Scripts on HDFS
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+NOTE: This section only applies if the application you want to use is an interpreted script (bash,
+python, perl, etc.). HDFS cannot set an execute bit on files, so programs stored there cannot be
+run directly. If the application you want to run on UC3 is a compiled executable, skip this section and read ahead.
+
+Once your simple echo test is running, you'll want to start using your own applications. One way
+to do this is via the Hadoop filesystem (HDFS). This filesystem is only available on
+the UC3 Seeder Cluster. To restrict your jobs to machines that can access this filesystem,
+add the following line to your sites.xml file:
+
+-----
+<profile namespace="globus" key="condor.Requirements">UidDomain == "osg-gk.mwt2.org" && regexp("uc3-c*", Machine)</profile>
+-----
+
+Now you can install your script somewhere under the directory /mnt/hadoop/users/<yourusername>.
+Here is an example that puts everything together.
+
+.sites.xml
+-----
+<config>
+ <pool handle="uc3">
+ <execution provider="coaster" url="uc3-sub.uchicago.edu" jobmanager="local:condor"/>
+ <profile namespace="karajan" key="jobThrottle">999.99</profile>
+ <profile namespace="karajan" key="initialScore">10000</profile>
+ <profile namespace="globus" key="jobsPerNode">1</profile>
+ <profile namespace="globus" key="maxWalltime">3600</profile>
+ <profile namespace="globus" key="nodeGranularity">1</profile>
+ <profile namespace="globus" key="highOverAllocation">100</profile>
+ <profile namespace="globus" key="lowOverAllocation">100</profile>
+ <profile namespace="globus" key="slots">1000</profile>
+ <profile namespace="globus" key="maxNodes">1</profile>
+ <profile namespace="globus" key="condor.+AccountingGroup">"group_friends.{env.USER}"</profile>
+ <profile namespace="globus" key="jobType">nonshared</profile>
+ <profile namespace="globus" key="condor.Requirements">UidDomain == "osg-gk.mwt2.org" && regexp("uc3-c*", Machine)</profile>
+ <filesystem provider="local" url="none" />
+ <workdirectory>.</workdirectory>
+ </pool>
+</config>
+-----
+
+.tc.data
+-----
+uc3 bash /bin/bash null null null
+-----
+
+.myscript.swift
+-----
+type file;
+
+app (file o) myscript ()
+{
+ bash "/mnt/hadoop/users/<yourusername>/myscript.sh" stdout=@o;
+}
+
+file out[]<simple_mapper; location="outdir", prefix="myscript.",suffix=".out">;
+int ntasks = @toInt(@arg("n","1"));
+
+foreach n in [1:ntasks] {
+ out[n] = myscript();
+}
+-----
+
+./mnt/hadoop/users/<yourusername>/myscript.sh
+-----
+#!/bin/bash
+
+echo This is my script
+-----
+
+.cf
+-----
+wrapperlog.always.transfer=false
+sitedir.keep=true
+execution.retries=0
+lazy.errors=false
+status.mode=provider
+use.provider.staging=true
+provider.staging.pin.swiftfiles=false
+use.wrapper.staging=false
+-----
+
+.Example run
+-----
+$ swift -sites.file sites.xml -tc.file tc.data -config cf myscript.swift -n=10
+Swift trunk swift-r6146 cog-r3544
+
+RunID: 20130109-1657-tf01jpaa
+Progress: time: Wed, 09 Jan 2013 16:58:00 -0600
+Progress: time: Wed, 09 Jan 2013 16:58:30 -0600 Submitted:10
+Progress: time: Wed, 09 Jan 2013 16:59:00 -0600 Submitted:10
+Progress: time: Wed, 09 Jan 2013 16:59:12 -0600 Stage in:1 Submitted:9
+Final status: Wed, 09 Jan 2013 16:59:12 -0600 Finished successfully:10
+$ ls outdir/*
+outdir/myscript.0001.out outdir/myscript.0003.out outdir/myscript.0005.out outdir/myscript.0007.out outdir/myscript.0009.out
+outdir/myscript.0002.out outdir/myscript.0004.out outdir/myscript.0006.out outdir/myscript.0008.out outdir/myscript.0010.out
+-----
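The install step at the start of this section can be sketched as a short shell session. This is only a sketch, not UC3-specific tooling: `TARGET` defaults to a hypothetical local directory so it runs anywhere; on UC3 you would point it at /mnt/hadoop/users/<yourusername>.

```shell
#!/bin/bash
# Sketch of installing an interpreted script via the HDFS FUSE mount.
# TARGET is a hypothetical default; on UC3, use /mnt/hadoop/users/<yourusername>.
TARGET=${TARGET:-$HOME/hadoop-staging}
mkdir -p "$TARGET"

# Write the example script shown above.
cat > myscript.sh <<'EOF'
#!/bin/bash

echo This is my script
EOF

# Copy it into place. Note there is no chmod +x: HDFS cannot store an
# execute bit, which is why tc.data invokes the script through bash.
cp myscript.sh "$TARGET/myscript.sh"
ls "$TARGET"
```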
+
+Staging in Applications with Coaster Provider Staging
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+If you want your application to be as portable as possible, you can use
+coaster provider staging to send your application(s) to a remote node.
+Since this approach does not require HDFS, you can remove the condor
+Requirements line from the previous section and have more cores available.
+Here is a simple script that stages in and executes a shell script.
+
+-----
+type file;
+
+app (file o) sleep (file script, int delay)
+{
+ # chmod +x script.sh ; ./script.sh delay
+ bash "-c" @strcat("chmod +x ./", @script, " ; ./", @script, " ", delay) stdout=@o;
+}
+
+file sleep_script <"sleep.sh">;
+
+foreach i in [1:5] {
+ file o <single_file_mapper; file=@strcat("output/output.", i, ".txt")>;
+ o = sleep(sleep_script, 10);
+}
+-----
+
+Mapping our script to a file and passing it as an argument to an app function
+causes the application to be staged in. The only thing we need on the worker
+node is /bin/bash.
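The example maps sleep.sh but does not show its contents. A minimal version consistent with how the app invokes it (`./script delay`) might look like the following; the contents are an assumption, and the default delay of 0 is only there so the sketch runs standalone.

```shell
#!/bin/bash
# Minimal sleep.sh consistent with the app invocation above:
# the staged-in script receives the delay as its first argument.
delay=${1:-0}   # default only so this sketch runs standalone
sleep "$delay"
echo "Slept $delay seconds on $(hostname)"
```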
+
+TIP: If the program you are staging in is an executable, compiling it statically
+will increase the chances of a successful run, since it will not depend on
+shared libraries being present on the worker node.
+
+TIP: If the application you are staging is more complex, with multiple files,
+package everything it needs into a single compressed tar file, then extract it
+on the worker node.
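The single-tarball pattern from the tip above can be sketched as follows; all names here (myapp, run.sh) are hypothetical.

```shell
#!/bin/bash
# Sketch of the single-tarball pattern: package everything the application
# needs, stage the tarball in, then extract and run on the worker node.
set -e
mkdir -p myapp/bin
printf '#!/bin/bash\necho hello from myapp\n' > myapp/bin/run.sh
tar czf myapp.tar.gz myapp

# ...on the worker node, after myapp.tar.gz has been staged in:
rm -rf myapp
tar xzf myapp.tar.gz
bash myapp/bin/run.sh
```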