[Swift-commit] r8412 - SwiftApps/subjobs

ketan at ci.uchicago.edu
Tue Mar 24 15:47:38 CDT 2015


Author: ketan
Date: 2015-03-24 15:47:38 -0500 (Tue, 24 Mar 2015)
New Revision: 8412

Modified:
   SwiftApps/subjobs/README.html
   SwiftApps/subjobs/README.txt
Log:
update tutorial with simanalyze example

Modified: SwiftApps/subjobs/README.html
===================================================================
--- SwiftApps/subjobs/README.html	2015-03-24 20:37:33 UTC (rev 8411)
+++ SwiftApps/subjobs/README.html	2015-03-24 20:47:38 UTC (rev 8412)
@@ -743,119 +743,83 @@
 <div class="sect1">
 <h2 id="_introduction">1. Introduction</h2>
 <div class="sectionbody">
-<div class="paragraph"><p>This document describes an approach to run multiple jobs over a single large
-block of compute nodes on BlueGene/Q systems.</p></div>
-<div class="paragraph"><p>The technique, called sub-block jobs lets users submit multiple, independent,
-repeated jobs within a single larger Cobalt block. Sub-block jobs is a mode of
-running jobs on BlueGene/Q systems wherein one can allocate a larger "outer"
-block of compute nodes and repeatedly submit jobs of smaller sized sub-blocks
-to this block.</p></div>
-<div class="paragraph"><p>The current package provides tools, scripts and example use-cases to run Swift
-applications in sub-block mode over the ALCF BlueGene/Q resources: <code>Vesta</code>,
-<code>Cetus</code> and <code>Mira</code>. The benefit of this approach is that the user does not have
-to invoke the sub-block specific routines involving the details of the
-underlying node interconnect hardware. Additionally, with the same Swift script
-and configuration, user gets a flexibility to run jobs in sub-block or
-non-sub-block mode depending on the scale and size of a run. The approach
-transparently allows user to run jobs directly via Swift. Users can run
-multiple <em>waves</em> of jobs asynchronously and in parallel without restarting the
-outer block.</p></div>
+<div class="paragraph"><p>The BG/Q resource manager, Cobalt, provides a mechanism to run multiple small
+jobs (sub-block jobs) repeatedly over a larger outer block. In order to run an
+application in this mode, the user must manually determine the optimal geometry
+(i.e., select a subset of the nodes based on their interconnection, allowing
+for best internode communication) and related low-level parameters of the
+system. The subjob technique addresses this challenge and enables MTC
+applications on the BG/Q. The technique lets users submit multiple,
+independent, repeated jobs within a single larger Cobalt block.</p></div>
+<div class="paragraph"><p>The Swift-subjob package provides tools, scripts and example use-cases to run
+Swift applications in subjob mode over the ALCF BG/Q resources: <code>Cetus</code> (for
+small-scale testing) and <code>Mira</code>. The framework is flexible in that the same
+configuration can be used to run subjob or non-subjob mode depending on the
+scale and size of a run. Users can run multiple <em>waves</em> of jobs asynchronously
+and in parallel.</p></div>
 </div>
 </div>
 <div class="sect1">
-<h2 id="_swift_sub_block_jobs">2. Swift sub-block jobs</h2>
+<h2 id="_quickstart">2. Quickstart</h2>
 <div class="sectionbody">
-<div class="paragraph"><p>To download the package, checkout the directory as follows:</p></div>
+<div class="paragraph"><p>Download the subjob demo package as follows:</p></div>
 <div class="listingblock">
 <div class="content">
-<pre><code>svn co https://svn.ci.uchicago.edu/svn/vdl2/SwiftApps/subjobs</code></pre>
+<pre><code>wget http://mcs.anl.gov/~ketan/subjobs.tgz</code></pre>
 </div></div>
 <div class="paragraph"><p>followed by:</p></div>
 <div class="listingblock">
 <div class="content">
-<pre><code>cd  subjobs</code></pre>
+<pre><code>tar zxf subjobs.tgz
+cd  subjobs/simanalyze/part05</code></pre>
 </div></div>
-<div class="paragraph"><p>To set up the environment:</p></div>
+<div class="paragraph"><p>To run the example application:</p></div>
 <div class="listingblock">
 <div class="content">
-<pre><code>source setup</code></pre>
+<pre><code>./runcetus.sh #on cetus
+#or
+./runmira.sh #on mira</code></pre>
 </div></div>
-<div class="paragraph"><p>To run an example application</p></div>
-<div class="listingblock">
-<div class="content">
-<pre><code>./runswift.sh</code></pre>
-</div></div>
+<div class="paragraph"><p>Another example is found in <code>subjobs/simanalyze/part06</code>.</p></div>
+<div class="paragraph"><p>For details on how this example works, see the Swift tutorial <a href="http://swift-lang.org/tutorials/localhost/tutorial.html#_part_3_analyzing_results_of_a_parallel_ensemble">here</a>.</p></div>
 </div>
 </div>
 <div class="sect1">
 <h2 id="_how_to">3. How To</h2>
 <div class="sectionbody">
-<div class="paragraph"><p>To convert an ordinary Swift application run in sub-block mode, the following changes are required:</p></div>
-<div class="paragraph"><p>First, Add bg.sh as the application invoker in place of <code>sh</code> or any other invoker. For example, if the app definition is as follows:</p></div>
+<div class="paragraph"><p>To configure a Swift application to run in subjob mode, the following
+changes are required:</p></div>
+<div class="paragraph"><p>First, add <code>bg.sh</code> as the application invoker in place of <code>sh</code> or any other
+invoker. For example, if the app definition is as follows:</p></div>
 <div class="listingblock">
 <div class="content">
 <pre><code>sh @exe @i @o arg("s","1") stdout=@sout stderr=@serr;</code></pre>
 </div></div>
-<div class="paragraph"><p>Replace the shell invocation with the bg.sh invocations like so:</p></div>
+<div class="paragraph"><p>Replace the invocation with a <code>bg.sh</code> invocation, like so:</p></div>
 <div class="listingblock">
 <div class="content">
 <pre><code>bg.sh @exe @i @o arg("s","1") stdout=@sout stderr=@serr;</code></pre>
 </div></div>
-<div class="paragraph"><p>Second, add the <code>SUBBLOCK_SIZE</code> environment variable to the sites file. For example:</p></div>
+<div class="paragraph"><p>Second, export the <code>SUBBLOCK_SIZE</code> environment variable. For example:</p></div>
 <div class="listingblock">
 <div class="content">
-<pre><code><profile key="SUBBLOCK_SIZE" namespace="env">16</profile></code></pre>
+<pre><code>export SUBBLOCK_SIZE=16</code></pre>
 </div></div>
 <div class="admonitionblock">
 <table><tr>
 <td class="icon">
 <div class="title">Note</div>
 </td>
-<td class="content">The value of <code>SUBBLOCK_SIZE</code> variable must be a power of 2 greater than 8 and less than the <code>maxnodes</code> value.</td>
+<td class="content">The value of the <code>SUBBLOCK_SIZE</code> variable must be a power of 2 and less than 512.</td>
 </tr></table>
 </div>
-<div class="paragraph"><p>A complete example sites file for a sub-block job run on ALCF <code>Vesta</code> is shown below:</p></div>
-<div class="listingblock">
-<div class="content">
-<pre><code><?xml version="1.0" encoding="UTF-8"?>
-<config xmlns="http://www.ci.uchicago.edu/swift/SwiftSites">
-
-<pool handle="cluster">
-<execution provider="coaster" jobmanager="local:cobalt" />
-
-<!-- "slots" determine the number of Cobalt jobs -->
-<profile namespace="globus" key="slots">1</profile>
-<profile namespace="globus" key="mode">script</profile>
-
-<profile namespace="karajan" key="jobThrottle">2.99</profile>
-<profile namespace="karajan" key="initialScore">10000</profile>
-<profile namespace="globus" key="maxwalltime">00:40:00</profile>
-<profile namespace="globus" key="walltime">2050</profile>
-<profile namespace="globus" key="maxnodes">256</profile>
-<profile namespace="globus" key="nodegranularity">256</profile>
-
-<!-- required for sub-block jobs, remove for non-sub-block jobs -->
-<profile key="SUBBLOCK_SIZE" namespace="env">16</profile>
-<profile namespace="globus" key="jobsPerNode">16</profile>
-
-<workdirectory>/tmp/swiftwork</workdirectory>
-<filesystem provider="local"/>
-
-</pool>
-</config></code></pre>
-</div></div>
-<div class="paragraph"><p>Of note are the <code>SUBBLOCK_SIZE</code> and the <code>mode</code> properties which must be present
-in the sites definition. The former defines the size of the subblock needed and
-the latter specifies that the "mode" to run the outer cobalt job would be
-<code>script</code> mode. In this particular example, we have the outer block size to be
-256 nodes whereas the subblock size is 16 nodes. This results in a total of 16
-subblocks resulting in <code>jobsPerNode</code> value to be 16.</p></div>
 <div class="admonitionblock">
 <table><tr>
 <td class="icon">
 <div class="title">Note</div>
 </td>
-<td class="content">Swift installation for sub-block jobs on Vesta and Mira machines can be found at <code>/home/ketan/swift-0.95/cog/modules/swift/dist/swift-svn/bin/swift</code></td>
+<td class="content">A Swift installation for sub-block jobs on the Vesta and Mira machines can be
+found at <code>/home/ketan/swift-k/dist/swift-svn/bin/swift</code>.</td>
 </tr></table>
 </div>
 </div>
@@ -863,9 +827,9 @@
 <div class="sect1">
 <h2 id="_use_case_applications">4. Use-Case Applications</h2>
 <div class="sectionbody">
-<div class="paragraph"><p>This section discusses some of the real-world use-case applications that are set up
-with this package. These applications are tested with subblock and non-subblock
-runs on ALCF Vesta, a 2-rack (2048 nodes) BlueGene/Q system.</p></div>
+<div class="paragraph"><p>This section discusses some of the real-world use-cases that are set up as demo
+applications with this package. These applications are tested with subblock as
+well as non-subblock runs on BG/Q systems.</p></div>
 <div class="sect2">
 <h3 id="_namd">4.1. NAMD</h3>
 <div class="paragraph"><p><code>NAMD</code> is a molecular dynamics simulation code developed at
@@ -893,7 +857,6 @@
 </div></div>
 <div class="paragraph"><p>Similarly, in order to change the scale and size of runs, make changes to the
 parameters in the sites file as described in section 1 above.</p></div>
-<div class="paragraph"><p>ToDo: Visualize NAMD results with VMD.</p></div>
 </div>
 <div class="sect2">
 <h3 id="_rosetta">4.2. Rosetta</h3>
@@ -905,7 +868,6 @@
 <div class="listingblock">
 <div class="content">
 <pre><code>cd rosetta
-./runvesta.sh #run on vesta
 ./runmira.sh #run on mira</code></pre>
 </div></div>
 <div class="paragraph"><p>To change scale, size and/or inputs of the run, change the location of input
@@ -923,10 +885,19 @@
 <div class="content">
 <pre><code>(scorefile, rosetta_output, rosetta_error) = rosetta(pdb, 2);</code></pre>
 </div></div>
-<div class="paragraph"><p>In the above line, the number <code>2</code> indicates the number of <code>structs</code> to be generated by the docking. Change this value to the desired size to change the desired number of <code>structs</code>. To make changes to other parameters, make changes to the commandline as invoked in the <code>app</code> definition in the Swift script like so:</p></div>
+<div class="paragraph"><p>In the above line, the number <code>2</code> indicates the number of <code>structs</code> to be
+generated by the docking. Change this value to generate a different number of
+<code>structs</code>. To change other parameters, edit the command line invoked in the
+<code>app</code> definition in the Swift script, like
+so:</p></div>
 <div class="listingblock">
 <div class="content">
-<pre><code>bgsh "/home/ketan/openmp-gnu-july16-mini/build/src/debug/linux/2.6/64/ppc64/xlc/static-mpi/FlexPepDocking.staticmpi.linuxxlcdebug" "-database" "/home/ketan/minirosetta_database" "-pep_refine" "-s" @pdb_file "-ex1" "-ex2aro" "-use_input_sc" "-nstruct" nstruct "-overwrite" "-scorefile" @_scorefile stdout=@out stderr=@err;</code></pre>
+<pre><code>bgsh \
+"/home/ketan/openmp-gnu-july16-mini/build/src/debug/linux/2.6/64/ppc64/xlc/static-mpi/FlexPepDocking.staticmpi.linuxxlcdebug" \
+"-database" "/home/ketan/minirosetta_database" \
+"-pep_refine" "-s" @pdb_file "-ex1" "-ex2aro" \
+"-use_input_sc" "-nstruct" nstruct "-overwrite" \
+"-scorefile" @_scorefile stdout=@out stderr=@err;</code></pre>
 </div></div>
 </div>
 <div class="sect2">
@@ -938,24 +909,18 @@
 <div class="listingblock">
 <div class="content">
 <pre><code>cd dock
-./runvesta.sh #run on vesta
+./runcetus.sh #run on cetus
 ./runmira.sh #run on mira</code></pre>
 </div></div>
 </div>
-<div class="sect2">
-<h3 id="_gridpack">4.4. GridPack</h3>
-<div class="paragraph"><p>GridPack is a simulation package designed for powergrid design application.</p></div>
 </div>
-<div class="sect2">
-<h3 id="_halo">4.5. HALO</h3>
-<div class="paragraph"><p>HALO is a astrophysics application.</p></div>
 </div>
-</div>
-</div>
 <div class="sect1">
 <h2 id="_internals">5. Internals</h2>
 <div class="sectionbody">
-<div class="paragraph"><p>The key driver of the Swift sub-block jobs is a script called <code>bg.sh</code> that does the sub-block jobs calculations and othe chores for the users. The script looks as follows:</p></div>
+<div class="paragraph"><p>The key driver of the Swift sub-block jobs is a script called <code>bg.sh</code> that performs
+the sub-block job calculations and other chores for the user. The script looks
+as follows:</p></div>
 <div class="listingblock">
 <div class="content">
 <pre><code>#!/bin/bash
@@ -1039,33 +1004,11 @@
 </div>
 </div>
 <div class="sect1">
-<h2 id="_sub_block_limitations">6. Sub-block limitations</h2>
+<h2 id="_further_information">6. Further Information</h2>
 <div class="sectionbody">
-<div class="paragraph"><p>There are currently the following two limitations with sub-block jobs:</p></div>
 <div class="olist arabic"><ol class="arabic">
 <li>
 <p>
-The maximum size of the outer Cobalt job must not exceed 512 nodes, ie. half
-of a hardware rack.
-</p>
-</li>
-<li>
-<p>
-The successive subjobs must be submitted with a gap of at least 3 seconds in
-between. This means for a large number of shorter than 5 seconds jobs, the
-system will be underutilized. Consequently, subblocks are suitable for
-tasks which are more than a couple of minutes in duration.
-</p>
-</li>
-</ol></div>
-</div>
-</div>
-<div class="sect1">
-<h2 id="_further_information">7. Further Information</h2>
-<div class="sectionbody">
-<div class="olist arabic"><ol class="arabic">
-<li>
-<p>
 More information about Swift can be found <a href="http://swift-lang.org/main">here</a>.
 </p>
 </li>
@@ -1076,7 +1019,7 @@
 </li>
 <li>
 <p>
-More about IBM BlueGene sub-block jobs can be found <a href="http://www.alcf.anl.gov/files/ensemble_jobs_0.pdf">here</a> (warning: pdf).
+More about IBM BlueGene sub-block jobs can be found <a href="http://www.alcf.anl.gov/files/ensemble_jobs_0.pdf">here</a> (PDF).
 </p>
 </li>
 </ol></div>
@@ -1086,7 +1029,7 @@
 <div id="footnotes"><hr /></div>
 <div id="footer">
 <div id="footer-text">
-Last updated 2014-11-07 10:58:42 CST
+Last updated 2015-03-24 15:42:35 CDT
 </div>
 </div>
 </body>

Modified: SwiftApps/subjobs/README.txt
===================================================================
--- SwiftApps/subjobs/README.txt	2015-03-24 20:37:33 UTC (rev 8411)
+++ SwiftApps/subjobs/README.txt	2015-03-24 20:47:38 UTC (rev 8412)
@@ -4,29 +4,15 @@
 Introduction
 ------------
 
-Argonne Leadership Computing Facility (ALCF) hosts a leadership class (among
-the most powerful in the world) supercomputer, +Mira+, which is a
-50K-node IBM Blue Gene/Q (BG/Q) with peak performance of 10 PetaFLOPS.
-
-Many applications are originally developed with small- and medium- size
-execution on regular clusters  in mind. Some of these applications are later
-used at large-scale in a workflow and/or many-task computing (MTC) style.  (MTC
-is an emerging computation style wherein the computation consists of a large
-number of medium-sized, semi-dependent tasks implemented as ordinary programs.)
-
 The BG/Q resource manager, Cobalt, provides a mechanism to run multiple small
 jobs (sub-block jobs) repeatedly over a larger outer block. In order to run an
-application in this mode, the user has to manually determine the optimal
-geometry (i.e., select a subset of the nodes based on their interconnection,
-allowing for best internode communication) and related low-level parameters of
-the system. The subjob technique addresses this challenge and enables MTC
-application to be run on the BG/Q.  
+application in this mode, the user must manually determine the optimal geometry
+(i.e., select a subset of the nodes based on their interconnection, allowing
+for best internode communication) and related low-level parameters of the
+system. The subjob technique addresses this challenge and enables MTC
+applications on the BG/Q. The technique lets users submit multiple,
+independent, repeated jobs within a single larger Cobalt block. 
 
-This document describes the subjob approach to run multiple jobs over a single
-large block of compute nodes on Blue Gene/Q systems. The technique lets users
-submit multiple, independent, repeated jobs within a single larger Cobalt
-block. 
-
 The Swift-subjob package provides tools, scripts and example use-cases to run
 Swift applications in subjob mode over the ALCF BG/Q resources: +Cetus+ (for
 small-scale testing) and +Mira+. The framework is flexible in that the same
@@ -37,29 +23,30 @@
 Quickstart
 -----------
 
-To download the package, SVN checkout the directory as follows:
+Download the subjob demo package as follows:
 
 ----
-svn co https://svn.ci.uchicago.edu/svn/vdl2/SwiftApps/subjobs
+wget http://mcs.anl.gov/~ketan/subjobs.tgz
 ----
 
 followed by:
 
 ----
-cd  subjobs
+tar zxf subjobs.tgz
+cd  subjobs/simanalyze/part05
 ----
 
-To set up the environment:
+To run the example application: 
 
 ----
-source setup
+./runcetus.sh #on cetus
+#or
+./runmira.sh #on mira
 ----
 
-To run an example application (mpicatnap, described in section x.y)
+Another example is found in +subjobs/simanalyze/part06+.
 
-----
-./runswift.sh
-----
+For details on how this example works, see the Swift tutorial http://swift-lang.org/tutorials/localhost/tutorial.html#_part_3_analyzing_results_of_a_parallel_ensemble[here].
 
 How To
 -------
@@ -86,7 +73,7 @@
 export SUBBLOCK_SIZE=16
 ----
 
-NOTE: The value of +SUBBLOCK_SIZE+ variable must be a power of 2 greater than 8 and less than 512.
+NOTE: The value of the +SUBBLOCK_SIZE+ variable must be a power of 2 and less than 512.
 
 ////
 A complete example sites file for a sub-block job run on ALCF +Vesta+ is shown below:
@@ -231,6 +218,7 @@
 ./runmira.sh #run on mira
 ----
 
+////
 GridPack
 ~~~~~~~~
 GridPack is a simulation package designed for powergrid design application.
@@ -238,6 +226,7 @@
 HALO
 ~~~~
 HALO is an astrophysics application.
+////
 
 Internals
 ----------
@@ -269,5 +258,5 @@
 
 . More information about Swift can be found http://swift-lang.org/main[here].
 . More about ALCF can be found http://www.alcf.anl.gov[here].
-. More about IBM BlueGene sub-block jobs can be found http://www.alcf.anl.gov/files/ensemble_jobs_0.pdf[here] (warning: pdf).
+. More about IBM BlueGene sub-block jobs can be found http://www.alcf.anl.gov/files/ensemble_jobs_0.pdf[here] (PDF).
 
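
The revised NOTE in this commit says SUBBLOCK_SIZE must be a power of 2 and
less than 512. As a minimal sketch of that rule, a wrapper script could check
the value before exporting it. The function name is_valid_subblock_size is
hypothetical and is not part of the subjobs package or of bg.sh; it encodes
only the constraint as worded in the NOTE.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not shipped with the subjobs package):
# succeeds when its argument is a power of 2 and less than 512.
is_valid_subblock_size() {
    n=$1
    # Reject empty strings, leading zeros and anything that is not all digits.
    case $n in
        ''|0*|*[!0-9]*) return 1 ;;
    esac
    # A power of 2 has exactly one bit set, so n & (n - 1) is zero.
    [ "$n" -lt 512 ] && [ $(( n & (n - 1) )) -eq 0 ]
}

# Usage: export the variable only if it satisfies the stated rule.
SUBBLOCK_SIZE=16
if is_valid_subblock_size "$SUBBLOCK_SIZE"; then
    export SUBBLOCK_SIZE
else
    echo "SUBBLOCK_SIZE must be a power of 2 and less than 512" >&2
fi
```

The check is deliberately stricter than necessary on input format (it refuses
leading zeros) so that the shell never interprets the value as octal.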