[Swift-commit] r3370 - text/parco10submission

Tue Jun 15 19:43:13 CDT 2010

Author: wozniak
Date: 2010-06-15 19:43:13 -0500 (Tue, 15 Jun 2010)
New Revision: 3370

Modified:
   text/parco10submission/Wozniak.bib
   text/parco10submission/paper.tex
Log:
Additional citations, etc.


Modified: text/parco10submission/Wozniak.bib
===================================================================

--- text/parco10submission/Wozniak.bib	2010-06-15 22:26:13 UTC (rev 3369)
+++ text/parco10submission/Wozniak.bib	2010-06-16 00:43:13 UTC (rev 3370)
@@ -1521,8 +1521,8 @@
   journal = {Lecture Notes in Computer Science},
   year = {1998},
   volume = {1459},
-  url = {citeseer.ist.psu.edu/czajkowski97resource.html}
 }
+url = {citeseer.ist.psu.edu/czajkowski97resource.html}
 
 @INPROCEEDINGS{Coallocation_1999,
   author = {Karl Czajkowski and Ian Foster and Carl Kesselman},
@@ -2909,9 +2909,10 @@
   volume = {13},
   number = {8-9},
   comment = {vonLaszewski-final.bib},
-  url = {http://www.mcs.anl.gov/~gregor/papers/vonLaszewski--cog-cpe-final.pdf}
 }
+url = {http://www.mcs.anl.gov/~gregor/papers/vonLaszewski--cog-cpe-final.pdf}
 
+
 @INPROCEEDINGS{las01pse,
   author = {Gregor von Laszewski and Ian Foster and Jarek Gawor and Peter Lane
 	and Nell Rehn and Mike Russell},
@@ -6062,7 +6063,7 @@
   journal = PTRS_A,
   volume = 363,
   number = 1833,
-  year = 2005,
+  year = 2005
 }
 {pages 1715-1728}
 
@@ -6205,3 +6206,19 @@
 }
 pages = {18},
 
+@INPROCEEDINGS{SunConstellation_2008,
+  title = {Performance and Scalability Study of
+           Sun Constellation Cluster 'Ranger'
+           using Application-Based Benchmarks},
+  author = {Byoung-Do Kim and John E. Cazes},
+  booktitle = {Proc. TeraGrid},
+  year = 2008
+}
+
+@INPROCEEDINGS{ReSS_2007,
+  author = {G. Garzoglio and T. Levshina and P. Mhashilkar and S. Timm},
+  title = {{ReSS}: {A} Resource Selection Service for the
+           {O}pen {S}cience {G}rid},
+  booktitle = {Proc. International Symposium of Grid Computing},
+  year = 2007
+}

Modified: text/parco10submission/paper.tex
===================================================================
--- text/parco10submission/paper.tex	2010-06-15 22:26:13 UTC (rev 3369)
+++ text/parco10submission/paper.tex	2010-06-16 00:43:13 UTC (rev 3370)
@@ -173,8 +173,8 @@
 environments. Swift scripts can be tested on a single local
 workstation. The same script can then be executed on a cluster, one or
 more grids of clusters, and on large scale parallel supercomputers
-such as the Sun Constellation (ref) or the IBM Blue Gene/P.  (section
-\ref{ExecutingSites}). Notable features include:
+such as the Sun Constellation~\cite{SunConstellation_2008}
+or the IBM Blue Gene/P~\cite{BGP_2008}. Notable features include:
 
 \item Automatic parallelization of program invocations, invoking
   programs that have no data dependencies in parallel;
@@ -249,7 +249,7 @@
 variable is stored in a single file named \verb|shane.jpeg|
 
 \begin{verbatim}
-  image photo <"shane.jpeg">;
+   image photo <"shane.jpeg">;
 \end{verbatim}
 
 Conceptually, a parallel can be drawn between Swift \emph{mapped} variables
@@ -269,15 +269,15 @@
 supplied image by a specified angle:
 
 \begin{verbatim}
-  app (image output) rotate(image input) {
-    convert "-rotate" angle @input @output;
-  }
+   app (image output) rotate(image input) {
+      convert "-rotate" angle @input @output;
+   }
 \end{verbatim}
 
 A procedure is invoked using a syntax similar to that of the C family:
 
 \begin{verbatim}
-  rotated = rotate(photo, 180);
+   rotated = rotate(photo, 180);
 \end{verbatim}
 
 While this looks like an assignment, the actual unix level execution
@@ -291,7 +291,7 @@
 which has no structure exposed to SwiftScript:
 
 \begin{verbatim}
- type image;
+   type image;
 \end{verbatim}
 
 This does not indicate that the data is unstructured; but it indicates
@@ -304,26 +304,26 @@
 script:
 
 \begin{verbatim}
- type image;
- image photo <"shane.jpeg">;
- image rotated <"rotated.jpeg">;
+   type image;
+   image photo <"shane.jpeg">;
+   image rotated <"rotated.jpeg">;
 
- app (image output) rotate(image input, int angle) {
-    convert "-rotate" angle @input @output;
- }
+   app (image output) rotate(image input, int angle) {
+      convert "-rotate" angle @input @output;
+   }
 
- rotated = rotate(photo, 180);
+   rotated = rotate(photo, 180);
 \end{verbatim}
 
 This script can be invoked from the command line as:
 
 \begin{verbatim}
-  $ ls *.jpeg
-  shane.jpeg
-  $ swift example.swift
-  ...
-  $ ls *.jpeg
-  shane.jpeg rotated.jpeg
+   $ ls *.jpeg
+   shane.jpeg
+   $ swift example.swift
+   ...
+   $ ls *.jpeg
+   shane.jpeg rotated.jpeg
 \end{verbatim}
 
 This executes a single \verb|convert| command, hiding from the user features
@@ -340,7 +340,7 @@
 all files matching a particular unix glob pattern into an array:
 
 \begin{verbatim}
-  file frames[] <filesys_mapper; pattern="*.jpeg">;
+   file frames[] <filesys_mapper; pattern="*.jpeg">;
 \end{verbatim}
 
 The \verb|foreach| construct can be used to apply the same procedure
@@ -380,14 +380,14 @@
 In this fragment, execution of procedures \verb|p| and \verb|q| can
 happen in parallel:
 \begin{verbatim}
- y=p(x);
- z=q(x);
+   y=p(x);
+   z=q(x);
 \end{verbatim}
 whilst in this fragment, execution is serialised by the variable
 \verb|y|, with procedure \verb|p| executing before \verb|q|:
 \begin{verbatim}
- y=p(x);
- z=q(y);
+   y=p(x);
+   z=q(y);
 \end{verbatim}
 
 Arrays in SwiftScript are more generally
@@ -404,13 +404,13 @@
 
 Consider the script below:
 \begin{verbatim}
- file a[];
- file b[];
- foreach v,i in a {
-   b[i] = p(v);
- }
- a[0] = r();
- a[1] = s();
+   file a[];
+   file b[];
+   foreach v,i in a {
+      b[i] = p(v);
+   }
+   a[0] = r();
+   a[1] = s();
 \end{verbatim}
 Initially, the \verb|foreach| statement will have nothing to execute,
 as the array \verb|a| has not been assigned any values. The procedures
@@ -430,15 +430,15 @@
 of as a graph of calls to other procedures.
 
 \begin{verbatim}
- (file output) process (file input) {
-   file intermediate;
-   intermediate = first(input);
-   output = second(intermediate);
- }
+   (file output) process (file input) {
+      file intermediate;
+      intermediate = first(input);
+      output = second(intermediate);
+   }
 
- file x <"x.txt">;
- file y <"y.txt">;
- y = process(x);
+   file x <"x.txt">;
+   file y <"y.txt">;
+   y = process(x);
 \end{verbatim}
 
 This will invoke two procedures, with an intermediate data file named
@@ -448,14 +448,14 @@
 procedures, not by any containing procedures. In this code block:
 
 \begin{verbatim}
- (file a, file b) A() {
-   a = A1();
-   b = A2();
- }
- file x, y, s, t;
- (x,y) = A();
- s = S(x);
- t = S(y);
+   (file a, file b) A() {
+      a = A1();
+      b = A2();
+   }
+   file x, y, s, t;
+   (x,y) = A();
+   s = S(x);
+   t = S(y);
 \end{verbatim}
 
 then a valid execution order is: \verb|A1 S(x) A2 S(y)|. The
@@ -484,29 +484,29 @@
 \emph{Complex types} may be defined using the \verb|type| keyword:
 
 \begin{verbatim}
-  type headerfile;
-  type voxelfile;
-  type volume {
-    headerfile h;
-    voxelfile v;
-  }
+   type headerfile;
+   type voxelfile;
+   type volume {
+      headerfile h;
+      voxelfile v;
+   }
 \end{verbatim}
 
 Members of a complex type can be accessed using the \verb|.| operator:
 
- \begin{verbatim}
-  volume brain;
-  o = p(brain.h);
- \end{verbatim}
+\begin{verbatim}
+   volume brain;
+   o = p(brain.h);
+\end{verbatim}
 
 Collections of files can be mapped to complex types using mappers, like
 for arrays. For example, the simple mapper used in this expression will
 map the files \verb|data.h| and \verb|data.v| to the variable members
 \verb|m.h| and \verb|m.v| respectively:
 
- \begin{verbatim}
-  volume m <simple_mapper;prefix="data">;
- \end{verbatim}
+\begin{verbatim}
+   volume m <simple_mapper;prefix="data">;
+\end{verbatim}
 
 Sometimes data may be stored in a form that does not fit with Swift's
 file-and-site model; for example, data might be stored in an RDBMS on some
@@ -520,21 +520,21 @@
 data storage and access methods to be plugged in to scripts.
 
 \begin{verbatim}
-  type file;
+   type file;
 
-  app (extern o) populateDatabase() {
-    populationProgram;
-  }
+   app (extern o) populateDatabase() {
+      populationProgram;
+   }
 
-  app (file o) analyseDatabase(extern i) {
-    analysisProgram @o;
-  }
+   app (file o) analyseDatabase(extern i) {
+      analysisProgram @o;
+   }
 
-  extern database;
-  file result <"results.txt">;
+   extern database;
+   file result <"results.txt">;
 
-  database = populateDatabase();
-  result = analyseDatabase(database);
+   database = populateDatabase();
+   result = analyseDatabase(database);
 \end{verbatim}
 
 Some external database is represented by the \verb|database| variable. The
@@ -634,7 +634,7 @@
 section N:
 
 \begin{verbatim}
- app (file output) rotate(file input, int angle)
+   app (file output) rotate(file input, int angle)
 \end{verbatim}
 
 The procedure signature declares the inputs and outputs for this
@@ -649,7 +649,7 @@
 staged in for this parameter.
 
 \begin{verbatim}
- convert "-rotate" angle @input @output;
+   convert "-rotate" angle @input @output;
 \end{verbatim}
 
 The body of the \verb|app| block defines the unix command-line that
@@ -715,17 +715,17 @@
 Sites are defined in the \emph{site catalog}, which contains descriptions
 of each site:
 
- \begin{verbatim}
-  <pool handle="tguc">
-    <gridftp
-      url="gsiftp://tg-gridftp.uc.teragrid.org" />
-    <execution provider="gt4" jobmanager="PBS"
-      url="tg-grid.uc.teragrid.org" />
-    <workdirectory>
-      /home/benc/swifttest
-    </workdirectory>
+\begin{verbatim}
+   <pool handle="tguc">
+      <gridftp
+         url="gsiftp://tg-gridftp.uc.teragrid.org" />
+      <execution provider="gt4" jobmanager="PBS"
+         url="tg-grid.uc.teragrid.org" />
+      <workdirectory>
+         /home/benc/swifttest
+      </workdirectory>
   </pool>
- \end{verbatim}
+\end{verbatim}
 
 This file may be constructed by hand or mechanically from some
 pre-existing database (such as a grid's existing discovery
@@ -842,13 +842,14 @@
 Using Swift to submit to a large number of sites poses a number of
 practical challenges that are not encountered when running on a small
 number of sites. These challenges are seen when comparing execution on
-the TeraGrid\cite{TERAGRID} with execution on the Open Science
-Grid\cite{OSG}. The set of sites which may be used is large and
+the TeraGrid~\cite{TeraGrid_2005} with execution on the Open Science
+Grid (OSG)~\cite{OSG_2007}. The set of sites which may be used is large and
 changing. It is impractical to maintain a site catalog by hand in this
 situation. In collaboration with the OSG Engagement group, Swift was
-interfaced to ReSS\cite{ReSS} so that the site catalog is generated
-from that information system. This provides a very straightforward way
-to generate a large catalog of 'fairly likely to work' sites.
+interfaced to ReSS\cite{ReSS_2007} so that the site catalog is
+generated from that information system. This provides a very
+straightforward way to generate a large catalog of sites that are
+likely to work.
 
 Having discovered those sites, two significant problems remain: the
 quality of those sites varies wildly; and user applications are not
@@ -1126,146 +1127,113 @@
 \section{Future work}
 \label{Future}
 
-\subsection{Automatic characterisation of site and application behaviour}
+Swift is an actively developed project. Current directions in Swift
+development focus on improvements for short-running tasks, massively
+parallel resources, data access mechanisms, site management, and
+provenance.
 
-TODO The replication mechanism is the beginning of this - but there is scope
-for a bunch more - eg. better statistics about jobs, sites, split by
-job name; realisation that certain types of jobs fail on a particular site,
-etc.  Note that this can fit into the engine without needing language
-changes. (ties into site selection section too?)
-
-
 \subsection{Provisioning for more granular applications}
 
-TODO: maybe this is already covered in the 'executing efficiently' section?
+In some applications (such as CNARI\cite{CNARI}) the execution time
+for a program is very short. In such circumstances, execution time can
+become dominated by GRAM and LRM overhead. A resource provisioning
+system such as Falkon\cite{FALKON} or the CoG~\cite{CoG_2001} coaster
+mechanism developed for Swift can be used to ameliorate this overhead,
+by incurring the allocation overhead once per worker node. Both of
+these mechanisms can be plugged into Swift straightforwardly through
+the CoG provider API.
 
-In some applications (such as CNARI\cite{CNARI}) the execution time for a program
-is very short (compared to what is traditionally expected for a grid
-job). In such circumstances, execution time can become dominated by
-GRAM and LRM overhead.
+\subsection{Swift on thousands of cores}
 
-A resource provisioning system such as Falkon\cite{FALKON} or the
-CoG~\cite{CoG_2001} coaster mechanism developed for Swift can be used
-to ameliorate this overhead, by incurring the allocation overhead once
-per worker node.
+Systems such as the Sun Constellation~\cite{SunConstellation_2008} or
+IBM BlueGene/P~\cite{BGP_2008} have hundreds of thousands of cores,
+and systems with millions of cores are planned. Scheduling and
+managing tasks running at this scale is a challenging problem in
+itself and relies on the rapid submission of tasks as noted
+above. Swift applications currently do run on these systems by
+scheduling Coasters workers using the standard job submission
+techniques and employing an internal IP network.
 
-Both of these mechanisms can be plugged into Swift straightforwardly
-through the CoG provider API.
+\subsection{Filesystem access optimizations}
 
 Similarly, some applications deal with files that are uncomfortably
 small for GridFTP (on the order of tens of bytes). For this, a
 lightweight file access mechanism provided by CoG Coasters can be
-substituted for GridFTP.
+substituted for GridFTP. When running on HPC resources, the thousands
+of small accesses to the filesystem may create a bottleneck.  To
+approach this problem, we have investigated application needs and
+initiated a set of Collective Data Management (CDM)~\cite{CDM_2009}
+primitives to mitigate these problems.
 
 \subsection{Provenance}
 \label{Provenance}
 
 Swift produces log information regarding the provenance of its output files.
-In an existing development module, this information can be imported into
-relational and XML databases for later querying.
+In an existing development module, this information can be imported
+into relational and XML databases for later querying. Providing an
+efficient query mechanism for such provenance data is an area of
+ongoing research; whilst many queries can be easily answered
+efficiently by a suitably indexed relational or XML database, the lack
+of support for efficient transitive queries can make some common
+queries involving either transitivity over time (such as 'find all
+data derived from input file X') or over dataset containment (such as
+'find all procedures which took an input containing the file F')
+expensive to evaluate and awkward to express.
 
-Providing an efficient query mechanism for such provenance data is an area
-of ongoing research; whilst many queries can be easily answered efficiently
-by a suitably indexed relational or XML database, the lack of support for
-efficient transitive queries can make some common queries involving
-either transitivity over time (such as 'find all data derived from input
-file X') or over dataset containment (such as 'find all procedures which
-took an input containing the file F') expensive to evaluate and awkward
-to express.
+%% \subsection{GUI workflow design tools}
 
-TODO reference the VDC from VDS\cite{VDS}
+%% In contrast to a text-oriented programming language like SwiftScript,
+%% some scientists prefer to design simple programs using GUI design tools.
+%% An example of this is the LONI Pipeline tool\cite{LONIPIPELINE}. Preliminary
+%% investigations suggest that scientific workflows designed with that tool
+%% can be straightforwardly compiled into SwiftScript and thus benefit from
+%% Swift's execution system.
 
-\subsection{GUI workflow design tools}
+%% \subsection{Site selection research}
 
-In contrast to a text-oriented programming language like SwiftScript,
-some scientists prefer to design simple programs using GUI design tools.
-An example of this is the LONI Pipeline tool\cite{LONIPIPELINE}. Preliminary
-investigations suggest that scientific workflows designed with that tool
-can be straightforwardly compiled into SwiftScript and thus benefit from
-Swift's execution system.
+%%   TODO: data affinity between sites, based on our knowledge of what is
+%% already staged on each site
 
-\subsection{The IBM BG/P}
+%%   TODO: Is anything else interesting happening here in our group?
 
-TODO: hopefully Ioan will write some section that is interesting in this
-area.
+%% \subsection{Language development}
 
-  TODO: interesting from Swift perspective:
+%%   TODO: describe how it becomes more functional as time passes, as is
+%% becoming more popular. can ref mapreduce here\cite{MAPREDUCE} eg map
+%% operator extension - looks like foreach; and maybe some other
+%% popular-ish functional language eg F\#
 
-  1. getting things running at all: use of BG/P for loosely coupled
-tasks, which is a somewhat untraditional use of such a machine; lack of
-antive LRM that is anywhere near appropraite for that (pset granularity
-only, and only running one executable) - falkon as solution to this;
+%%   TODO type-inference - implemented by Milena but not put into
+%% production.
 
-decomposition of large machine into multiple Swift sites, with 1 pset =
-1 Swift site - how some of the problems related to running on multisite
-grids are sort-of similar to problems within the BG/P - hierarchical
-scheduling of of jobs and hierarchical management of data.
+%%   TODO libraries/code reuse - some traditional language stuff there but
+%% orthogonal to that is how to express transformation catalog (which ties
+%% together language declarations with site declarations, and hence makes
+%% procedures vs sites not completely orthogonal)
 
-  2. performance
-\subsection{Site selection research}
+%%   TODO unification of procedures and functions (a historical artifact),
+%%      and possibly of mappers
 
-  TODO: data affinity between sites, based on our knowledge of what is
-already staged on each site
+%% \subsection{Debugging}
 
-  TODO: Is anything else interesting happening here in our group?
+%%  TODO: debugging of distributed system - can have a non-futures section
+%% on what is available now - logprocessing module, as well as
+%% mentioning CEDPS\cite{CEDPS} as somewhat promising(?) for the future.
 
-\subsection{Collective IO}
+%% \subsection{Swift as a library}
+%% Could existing programs execute Swift calls through a library
+%% approach?  The answer to this is certainly ``yes''. (?)
 
-%% On large systems, the shared file system is commonly provided by
-%% GPFS\cite{GPFS}. This can scale well but when used na\"ively can
-%% exhibit pathological behaviour. Early versions of Swift triggered this
-%% behaviour by targetting too much file system activity at a single
-%% working directory, so that GPFS lock contention came to dominate execution
-%% time.
+%% \subsection{Swift library / source code management}
 
+%% (TODO benc: unclear what is meant by this paragraph. it was originally in the
+%% introduction, but as it appears to talk about something which does not (yet?)
+%% exist, then it is probably better being absorbed into the future section)
 
-\subsection{Language development}
+%%   Swift does not yet have a notion of libraries. Swift programs execute as
+%% if all procedures called in the script are present in a single logical
+%% source file and are thus passed to the Swift virtual machine all at once.
 
-  TODO: describe how it becomes more functional as time passes, as is
-becoming more popular. can ref mapreduce here\cite{MAPREDUCE} eg map
-operator extension - looks like foreach; and maybe some other
-popular-ish functional language eg F\#
-
-  TODO type-inference - implemented by Milena but not put into
-production.
-
-  TODO libraries/code reuse - some traditional language stuff there but
-orthogonal to that is how to express transformation catalog (which ties
-together language declarations with site declarations, and hence makes
-procedures vs sites not completely orthogonal)
-
-  TODO unification of procedures and functions (a historical artifact),
-     and possibly of mappers
-
-\subsection{Debugging}
-
- TODO: debugging of distributed system - can have a non-futures section
-on what is available now - logprocessing module, as well as
-mentioning CEDPS\cite{CEDPS} as somewhat promising(?) for the future.
-
-\subsection{Swift as a library}
-Could existing programs execute Swift calls through a library
-approach?  The answer to this is certainly ``yes''. (?)
-
-\subsection{Swift library / source code management}
-
-
-(TODO benc: unclear what is meant by this paragraph. it was originally in the
-introduction, but as it appears to talk about something which does not (yet?)
-exist, then it is probably better being absorbed into the future section)
-
-  Swift does not yet have a notion of libraries. Swift programs execute as
-if all procedures called in the script are present in a single logical
-source file and are thus passed to the Swift virtual machine all at once.
-
-
-
-\section{Implementation status}
-
-  TODO: list how Swift can be downloaded here. describe development group?
-
-active development group; releases roughly every 2 months.
-
 \section{Comparison to Other Systems}
 \label{Related}
 
@@ -1323,25 +1291,27 @@
 
 \begin{itemize}
 
-\item Programming model: MapReduce only supports key-value pairs as input
-or output datasets and two types of computation functions - map and
-reduce; where Swift provides a type system and allows the definition
-of complex data structures and arbitrary computation procedures.
+\item Programming model: MapReduce only supports key-value pairs as
+  input or output datasets and two types of computation functions -
+  map and reduce; where Swift provides a type system and allows the
+  definition of complex data structures and arbitrary computation
+  procedures.
 
-\item Data format: in MapReduce, input and output data can be of several
-different formats, and it is also possible to define new data
-sources. Swift provides a more flexible mapping mechanism to map
-between logical data structures and various physical representations.
+\item Data format: in MapReduce, input and output data can be of
+  several different formats, and it is also possible to define new
+  data sources. Swift provides a more flexible mapping mechanism to
+  map between logical data structures and various physical
+  representations.
 
 \item Dataset partition: Swift does not automatically partition input
-datasets. Instead, datasets can be organized in structures, and
-individual items in a dataset can be transferred accordingly along
-with computations.
+  datasets. Instead, datasets can be organized in structures, and
+  individual items in a dataset can be transferred accordingly along
+  with computations.
 
 \item Execution environment: MapReduce schedules computations within a
-cluster with shared Google File System, where Swift schedules across
-distributed Grid sites that may span multiple administrative domains,
-and deals with security and resource usage policy issues.
+  cluster with shared Google File System, where Swift schedules across
+  distributed Grid sites that may span multiple administrative
+  domains, and deals with security and resource usage policy issues.
 
 \end{itemize}
 
@@ -1375,12 +1345,12 @@
 the knowledge of the whole workflow graph, while in Swift, the
 structure of a workflow is constructed and expanded dynamically.
 
-Swift integrates the CoG Karajan workflow engine. Karajan provides the
-libraries and primitives for job scheduling, data transfer, and Grid
-job submission; Swift adds support for high-level abstract
-specification of large parallel computations, data abstraction, and
-workflow restart, reliable execution over multiple Grid sites, and
-(via Falkon and CoG coasters) fast job execution.
+Swift integrates with the CoG Karajan workflow engine. Karajan
+provides the libraries and primitives for job scheduling, data
+transfer, and Grid job submission; Swift adds support for high-level
+abstract specification of large parallel computations, data
+abstraction, and workflow restart, reliable execution over multiple
+Grid sites, and (via Falkon and CoG coasters) fast job execution.
 
 \section{Conclusion}
 \label{Conclusion}
@@ -1394,97 +1364,67 @@
 code that manipulates data directly. They contain instead the "data
 flow recipes" and input/output specifications of each program
 invocation such that the location and environment transparency goals
-can be implemented automatically by the Swift environment.
+can be implemented automatically by the Swift environment. This simple
+model has demonstrated many successes as a tool for scientific
+computing.
 
-TODO: Polish conclusion - was pasted here from intro and doesnt fit yet.
+\section{Implementation status}
 
+Swift is an open source project available at: \\
+{\tt http://www.ci.uchicago.edu/swift}.
+
 \section{Acknowledgments}
 
 TODO: NSF/DOE grant acknowledgements
 
-\section{TODO}
+%% \section{TODO}
 
-  Reference Swift as a follow-on project to VDL in VDS; how does XDTM fit
-    into this? Is it of any interest other than as part of the
-    project history? And is history of this project interesting? maybe so...
+%%   Reference Swift as a follow-on project to VDL in VDS; how does XDTM fit
+%%     into this? Is it of any interest other than as part of the
+%%     project history? And is history of this project interesting? maybe so...
 
 
-  Acknowledgement of all developers names?
+%%   Acknowledgement of all developers names?
 
-  info logs and kickstart logs
+%%   info logs and kickstart logs
 
-  relation to: karajan, falkon, java cog, globus needs more clearly
-defining; specifically for CoG, need to declare that it builds on top
-of that; relation to old old VDL2 papers (eg that Yong was on...)
+%%   relation to: karajan, falkon, java cog, globus needs more clearly
+%% defining; specifically for CoG, need to declare that it builds on top
+%% of that; relation to old old VDL2 papers (eg that Yong was on...)
 
-   some dude (it was Xu Du) did some stuff about BOINC - that could have a one-liner
-if it was actually written up somewhere; otherwise ignore.
-Not likely that it was written up but I will ask. (mike)
+%%    some dude (it was Xu Du) did some stuff about BOINC - that could have a one-liner
+%% if it was actually written up somewhere; otherwise ignore.
+%% Not likely that it was written up but I will ask. (mike)
 
 
-   performance: application tuning graphs; provisioning and coaster
-file access (give one-liner numbers for those); file system layout
-tuning to accomodate GPFS - can make before/after one-liners for that
-quite easily
+%%    performance: application tuning graphs; provisioning and coaster
+%% file access (give one-liner numbers for those); file system layout
+%% tuning to accomodate GPFS - can make before/after one-liners for that
+%% quite easily
 
-people who have thus far contributed directly to this written paper:
-me, wilde
+%% people who have thus far contributed directly to this written paper:
+%% me, wilde
 
-people who have thus far contributed to the Swift work described here:
-Swift core: me, wilde, hategan, milena, yong, ian
-CNARI: skenny
-OSG: mats
-Site selection: xi li, ragib
-App installed: Zhengxiong Howe
-Falkon: Ioan, zhao
-Collective IO: allan, zhao, ioan
+%% people who have thus far contributed to the Swift work described here:
+%% Swift core: me, wilde, hategan, milena, yong, ian
+%% CNARI: skenny
+%% OSG: mats
+%% Site selection: xi li, ragib
+%% App installed: Zhengxiong Howe
+%% Falkon: Ioan, zhao
+%% Collective IO: allan, zhao, ioan
 
-Users: Uri, Kubal, Hocky, UMD Student, ...
+%% Users: Uri, Kubal, Hocky, UMD Student, ...
 
-more explicit mapper description should include table of all/common mappers
+%% more explicit mapper description should include table of all/common mappers
 
-ramble about separation of parallel execution concerns and dataflow spec
-in the same way that gph has a separation of same concerns... compare contrast
+%% ramble about separation of parallel execution concerns and dataflow spec
+%% in the same way that gph has a separation of same concerns... compare contrast
 
-%\begin{thebibliography}{99}
-%\bibitem{CEDPS} - cedps
-%
-%\bibitem{MONOTONICPHD} - the phd on distributed language that defines the term 'monotonic' - although maybe it comes from elsewhere
-%
-%\bibitem{GLOBUS} globus toolkit
-%
-%\bibitem{GRAM} gram
-%
-%\bibitem{GridFTP} - gridftp
-%
-%\bibitem{TCP} - tcp
-%
-%\bibitem{CNARI} - something about the cnari paper
-%
-%\bibitem{FALKON} - falkon
-%
-%\bibitem{COG} - cog
-%
-%\bibitem{VDS} - VDS
-%
-%\bibitem{LONIPIPELINE} - loni pipeline
-%
-%\bibitem{MAPREDUCE} - mapreduce
-%
-%\bibitem{TERAGRID} - teragrid
-%
-%\bibitem{OSG} - open science grid
-%
-%\bibitem{ReSS} - ress
-%
-%\bibitem{GPFS} - GPFS
-%
-%\end{thebibliography}
-
 \bibliographystyle{elsarticle-num}
 \bibliography{paper,Wozniak} % for ACM SIGS style
 
-\verb|$Id$|
+% \verb|$Id$|
 
 \end{document}