[Swift-commit] r2408 - text/hpdc09submission

Fri Jan 9 03:19:19 CST 2009

Author: benc
Date: 2009-01-09 03:19:18 -0600 (Fri, 09 Jan 2009)
New Revision: 2408

Modified:
   text/hpdc09submission/paper.latex
Log:
add skenny coauthor; tidy up import artifacts (citations and non-ascii symbols) in related work section

Modified: text/hpdc09submission/paper.latex
===================================================================

--- text/hpdc09submission/paper.latex	2009-01-09 08:22:01 UTC (rev 2407)
+++ text/hpdc09submission/paper.latex	2009-01-09 09:19:18 UTC (rev 2408)
@@ -27,7 +27,9 @@
 \and
 \alignauthor Michael Wilde \\
        \affaddr{University of Chicago Computation Institute}\\
-       \affaddr{Argonne National Laboratory}
+       \affaddr{Argonne National Laboratory} \\
+\alignauthor Sarah Kenny \\
+       \affaddr{University of Chicago Computation Institute}\\
 }
 
 \maketitle
@@ -1263,7 +1265,7 @@
 \section{Comparison to Other Systems}
 
 Coordination languages and systems such as Linda\cite{LINDA},
-Strand\cite{STRAN} and PCN\cite{PCN} [11] allow composition of
+Strand\cite{STRAN} and PCN\cite{PCN} allow composition of
 distributed or parallel components, but usually require the components
 to be programmed in specific languages and linked with the systems,
 whereas we need to coordinate procedures that may already exist (e.g.,
@@ -1272,22 +1274,22 @@
 coordination primitives for concurrent agents to put and retrieve
 tuples from a shared data space called tuplespace, which serves as the
 medium for communication and coordination. Strand and PCN use
-single-assignment variables [7] as a coordination mechanism. Like Linda,
+single-assignment variables\cite{singleassigment} as a coordination mechanism. Like Linda,
 Strand and PCN are data driven in the sense that the actions of sending
 and receiving data are decoupled, and processes execute only when data
 are available. The Swift system uses a similar mechanism called futures
 [16] for workflow evaluation and scheduling.
 
-MapReduce [8] also provides a programming model and a runtime system
+MapReduce\cite{MapReduce} also provides a programming model and a runtime system
 to support the processing of large-scale datasets. The two key
-functions “map” and “reduce” are borrowed from functional languages: a
+functions \emph{map} and \emph{reduce} are borrowed from functional languages: a
 map function iterates over a set of items, performs a specific
 operation on each of them and produces a new set of items, whereas a
 reduce function performs aggregation on a set of items. The runtime
 system automatically partitions input data and schedules the execution
 of programs in a large cluster of commodity machines. The system is
 made fault tolerant by checking worker nodes periodically and
-reassigning failed jobs to other worker nodes. Sawzall [22] is an
+reassigning failed jobs to other worker nodes. Sawzall\cite{sawzall} is an
 interpreted language that builds on MapReduce and separates the
 filtering and aggregation phases for more concise program
 specification and better parallelization.
@@ -1301,7 +1303,7 @@
 \begin{itemize}
 
 \item Programming model: MapReduce only supports key-value pairs as input
-or output datasets and two types of computation functions – map and
+or output datasets and two types of computation functions - map and
 reduce, whereas Swift provides a type system and allows the definition
 of complex data structures and arbitrary computation procedures.
 
@@ -1322,28 +1324,28 @@
 
 \end{itemize}
 
-BPEL [5] is a Web Service-based standard that specifies how a set of
+BPEL\cite{BPEL} is a Web Service-based standard that specifies how a set of
 Web services interact to form a larger, composite Web Service. BPEL is
-starting to be tested in scientific contexts [10]. While BPEL can
+starting to be tested in scientific contexts\cite{BPELScience}. While BPEL can
 transfer data as XML messages, for very large-scale datasets, data
 exchange must be handled via separate mechanisms. The BPEL 1.0
 specification does not support iteration over
 datasets. According to Emmerich et al., an application with
 repetitive patterns on a collection of datasets could result in a BPEL
 document of 200 MB in size, and BPEL is cumbersome if not impossible to
-write for computational scientists [10]. Although BPEL can use XML
+write for computational scientists\cite{BPEL2}. Although BPEL can use XML
 Schema to describe data types, it does not provide support for mapping
 between a logical XML view and arbitrary physical representations.
 
-DAGMan [6] provides a workflow engine that manages Condor jobs
+DAGMan\cite{DAGman} provides a workflow engine that manages Condor jobs
 organized as directed acyclic graphs (DAGs) in which each edge
-corresponds to an explicit task precedence. . It has no knowledge of
+corresponds to an explicit task precedence. It has no knowledge of
 data flow, and in a distributed environment works best with a
 higher-level, data-cognizant layer. It is based on static workflow
 graphs and lacks dynamic features such as iteration or conditional
 execution, although these features are being researched.
 
-Pegasus [9] is primarily a set of DAG transformers. Pegasus planners
+Pegasus\cite{Pegasus} is primarily a set of DAG transformers. Pegasus planners
 translate a workflow graph into a location-specific DAGMan input file,
 adding stages for data staging, inter-site transfer and data
 registration. They can prune tasks for files that already exist,