[Swift-commit] r2408 - text/hpdc09submission
noreply at svn.ci.uchicago.edu
Fri Jan 9 03:19:19 CST 2009
Author: benc
Date: 2009-01-09 03:19:18 -0600 (Fri, 09 Jan 2009)
New Revision: 2408
Modified:
text/hpdc09submission/paper.latex
Log:
add skenny coauthor; tidy up import artifacts (citations and non-ascii symbols) in related work section
Modified: text/hpdc09submission/paper.latex
===================================================================
--- text/hpdc09submission/paper.latex 2009-01-09 08:22:01 UTC (rev 2407)
+++ text/hpdc09submission/paper.latex 2009-01-09 09:19:18 UTC (rev 2408)
@@ -27,7 +27,9 @@
\and
\alignauthor Michael Wilde \\
\affaddr{University of Chicago Computation Institute}\\
- \affaddr{Argonne National Laboratory}
+ \affaddr{Argonne National Laboratory} \\
+\alignauthor Sarah Kenny \\
+ \affaddr{University of Chicago Computation Institute}\\
}
\maketitle
@@ -1263,7 +1265,7 @@
\section{Comparison to Other Systems}
Coordination languages and systems such as Linda\cite{LINDA},
-Strand\cite{STRAN} and PCN\cite{PCN} [11] allow composition of
+Strand\cite{STRAN} and PCN\cite{PCN} allow composition of
distributed or parallel components, but usually require the components
to be programmed in specific languages and linked with the systems;
where we need to coordinate procedures that may already exist (e.g.,
@@ -1272,22 +1274,22 @@
coordination primitives for concurrent agents to put and retrieve
tuples from a shared data space called tuplespace, which serves as the
medium for communication and coordination. Strand and PCN use
-single-assignment variables [7] as coordination mechanism. Like Linda,
+single-assignment variables\cite{singleassigment} as coordination mechanism. Like Linda,
Strand and PCN are data driven in the sense that the action of sending
and receiving data are decoupled, and processes execute only when data
are available. The Swift system uses similar mechanism called future
[16] for workflow evaluation and scheduling.
-MapReduce [8] also provides a programming models and a runtime system
+MapReduce\cite{MapReduce} also provides a programming models and a runtime system
to support the processing of large scale datasets. The two key
-functions “map” and “reduce” are borrowed from functional language: a
+functions \emph{map} and \emph{reduce} are borrowed from functional language: a
map function iterates over a set of items, performs a specific
operation on each of them and produces a new set of items, where a
reduce function performs aggregation on a set of items. The runtime
system automatically partitions input data and schedules the execution
of programs in a large cluster of commodity machines. The system is
made fault tolerant by checking worker nodes periodically and
-reassigning failed jobs to other worker nodes. Sawzall [22] is an
+reassigning failed jobs to other worker nodes. Sawzall\cite{sawzall} is an
interpreted language that builds on MapReduce and separates the
filtering and aggregation phases for more concise program
specification and better parallelization.
@@ -1301,7 +1303,7 @@
\begin{itemize}
\item Programming model: MapReduce only supports key-value pairs as input
-or output datasets and two types of computation functions – map and
+or output datasets and two types of computation functions - map and
reduce; where Swift provides a type system and allows the definition
of complex data structures and arbitrary computation procedures.
@@ -1322,28 +1324,28 @@
\end{itemize}
-BPEL [5] is a Web Service-based standard that specifies how a set of
+BPEL\cite{BPEL} is a Web Service-based standard that specifies how a set of
Web services interact to form a larger, composite Web Service. BPEL is
-starting to be tested in scientific contexts [10]. While BPEL can
+starting to be tested in scientific contexts\cite{BPELScience}. While BPEL can
transfer data as XML messages, for very large scale datasets, data
exchange must be handled via separate mechanisms. In BPEL 1.0
specification, it does not have support for dataset
iterations. According to Emmerich et al, an application with
repetitive patterns on a collection of datasets could result in a BPEL
document of 200MB in size, and BPEL is cumbersome if not impossible to
-write for computational scientists [10]. Although BPEL can use XML
+write for computational scientists\cite{BPEL2}. Although BPEL can use XML
Schema to describe data types, it does not provide support for mapping
between a logical XML view and arbitrary physical representations.
-DAGMan [6] provides a workflow engine that manages Condor jobs
+DAGMan\cite{DAGman} provides a workflow engine that manages Condor jobs
organized as directed acyclic graphs (DAGs) in which each edge
-corresponds to an explicit task precedence. . It has no knowledge of
+corresponds to an explicit task precedence. It has no knowledge of
data flow, and in distributed environment works best with a
higher-level, data-cognizant layer. It is based on static workflow
graphs and lacks dynamic features such as iteration or conditional
execution, although these features are being researched.
-Pegasus [9] is primarily a set of DAG transformers. Pegasus planners
+Pegasus\cite{Pegasus} is primarily a set of DAG transformers. Pegasus planners
translate a workflow graph into a location specific DAGMan input file,
adding stages for data staging, inter-site transfer and data
registration. They can prune tasks for files that already exist,
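
Note on the related-work text in the diff above: it describes map as applying an operation to each item of a set to produce a new set, and reduce as aggregating a set of items. The following is only a minimal single-process sketch of that idea, not the MapReduce runtime (no partitioning, scheduling, or fault tolerance) and not code from the paper. The names map_items and reduce_items and the sample data are hypothetical, chosen for illustration.

```python
from functools import reduce


def map_items(operation, items):
    """Apply `operation` to every item and return the new list of items."""
    return [operation(item) for item in items]


def reduce_items(aggregate, items, initial):
    """Fold `items` into a single value using the `aggregate` function."""
    return reduce(aggregate, items, initial)


# Hypothetical sample data: map each name to its length, then sum the lengths.
word_lengths = map_items(len, ["swift", "linda", "pcn"])
total_length = reduce_items(lambda acc, n: acc + n, word_lengths, 0)
```

Because the operation passed to map_items acts on each item independently, a runtime is free to partition the input and run the map step in parallel, which is the property the text attributes to MapReduce.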