[Swift-commit] r2404 - text/hpdc09submission

noreply at svn.ci.uchicago.edu
Thu Jan 8 07:23:37 CST 2009


Author: wilde
Date: 2009-01-08 07:23:36 -0600 (Thu, 08 Jan 2009)
New Revision: 2404

Modified:
   text/hpdc09submission/paper.bib
   text/hpdc09submission/paper.latex
Log:
Added "Related Work" section - taken verbatim from rejected FCGS paper.
Added a few refs.


Modified: text/hpdc09submission/paper.bib
===================================================================
--- text/hpdc09submission/paper.bib	2009-01-08 08:02:09 UTC (rev 2403)
+++ text/hpdc09submission/paper.bib	2009-01-08 13:23:36 UTC (rev 2404)
@@ -11,6 +11,34 @@
   Management}
 }
 
+@incollection{GCRPNOVA,
+  author = {Yong Zhao and Ioan Raicu and Ian Foster and Mihael Hategan and Veronika Nefedova and Mike Wilde},
+  title = {{Scalable and Reliable Scientific Computations in Grid Environments}},
+  booktitle = {Grid Computing Research Progress},
+  isbn = {978-1-60456-404-4},
+  pages = {TODO},
+  publisher = {Nova Science Publishers},
+  year = 2008,
+  editor = {TODO},
+  url = {http://people.cs.uchicago.edu/~iraicu/publications/2008_NOVA08_book-chapter_Swift.pdf},
+}
+
+@inproceedings{SWIFTIWSW2007,
+  author = {Yong Zhao and Mihael Hategan and Ben Clifford and Ian Foster and Gregor von Laszewski and Ioan Raicu and Tiberiu Stef-Praun and Michael Wilde},
+  title = {{Swift: Fast, Reliable, Loosely Coupled Parallel Computation}},
+  year = 2007,
+  booktitle = {IEEE International Workshop on Scientific Workflows}
+}
+
+@article{LINDA,
+  title = {{Linda and Friends}},
+  author = {S. Ahuja and N. Carriero and D. Gelernter},
+  journal = {{IEEE Computer}},
+  volume = {19},
+  number = {8},
+  year = 1986,
+  pages = {26--34}
+}
+
 @article{CEDPS,
  title = {{CEDPS}},
  author = {John Smith and Jane Doe},

Modified: text/hpdc09submission/paper.latex
===================================================================
--- text/hpdc09submission/paper.latex	2009-01-08 08:02:09 UTC (rev 2403)
+++ text/hpdc09submission/paper.latex	2009-01-08 13:23:36 UTC (rev 2404)
@@ -85,7 +85,7 @@
 ``workflows'' - which we define here as the execution of a
 series of steps to perform larger domain-specific tasks. We use the
 term workflow as defined by (Taylor et al. 2006). So we often call a
-Swift script a workflow. FIXME: Drop this paragraph/concept? Or crisp it up.
+Swift script a workflow. TODO: Drop this paragraph/concept? Or crisp it up.
 
 \subsection{Swift language concepts}
 
@@ -114,7 +114,7 @@
 
 In the rest of this section, we provide an overview of Swift's main
 concepts. Each concept is elaborated, with examples, in subsequent
-sections. [FIXME: we will need to adjust between how much to specify
+sections. [TODO: we will need to adjust between how much to specify
 here, and how much to state just before each construct is introduced.
 
 \emph{Dataset typing and mapping model}. Swift provides for the high level
@@ -168,7 +168,7 @@
 composed as pipelines of sub-functions. The basic structure of a
 composite function is a graph of calls to other functions.
 
-Recursive function calls [are / are not] supported  [Relevant? FIXME]
+Recursive function calls [are / are not] supported  [TODO: Relevant?]
 
 \emph{Variables, single assignment and data flow}. Swift variables hold
 primitive values, or references to datasets, which are files or
@@ -190,7 +190,7 @@
 
 \subsection{Rationale for creating Swift}
 
-\emph{FIXME: This section needs much polishing/condensing.}
+\emph{TODO: This section needs much polishing/condensing.}
 
 Why do we need Swift? Why create yet another scripting
 language for the execution of application programs when so many exist?
@@ -248,9 +248,9 @@
 Usage of Swift. Swift is achieving growing use on a variety of science
 problems.
 
-... FIXME: provide details.
+... TODO: provide details.
 
-In the remainder of this paper, FIXME ... we present the language,
+TODO: In the remainder of this paper, ... we present the language,
 details of the implementation, application use-cases and ongoing
 research.
 
@@ -928,9 +928,9 @@
 Another app: Rosetta on OSG? OSG was designed with a focus on
 heterogeneity between sites. Large number of sites; automatic site file
 selection; and automatic app deployment there.
-	
-\section{Swift as a framework for ongoing and experimental work}
 
+\section{Usage Experience}
+
 \subsection{Use on large numbers of sites in the Open Science Grid}
 
 TODO: get Mats to comment on this section...?
@@ -962,6 +962,8 @@
 cases. However, continued discovery of unusual failure modes drives
 the implementation of ever more fault tolerance mechanisms.
 
+\subsection{Automating Application Deployment}
+
 When running jobs on dynamically discovered sites, it is likely that
 component programs are not installed on those sites.
 
@@ -989,11 +991,13 @@
   }
 \end{verbatim}
 
-TODO: dude in CI SWFT group has also done stuff about application
-stagein - this could be mentioned, if I knew more about it...
+TODO: Zhengxiong Hou has also done work on application
+stage-in - this could be mentioned (see Zhengxiong's email and paper).
 
 TODO: what's the conclusion (if any) of this section?
 
+\section{Swift as a framework for ongoing and experimental work}
+
 \subsection{Automatic characterisation of site and application behaviour}
 
 TODO The replication mechanism is the beginning of this - but there is scope
@@ -1095,6 +1099,107 @@
 
 active development group; releases roughly every 2 months.
 
+\section{Comparison to Other Systems}
+
+Coordination languages and systems such as Linda\cite{LINDA},
+Strand\cite{STRAN} and PCN\cite{PCN} allow composition of
+distributed or parallel components, but usually require the components
+to be programmed in specific languages and linked with those systems,
+whereas we need to coordinate procedures that may already exist (e.g.,
+legacy applications), were coded in various programming languages, and
+run on different platforms and architectures.  Linda defines a set of
+coordination primitives for concurrent agents to put and retrieve
+tuples from a shared data space called the tuplespace, which serves as
+the medium for communication and coordination. Strand and PCN use
+single-assignment variables [7] as their coordination mechanism. Like
+Linda, Strand and PCN are data driven in the sense that the actions of
+sending and receiving data are decoupled, and processes execute only
+when data are available. The Swift system uses a similar mechanism,
+futures [16], for workflow evaluation and scheduling.
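+
+As a brief illustration of futures in Swift (a minimal sketch; the
+procedures \texttt{produce} and \texttt{consume} and their underlying
+application programs are hypothetical), the single-assignment variable
+\texttt{a} below acts as a future: \texttt{consume} is dispatched only
+once \texttt{produce} has completed and \texttt{a} has been assigned:
+
+\begin{verbatim}
+type file;
+
+(file o) produce() {
+    app { producer stdout=@filename(o); }
+}
+
+(file o) consume(file i) {
+    app { consumer @filename(i) stdout=@filename(o); }
+}
+
+file a <"a.dat">;
+file b <"b.dat">;
+a = produce();   // a is unset until produce completes
+b = consume(a);  // data-driven: runs only when a is available
+\end{verbatim}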
+
+MapReduce [8] also provides a programming model and a runtime system
+to support the processing of large-scale datasets. The two key
+functions, ``map'' and ``reduce'', are borrowed from functional
+languages: a map function iterates over a set of items, performs a
+specific operation on each of them, and produces a new set of items,
+while a reduce function performs an aggregation over a set of
+items. The runtime system automatically partitions input data and
+schedules the execution of programs on a large cluster of commodity
+machines. The system is made fault tolerant by checking worker nodes
+periodically and reassigning failed jobs to other worker
+nodes. Sawzall [22] is an interpreted language that builds on
+MapReduce and separates the filtering and aggregation phases for more
+concise program specification and better parallelization.
+
+Swift and MapReduce/Sawzall share the same goal of providing a
+programming tool for the specification and execution of large parallel
+computations on large quantities of data, facilitating the
+utilization of large distributed resources. However, the two also
+differ in many aspects:
+
+\begin{itemize}
+
+\item Programming model: MapReduce only supports key-value pairs as
+input or output datasets and two types of computation functions, map
+and reduce, whereas Swift provides a type system and allows the
+definition of complex data structures and arbitrary computation
+procedures (see the sketch after this list).
+
+\item Data format: in MapReduce, input and output data can be in
+several different formats, and it is also possible to define new data
+sources. Swift provides a more flexible mapping mechanism to map
+between logical data structures and various physical representations.
+
+\item Dataset partitioning: Swift does not automatically partition
+input datasets. Instead, datasets can be organized in structures, and
+individual items in a dataset can be transferred along with the
+corresponding computations.
+
+\item Execution environment: MapReduce schedules computations within
+a cluster with a shared Google File System, whereas Swift schedules
+across distributed Grid sites that may span multiple administrative
+domains, and deals with the security and resource usage policy issues
+that this entails.
+
+\end{itemize}
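+
+To illustrate the first difference (a minimal sketch; the
+\texttt{image} and \texttt{result} types and the \texttt{analyze}
+application are hypothetical), a Swift foreach loop applies an
+arbitrary procedure over a typed array of mapped files, with no
+restriction to key-value pairs:
+
+\begin{verbatim}
+type image;
+type result;
+
+(result r) analyze(image i) {
+    app { analyze @filename(i) @filename(r); }
+}
+
+image imgs[] <filesys_mapper; pattern="*.img">;
+result res[] <simple_mapper; suffix=".res">;
+
+foreach img, ix in imgs {
+    res[ix] = analyze(img);  // iterations run in parallel
+}
+\end{verbatim}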
+
+BPEL [5] is a Web Service-based standard that specifies how a set of
+Web services interact to form a larger, composite Web service. BPEL
+is starting to be tested in scientific contexts [10]. While BPEL can
+transfer data as XML messages, for very large datasets data
+exchange must be handled via separate mechanisms. The BPEL 1.0
+specification does not support dataset iteration. According to
+Emmerich et al., an application with repetitive patterns over a
+collection of datasets could result in a BPEL document 200MB in
+size, and BPEL is cumbersome if not impossible for computational
+scientists to write by hand [10]. Although BPEL can use XML Schema
+to describe data types, it does not provide support for mapping
+between a logical XML view and arbitrary physical representations.
+
+DAGMan [6] provides a workflow engine that manages Condor jobs
+organized as directed acyclic graphs (DAGs) in which each edge
+corresponds to an explicit task precedence. It has no knowledge of
+data flow, and in a distributed environment works best with a
+higher-level, data-cognizant layer. It is based on static workflow
+graphs and lacks dynamic features such as iteration and conditional
+execution, although these features are being researched.
+
+Pegasus [9] is primarily a set of DAG transformers. Pegasus planners
+translate a workflow graph into a location-specific DAGMan input file,
+adding stages for data staging, inter-site transfer, and data
+registration. They can prune tasks for files that already exist,
+select sites for jobs, and cluster jobs based on various
+criteria. Pegasus performs graph transformation with knowledge of
+the whole workflow graph, whereas in Swift the structure of a workflow
+is constructed and expanded dynamically.
+
+Swift integrates the CoG Karajan workflow engine. Karajan provides the
+libraries and primitives for job scheduling, data transfer, and Grid
+job submission; Swift adds support for high-level abstract
+specification of large parallel computations, data abstraction, and
+workflow restart, and also (via Falkon) fast, reliable execution over
+multiple Grid sites.
+
+\section{Future Work}
+
 \section{Acknowledgements}
 
 TODO: authors beyond number 3 go here according to ACM style guide, rather



