[Swift-commit] r3855 - text/parco10submission
noreply at svn.ci.uchicago.edu
Wed Jan 5 13:37:04 CST 2011
Author: wilde
Date: 2011-01-05 13:37:04 -0600 (Wed, 05 Jan 2011)
New Revision: 3855
Modified:
text/parco10submission/paper.tex
Log:
Edits to abstract and intro.
Modified: text/parco10submission/paper.tex
===================================================================
--- text/parco10submission/paper.tex 2011-01-05 19:28:07 UTC (rev 3854)
+++ text/parco10submission/paper.tex 2011-01-05 19:37:04 UTC (rev 3855)
@@ -137,6 +137,7 @@
parallel execution of application programs. Swift scripts are location-independent and automatically parallelized by exploiting the maximal concurrency permitted by their data dependencies and by resource availability.
As a language, Swift is simpler than most scripting languages because it does not replicate the capabilities that existing scripting languages like Perl, Python, and shells do very well, but instead makes it easy to call such scripts as small applications.
+% say: it has fewer statements, limited data types and a compact library of useful support primitives. It can be extended using built-in functions coded in Java, and by mappers coded as Java built-ins or as external scripts. These functions execute in parallel as part of expression evaluation in the same manner as externally called application programs or scripts do.
Swift can execute scripts that perform tens of thousands of program
invocations on highly parallel resources, and handle the unreliable
and dynamic aspects of wide-area distributed resources. Such issues are handled by Swift's runtime system, and are not manifest in the user's scripts.
@@ -171,33 +172,42 @@
%%% \begin{msection}
+% said already:
The main goal of Swift is to allow the composition of coarse-grained
processes, and to parallelize and manage the execution of scripts
on distributed collections of parallel resources.
-Swift is implicitly parallel and distributed, in that the user does not explicitly code parallel behavior, nor is any knowledge of runtime execution locations encoded into a Swift script. The function model on which Swift is based ensures that execution of Swift scripts is deterministic, thus simplifying the scripting process.
+%keep:
+Swift is implicitly parallel and distributed, in that the user does not explicitly code parallel behavior, synchronization, or mutual exclusion, and does not code explicit transfer of files to and from the execution sites of jobs. In fact, no knowledge of runtime execution locations is directly specified in a Swift script. The function model on which Swift is based ensures that execution of Swift scripts is deterministic, thus simplifying the scripting process.
-Having the results of a Swift script be independent of the way the processes
-are parallelized implies that the processes must, for the same input,
+%adjust: address degrees of determinism
+Having the results of a Swift script be independent of the way that its function invocations
+are parallelized implies that the functions must, for the same input,
produce the same output, irrespective of the time, order or location in
which they are ``executed''. This characteristic is reminiscent of
referential transparency, and one may readily extend the concept to
encompass arbitrary processes without difficulty.
+%keep: discuss kthread/function duality; don't confuse with the parameter issue?
Swift enables users to specify process composition by representing processes as functions, with input data files and process parameters becoming function parameters and output data files becoming function return values.
-
+%keep
The exact number of processing units available on such shared resources
varies with time. In order to take advantage of as many processing units
as possible during the execution of a Swift program, it is necessary to
be flexible in the way the execution of individual processes is
parallelized.
+% consider: where to define kthreads and jthreads; how to describe the function/process duality; where to discuss the implementation
+
+%keep: how best to state process/function duality? Each invocation of a function is a process; all functions run in parallel; foreach loops are unfolded and run in parallel; essentially the entire program is unfolded. (Note: iterate stops this behavior and is thus useful; address scalability issues of this and future graph partitioning; how throttling keeps this manageable.)
This duality allows the formal specification of process behavior. In the following Swift statement, the semantics are defined in terms of the specification of the function
-``rotate'' when supplied with specific parameter types.:
+``rotate'' when supplied with specific parameter types:
\begin{verbatim}
rotatedImage = rotate(image, angle);
\end{verbatim}
+%Q: should we have any code examples in the intro? eg: 1 call, 1 foreach?
+
\hide{
% and whether the
% implementation can be described as a ``library call'' or a ``program
@@ -209,33 +219,41 @@
applications. They can equally consist of library calls or functions
written in Swift itself, as long as they are side-effect free.
-%A soft
-%restriction arises from the desire to distribute the execution of
+%A soft restriction arises from the desire to distribute the execution of
%functions across a collection of heterogeneous resources, which, with
%the advent of projects such as TeraGrid, suggests an implementation in
%which functions are applications readily executable on them through the
%careful employment of grid middleware.
}
+%keep:
+Note that some Swift scripts are specified as library calls.
+
+%decide: is referential transparency relevant?
Having established the constraint that Swift functions must in general
be referentially transparent, and in order to preserve referential
transparency at different levels of abstraction within the language, it
follows that the appropriate form for the Swift language is functional.
+%keep: discuss determinism, side effects, referential transparency, and interleaving???
+%I think the KEY aspect of "functional" is (a) in-out tracking for distributability and side effect management and (b) the write-once-future model for all data.
+
We choose to make the Swift language purely functional (i.e., we disallow
side effects in the language) in order to prevent the difficulties that
arise from having to track side effects to ensure determinism in complex
concurrency scenarios.
+%discuss: is lazy vs eager relevant? What does it really mean to Swift?
Functional programming allows consistent
implementations of evaluation strategies different from the widespread
eager evaluation, as seen in lazily evaluated languages
such as Haskell \cite{Haskell}.
-
+
+%Keep: KEY:
Automatic
parallelization in Swift is based on the synchronization construct of \emph{futures}\cite{Futures}, which
results in eager parallelism. Every Swift variable (including every member of structures and arrays) is a write-once future.
-% In this process, we trade the ability to efficiently deal with infinite structures for the ability to minimize computation time.
+% consider: In this process, we trade the ability to efficiently deal with infinite structures for the ability to minimize computation time. I think this pertains to the "unroll everything" strategy.
Using a futures-based evaluation strategy has an enormous benefit:
automatic parallelization is achieved without the need for
dependency analysis, which would significantly complicate the Swift implementation.
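
Editorial note (not part of the commit): the futures-based evaluation described in the last hunk can be illustrated with a minimal Python sketch. This is an analogy under stated assumptions, not Swift's implementation. The helpers `call` and `resolve` and the toy `combine` function are hypothetical names introduced here; only `rotate` echoes the paper's example. Every invocation starts eagerly and returns a future, and a task blocks only when it needs an input that is not yet available, so independent calls overlap without any separate dependency analysis.

```python
from concurrent.futures import Future, ThreadPoolExecutor

# Hypothetical worker pool standing in for the available compute resources.
pool = ThreadPoolExecutor(max_workers=8)


def resolve(value):
    """Return a plain value, waiting first if the value is still a future."""
    return value.result() if isinstance(value, Future) else value


def call(fn, *args):
    """Start fn eagerly and return a future for its result.

    Arguments may be plain values or futures. The task waits for its own
    inputs, so ordering falls out of the data flow alone.
    """
    return pool.submit(lambda: fn(*[resolve(a) for a in args]))


# Toy side-effect-free "applications" (assumed for illustration).
def rotate(image, angle):
    return f"{image}@{angle}"


def combine(a, b):
    return f"({a}+{b})"


r1 = call(rotate, "img", 90)    # independent of r2, may run concurrently
r2 = call(rotate, "img", 180)
out = call(combine, r1, r2)     # proceeds once both inputs are set

print(out.result())  # (img@90+img@180)
```

Because each future is assigned exactly once and the toy functions have no side effects, the printed result does not depend on which rotation finishes first, which mirrors the determinism argument made in the text.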