[Swift-commit] r3868 - text/parco10submission

Wed Jan 5 23:31:35 CST 2011

Author: wilde
Date: 2011-01-05 23:31:34 -0600 (Wed, 05 Jan 2011)
New Revision: 3868

Modified:
   text/parco10submission/paper.tex
Log:
Many edits to the Language section.

Modified: text/parco10submission/paper.tex
===================================================================

--- text/parco10submission/paper.tex	2011-01-06 01:10:54 UTC (rev 3867)
+++ text/parco10submission/paper.tex	2011-01-06 05:31:34 UTC (rev 3868)
@@ -365,6 +365,14 @@
 
 Variables are used in Swift to name the local variables, arguments, and returns of a function. Every Swift variable is assigned a concrete data type, based on a very simple type model (with no concepts of inheritance, abstraction, etc). The outermost function in a Swift (akin to ``main'' in C) is only unique in that the variables in its environment can be declared ``global'' to make them accessible to every other function in the script.
 
+Swift data elements (atomic variables and array elements) are \emph{single-assignment}:
+they behave as futures and can be assigned at most one value during execution.
+This semantic provides the
+basis for Swift's model of parallel function evaluation and chaining.
+While Swift arrays and structures are not
+single-assignment, each of their elements is.
+
+Each variable in a Swift script is declared to be of a specific (single) type.
 Swift provides three basic classes of data types:
 
 \emph{Primitive types} are provided for integer, float, string, and boolean values by the Swift runtime. Common operators are defined for
@@ -373,78 +381,85 @@
 
 \emph{Mapped types} are data elements that refer, through a process called``mapping'' to files external to the Swift script. These are the files that will be read and written by the external application programs called by Swift.
 The mapping process can map single variables to single files, and structures and arrays to collections of files.
-
 Primitive and mapped types are called \emph{atomic types}.
 
-\hide{Swift mapped types can be
-seen as generalizations of reference types in traditional languages in
-that reference types are language representations of data stored in
-internal memory, in contrast with primitive (value) types for which no
-explicit storage is generally specified.}
+\emph{Collection types} are provided in Swift by \emph{arrays} and \emph{structures}.
+Structure fields can be of any type, while arrays contain only uniform values of a single type. One
+array type is provided for every atomic type (integer, string, boolean, and file reference).
+Arrays use numeric
+indices, but are sparse.  
+Both types of collections can contain members of atomic or collection types. Structures contain a finite number of elements. Arrays contain a varying number of elements. Structures and arrays can both recursively reference other structures and arrays in addition to atomic values. Arrays can be nested to provide multi-dimensional indexing.
 
-\hide{There is no syntactic distinction between primitive types and fileRef
-types, and the semantic differences between the two classes of
-types are minimal.}
-% clarify minimal
+Due to the dynamic, highly parallel nature of Swift, its arrays have no notion of size. Array elements can be set as a script's execution progresses. The number of elements set increases monotonically. An array is considered ``closed'' when no further statements that set an element of the array can be executed. This state is recognized at run time by information obtained from compile-time analysis of the script's call graph. Also, since all data elements have single-assignment semantics, no garbage collection issues arise.
 
-\hide{Atomic mapped types do not specify any information about the structure of
-the data. It is up to the user to assign a ``proper'' type to external
-data. Consequently Swift must and does implement nominal type equivalence.}
+Variables that are declared to be file references
+are associated with a \emph{mapper} which defines (often through a dynamic lookup process) the
+data files that are to be mapped to the variable. Array and structure elements that are declared to be file references are similarly mapped.
 
-\emph{Collection types} are provided in Swift by \emph{arrays} and \emph{structures}.
-Structure fields can be of any type, while arrays contain only uniform values of a single type. Both types of collections can contain members of atomic or collection types. Structures contain a finite number of elements. Arrays contain a varying number of elements. Structures and arrays can both recursively reference other structures and arrays in addition to atomic values. 
+Mapped type and composite type variable declarations can be annotated with a
+\emph{mapping} descriptor that specifies the file(s) that are to be mapped to the Swift data element(s).
 
-Due to the dynamic, highly parallel nature of Swift, its arrays have no notion of size. Array elements can be set as a script's execution progresses. The number of elements set increases monotonically. An array is considered ``closed'' when no further statements that set an element of the array can be executed. This state is recognized at run time by information obtained from compile-time analysis of the script's call graph. Also, since all data elements have single-assignment semantics, no garbage collection issues arise.
+For example, the following line declares a variable named \verb|photo| of
+type \verb|image|. Since image is a fileRef type, it additionally declares that the
+variable refers to a single file named \verb|shane.jpeg|:
 
-\subsection{Execution model}
+\begin{verbatim}
+   image photo <"shane.jpeg">;
+\end{verbatim}
 
-Swift has three types of functions:
+We can declare {\tt image} to be an \emph{external file type}:
 
-\emph{Built-in functions} are defined in the Java code of the Swift runtime system, and perform various utility functions.
+\begin{verbatim}
+   type image {};
+\end{verbatim}
 
-\emph{Atomic functions}are functions whose
-implementations are not written in Swift. Currently external functions
-are implemented as command-line applications or built-in functions defined in Java.
+The notation \verb|{}| indicates
+that the type represents a reference to a single \emph{opaque}
+file --- i.e., a reference to an external object whose structure is opaque to the Swift script. For convenience such type declarations typically use the equivalent shorthand \verb|type image;| (which new users find confusing but which has become a Swift idiom).
 
-Application wrapper functions (declared using the app keyword) 
-specify the interface (input files and parameters, and output files) of application programs in
-terms of files and other parameters.
+Mapped type variable declarations can be specified with a
+\emph{mapping} descriptor enclosed in \verb|<>| that indicates the file to be mapped to the variable.
+For example, the following line declares a variable named \verb|photo| of
+type \verb|image|. Since image is a mapped file type, it additionally declares that the
+variable refers to a single file named \verb|puppy.jpeg|:
 
-\emph{Compound functions} are functions 
-that call atomic and other compound
-functions.
+\begin{verbatim}
+   image photo <"puppy.jpeg">;
+\end{verbatim}
 
-In addition to function invocation, the Swift language provides conditional
-execution through the \emph{if} and \emph{switch} statements as well as
-a \emph{foreach} construct used for iterating over arrays of data.
+\emph{Structure types} are defined in this manner:
 
-\hide{Mihael: Note: I'm skipping \emph{iterate} on purpose. We should
-deprecate it since it's hard to understand and everything that can be
-done with it can also be done with \emph{foreach}
-}
+\begin{verbatim}
+   type image;
+   type metadata;
+   type snapshot {
+      metadata m;
+      image i;
+   }
+\end{verbatim}
 
-The \emph{if} and \emph{switch} statements are rather standard, but 
-\emph{foreach} merits more discussion. Similar to \emph{Go}
-(\cite{GOLANG}) and \emph{Python}, its control ``variables'' can be both
-an index and a value. The syntax is as follows:
+Members of a structure can be accessed using the \verb|.| operator:
 
 \begin{verbatim}
-foreach v[, k] in array {
-  ...
-}
+   snapshot s;
+   image i;
+   i = s.i;
 \end{verbatim}
 
-This is necessary because Swift does not allow the use of mutable state 
-(i.e., variables are single-assignment), therefore one would not be able
-to write statements such as \verb|i = i + 1|.
+\subsection{Execution model}
 
-Swift variables hold either primitive values, files, or collections of
-files. All variables are \emph{single-assignment} (meaning
-that they must be assigned to exactly one value during execution),
-which provides the
-basis for Swift's model of function chaining.  (Note that while Swift arrays and structures are not
-strictly single-assignment, each of their elements of are, as discussed in
-Section~\ref{ordering}.)
+Swift has three types of functions:
+
+\emph{Built-in functions} are defined in the Java code of the Swift runtime system, and perform various utility functions (numeric conversion, string manipulation, etc.). Operators (+, *, etc.) defined by the language behave similarly.
+
+\emph{Application interface functions} (declared using the app keyword) 
+specify the interface (input files and parameters, and output files) of application programs in
+terms of files and other parameters. They serve as an adapter between the Swift programming model and the mechanisms used to invoke application programs at run time.
+
+\emph{Compound functions} are functions 
+that call atomic and other compound
+functions.
+
 Through the use of futures, functions are
 executed when their input parameters have all been set from existing
 data or prior function executions.  Function calls are chained by
@@ -456,61 +471,13 @@
 rather when their input data becomes available.
 % mention that every expression in the body of a function or sub-expression is conceptually executed in parallel, and physically executed when all of their arguments have been assigned a value.
 
-Each variables in a Swift script is declared to be of a specific (single) type.
-Variables that are declared to be (or contain, in the case of aggregates) file references,
-are associated with a \emph{mapper} which defines (often through a dynamic lookup process) the
-data files that are to be mapped to the variable.
-
-%^^^\katznote{bad grammar here - not clear what this is saying}
-
-Types in Swift can be \emph{atomic} or \emph{composite}. An atomic (i.e. scalar)
-type can be either a \emph{primitive type} or a \emph{mapped type}.
-Swift provides a fixed set of primitive types, for example, \emph{integer} and
-\emph{string}. A mapped type indicates that the actual data does not
-reside in CPU addressable memory (as it would in conventional
-programming languages), but in POSIX-like files.
-
-Two composite types are provided: \emph{structures} and \emph{arrays}.
-Structures
-are similar in most respects to structure types in other languages.
-One
-array type is provided for every atomic type (integer, string, boolean, and fileRef).
-%%% ^^^ fileref, type issues.
-Arrays use numeric
-indices, but are sparse.  Arrays can be nested to provide multi-dimensional indexing.
-We often refer to instances of  composites of
-mapped types as \emph{datasets}.
-
-
-%\katznote{maybe a little figure here?}
-
-\subsection{Mapping files to data elements}
-
-Mapped type and composite type variable declarations can be annotated with a
-\emph{mapping} descriptor that indicates the file(s) that make(s) up that \emph{dataset}.
-For example, the following line declares a variable named \verb|photo| of
-type \verb|image|. Since image is a fileRef type, it additionally declares that the
-variable refers to a single file named \verb|shane.jpeg|
-
-\begin{verbatim}
-   image photo <"shane.jpeg">;
-\end{verbatim}
-
-%Conceptually, a parallel can be drawn between Swift \emph{mapped} variables
-%and Java \emph{reference types}. In both cases, there is no syntactic distinction
-%between \emph{primitive types} and \emph{mapped} types or
-%\emph{reference types}, respectively. Additionally, the semantic distinction
-%is kept to a minimum.
-
-Component programs of scripts are declared in an \emph{app
-declaration} that contains the description of the command line syntax for that
-program and a list of input and output data. An \verb|app| block
-describes a functional/dataflow style interface to imperative
-components.
-
+%vvvv
+Atomic application interface functions are defined in an \emph{app
+declaration} that describes the command-line syntax of the application
+program and its input and output files.
 For example, the following example lists a function that makes use
-of the ImageMagick~\cite{ImageMagick_WWW} \verb|convert| command to rotate a
-supplied image by a specified angle:
+of the common utility {\tt convert}~\cite{ImageMagick_WWW} to rotate an
+image by a specified angle:
 
 \begin{verbatim}
    app (image output) rotate(image input, int angle) {
@@ -519,41 +486,25 @@
 \end{verbatim}
 
 %\katznote{do you need to say anything about where/how convert is defined/located?}
-(The convert application itself is located through a catalog of applications specified to the runtime environment, or through a PATH lookup).
+(The {\tt convert} executable is located at run time through a catalog of applications or through the PATH environment variable.)
 
-A function is invoked using a syntax similar to that of C:
+The rotate function is then invoked as follows:
 
 \begin{verbatim}
    rotated = rotate(photo, 180);
 \end{verbatim}
 
 While this statement looks like an ordinary function invocation and assignment, its execution in fact
-consists of invoking the command line specified in the \verb|app|
+consists of invoking the program specified in the \verb|app|
 declaration, with variables on the left of the assignment bound to the
 output parameters, and variables to the right of the function
 invocation passed as inputs.
 
+We can build a complete (albeit simple) Swift script:
 
-The examples above have used the type \verb|image| without a
-definition of that type. We can declare it as an \emph{external file type},
-which has no structure exposed to Swift:
-
 \begin{verbatim}
    type image;
-\end{verbatim}
-
-This does not indicate that the data is unstructured; it indicates
-that the structure of the data is not exposed to Swift.
-Swift will treat variables of this type as individual opaque
-files.
-
-With mechanisms to declare types, map variables to data files, and
-declare and invoke functions, we can build a complete (albeit simple)
-script:
-
-\begin{verbatim}
-   type image;
-   image photo <"shane.jpeg">;
+   image photo <"puppy.jpeg">;
    image rotated <"rotated.jpeg">;
 
    app (image output) rotate(image input, int angle) {
@@ -574,47 +525,52 @@
    shane.jpeg rotated.jpeg
 \end{verbatim}
 
-This executes a single \verb|convert| command, while hiding from the user features
+This executes a single \verb|convert| command, while automatically providing the user with features
 such as remote multisite execution and fault tolerance, which will be
 discussed in a later section.
 
+In addition to function invocation, the Swift language provides conditional
+execution through the \emph{if} and \emph{switch} statements as well as
+a \emph{foreach} construct used for iterating over arrays of data.
+
 \subsection{Arrays and Parallel Execution}
 \label{ArraysAndForeach}
 
-Arrays of values can be declared using the \verb|[]| suffix. An array
-can be mapped to a collection of files (one element per file) using
-a different form of mapping expression.  For example, the
-\verb|filesys_mapper| maps
-all files matching a particular glob pattern into an array:
+Arrays are declared using the \verb|[]| suffix:
 
 \begin{verbatim}
    file frames[] <filesys_mapper; pattern="*.jpeg">;
 \end{verbatim}
 
-The \verb|foreach| construct can be used to apply the same function
-call(s) to each element of an array:
+Here we use a built-in mapper called \verb|filesys_mapper| to map
+all files matching the name pattern \verb|*.jpeg| to an array.
+
+The \verb|foreach| construct can be used to apply a function to each element of an array:
 \begin{verbatim}
    foreach f,ix in frames {
      output[ix] = rotate(f, 180);
    }
 \end{verbatim}
 
+\hide{
 Sequential iteration can be expressed using the \verb|iterate| construct:
 \begin{verbatim}
    step[0] = initialCondition();
    iterate ix {
      step[ix] = simulate(step[ix-1]);
-   }
+   } until (terminationCondition() );
 \end{verbatim}
 This fragment will initialize the 0-th element of the \verb|step| array
 to some initial condition, and then repeatedly run the \verb|simulate|
-function, using each execution's output as the input to the next step.
+function, using each execution's output as the input to the next step. The iteration ends when the termination condition in the {\tt until()} clause returns {\tt true}.
+}
 
+\hide{
 \subsection{Expressing functional idioms}
 
 Several common idioms seen in functional
 languages can readily expressed using Swift's
-\emph{foreach}. The \
+\emph{foreach}:
 
 \begin{description}
 
@@ -649,23 +605,21 @@
 }
 \end{verbatim}
 \end{description}
+}
 
-
 \subsection{Ordering of execution and implicit parallelism}
 \label{ordering}
 
-As previously stated, atomic variables are single-assignment,
-which means that they must be assigned to exactly one value during execution. A
-function or expression will be executed when all of its input
+All variables and collection elements are single-assignment:
+they can be assigned a value at most once during the execution of a script.
+A function or expression will be executed when all of its input
 parameters have been assigned values. As a result of such execution,
 more variables may become assigned, possibly allowing further parts of
 the script to execute. In this way, scripts are implicitly
-concurrent. Aside from serialization implied by these dataflow
-dependencies, execution of component programs can proceed without
-synchronization in time.
+concurrent. 
 
-In this fragment, execution of functions \verb|p| and \verb|q| can
-happen in parallel:
+In this script fragment, execution of functions \verb|p| and \verb|q| can
+occur in parallel:
 \begin{verbatim}
    y=p(x);
    z=q(x);
@@ -677,18 +631,18 @@
    z=q(y);
 \end{verbatim}
 
-%\katznote{is this common use of monotonic?  Are the arrays monotonic?  Or is the assignment of elements in the array monotonic?}
 Arrays in Swift are treated as collections of simple variables, in the sense that all array elements are single-assignment futures.
 Once the value of an array element is
 set, then it cannot change. When all the values for the array which can be set (as determined by limited flow analysis) are
 set, then the array is regarded as \emph{closed}. Statements which
 deal with the array as a whole will wait for the array to be closed
-before executing (thus, a closed array is the equivalent of a
-non-array type being assigned). An example of such an action is the expansion of the array values into an app command line).
-However, a \verb|foreach| statement
-will apply its body to elements of an array as they become known. It
-will not wait until the array is closed.
+before executing. An example of such an action is the expansion of the array values into an app command line.
+Thus, the closing of an array is equivalent to setting an atomic variable, with respect to any statement that was waiting for the array itself to get a value. However, a \verb|foreach| statement
+will apply its body of statements to elements of an array as they are set to a value. It
+will not wait until the array is closed. In practice this type of ``pipelining'' gives Swift scripts a high degree of parallelism at run time.
 
+Because of the simplicity and regularity of the Swift data model, a high degree of implicit parallelism is achieved. For example, a \verb|foreach| statement that processes an array returned by a function may begin processing members of the returned array that have already been set, even before the function completes and returns. This yields programs that are very heavily pipelined with significant overlapping parallel activities.
+
 Consider the script below:
 \begin{verbatim}
    file a[];
@@ -699,17 +653,18 @@
    a[0] = r();
    a[1] = s();
 \end{verbatim}
-Initially, the \verb|foreach| statement will have nothing to execute,
-as the array \verb|a| has not been assigned any values. The functions
+Initially, the \verb|foreach| statement will block, with nothing to execute,
+as the array \verb|a| has not been assigned any values. At some point, in parallel, the functions
 \verb|r| and \verb|s| will execute. As soon as either of them is
 finished, the corresponding invocation of function \verb|p| will
 occur. After both \verb|r| and \verb|s| have completed, the array
 \verb|a| will be regarded as closed since no other statements in the
 script make an assignment to \verb|a|.
 
-Because of simplicity and regularity of the Swift data model, a high degree of implicit parallelism is achieved. For example, a foreach() statement that processes an array returned by a function may begin processing members of the returned array that have been already set, even before the function completes and returns. This yields programs that are very heavily pipelined with significant overlapping parallel activities.
 % show a (tested) example and if possible illustrate with a figure.
 
+\hide{
+
 \subsection{Compound functions}
 
 As with many other programming languages, functions consisting of
@@ -748,10 +703,10 @@
 a valid execution order is: \verb|A1 S(x) A2 S(y)|. The compound
 function \verb|A| does not have to have fully completed for its
 return values to be used by subsequent statements.
+}
 
-\subsection{More about types}
-\label{LanguageTypes}
 
+\hide{
 Each variable and function parameter in Swift is strongly typed.
 Types are used to structure data, to aid in debugging and program
 correctness and to influence how Swift interacts with data.
@@ -767,34 +722,8 @@
 There are a number of primitive types: \verb|int|, \verb|string|,
 \verb|float|, \verb|boolean|, which represent integers, strings,
 floating point numbers and true/false values, respectively.
+}
 
-\emph{Complex types} may be defined using the \verb|type| keyword:
-
-\begin{verbatim}
-   type headerfile;
-   type voxelfile;
-   type volume {
-      headerfile h;
-      voxelfile v;
-   }
-\end{verbatim}
-
-Members of a complex type can be accessed using the \verb|.| operator:
-
-\begin{verbatim}
-   volume brain;
-   o = p(brain.h);
-\end{verbatim}
-
-Collections of files can be mapped to complex types (arrays and structures)
-using special operators called \emph{mappers}, syntactically designate with angle brackets (< >). For example, the simple mapper used in this expression will
-map the files \verb|data.h| and \verb|data.v| to the variable members
-\verb|m.h| and \verb|m.v| respectively:
-
-\begin{verbatim}
-   volume m <simple_mapper;prefix="data">;
-\end{verbatim}
-
 \hide{ % hide description of externals till this text is refined
 
 %fixed \katznote{Swift's ``file-and-site model'' hasn't been introduced before.  I'm not even sure what it is.}
@@ -844,9 +773,8 @@
 
 \subsection{Swift mappers}
 
-Swift supports in-memory variables that are
-\emph{mapped} to files in the filesystem. This is coordinated by an
-extensible set of built-in primitives called \emph{mappers}. A representative sample of these is listed
+Swift provides an
+extensible set of built-in mapping primitives. A representative sample of these is listed
 in Table~\ref{mappertable}.
 
 \begin{table}[t]
@@ -906,11 +834,20 @@
 structured Swift variable, can represent a large, structured data
 set.
 
+Collections of files can be mapped to complex types (arrays and structures)
+using a variety of built-in mappers. For example, the \verb|simple_mapper| used in this expression will
+map the files \verb|data.m| and \verb|data.i| to the variable members
+\verb|m.m| and \verb|m.i| respectively:
+
+\begin{verbatim}
+   snapshot m <simple_mapper;prefix="data">;
+\end{verbatim}
+
 \subsection{Swift runtime environment}
 
 \label{LanguageEnvironment}
 
-Notable runtime features include:
+Notable features of the Swift runtime environment include:
 
 \begin{itemize}
 
@@ -941,12 +878,9 @@
 
 \end{itemize}
 
-
-
-A Swift \verb|app| declaration describes how a component program
-is invoked. In order to ensure the correctness of the Swift model, the
-environment in which programs are executed needs to be constrained.
-
+A Swift \verb|app| declaration describes how an application program
+is invoked. In order to provide a consistent execution environment that works for virtually all application programs, the
+environment in which programs are executed needs to be constrained with a set of uniform conventions.
 The Swift execution model is based on the following assumptions: a
 program is invoked in its own working directory; in that working
 directory or one of its subdirectories, the program can expect to find
@@ -982,8 +916,7 @@
 The body of the \verb|app| block defines the command-line that will be
 executed when the function is invoked. The first token (in this case
 \verb|convert|) defines a \emph{transformation name} which is used to
-determine the executable name. Subsequent expressions, separated by
-spaces, define the command-line arguments for that executable:
+determine the executable name. Subsequent expressions define the command-line arguments for that executable:
 \verb|"-rotate"| is a string literal; angle specifies the value of the
 angle parameter; the syntax \verb|@variable| evaluates to the filename
 of the supplied variable, thus \verb|@input| and \verb|@output|
@@ -993,7 +926,7 @@
 variable has not yet been computed, the filename where that value will
 go is already available from the mapper.
 
-\section{Execution}
+\section{Execution engine}
 \label{Execution}
 
 Swift is implemented by compiling to a Karajan program\cite{Karajan}, which provides
@@ -1112,7 +1045,7 @@
 
 In such a case, Swift provides a \emph{restart log} that encapsulates
 which function invocations have been successfully completed. 
-%%%%%% What manual interv. and why???
+\mikenote{What manual interv. and why???}
 After
 appropriate manual intervention, 
 a subsequent Swift run may be started