[Swift-commit] r3367 - text/parco10submission

Tue Jun 15 16:44:50 CDT 2010

Author: wozniak
Date: 2010-06-15 16:44:50 -0500 (Tue, 15 Jun 2010)
New Revision: 3367

Modified:
   text/parco10submission/paper.tex
Log:
Quick pass through Section 3


Modified: text/parco10submission/paper.tex
===================================================================

--- text/parco10submission/paper.tex	2010-06-15 21:27:52 UTC (rev 3366)
+++ text/parco10submission/paper.tex	2010-06-15 21:44:50 UTC (rev 3367)
@@ -618,7 +618,8 @@
 is invoked. In order to ensure the correctness of the Swift model, the
 environment in which programs are executed needs to be constrained.
 
-A program is invoked in its own working directory; in that working
+The Swift execution model is based on the following assumptions: a
+program is invoked in its own working directory; in that working
 directory or one of its subdirectories, the program can expect to find
 all of the files that are passed as inputs to the application block;
 and on exit, it should leave all files named by that application block
@@ -630,7 +631,7 @@
 cleaned up after execution.
 
 Consider the \verb|app| declaration for the \verb|rotate| procedure in
-section N.
+section N:
 
 \begin{verbatim}
  app (file output) rotate(file input, int angle)
@@ -665,16 +666,6 @@
 although the value of that variable has not yet been computed, the
 filename where that value will go is already known.
 
-TODO comment (here?) about how this model appears somewhat constrained
-but provides a well defined atomicity that can be used for various
-reliability mechanisms, site portability, on-site efficiency tuning.
-multi-site and reliabilty are already discussed; but the on-site
-efficiency tuning (eg using GPFS and laying out files in a way that is
-sympathetic to that, potentially using Collective IO fs, and using a
-workernode local filesystem) - that discussion could go into the
-'executing efficiently' section, or a different 'executing efficiently'
-section (change titles...)
-
 \section{Execution}
 \label{Execution}
 
@@ -709,20 +700,18 @@
 as a POSIX-like file system and must be accessible through some
 \emph{file access provider}.
 
-  Two common implementations of this model are execution on the local
-system; and execution on one or more remote sites in a Globus\cite{GLOBUS}-based
-grid.
+Two common implementations of this model are execution on the local
+system, and execution on one or more remote sites in a grid managed by
+Globus~\cite{Globus_Metacomputing_1997} software. In the former case,
+a local scratch file system (such as {\tt /var/tmp}) may be used as
+the accessible file system; execution of programs is achieved by
+direct POSIX fork; and access on both sides is provided by the POSIX
+filesystem API. In the case of a grid site, commonly a shared file
+system (NFS~\cite{NFS_1985} or GPFS~\cite{GPFS_2002}) will be provided
+by the site with GridFTP~\cite{GridFTP_2005} access from the
+submitting system to the remote system; and with GRAM~\cite{GRAM_1998}
+and a local resource manager (LRM) providing an execution mechanism.
 
-In the former case, a local scratch file system (such as /var/tmp) is
-used as the accessible file system; execution of programs is achieved
-by direct unix fork; and access on both sides is provided by the POSIX
-filesystem API.
-
-In the case of a Globus-based grid site, commonly a shared file system
-(NFS or GPFS) will be provided by the site with GridFTP\cite{GridFTP} access from
-the submitting system to the remote system; and with GRAM\cite{GRAM} and a local
-resource manager (LRM) providing an execution mechanism.
-
 Sites are defined in the \emph{site catalog}, which contains descriptions
 of each site:
 
@@ -739,18 +728,19 @@
  \end{verbatim}
 
 This file may be constructed by hand or mechanically from some
-pre-existing database (such as a grid's existing discovery system).
-
+pre-existing database (such as a grid's existing discovery
+system). The site catalog is reusable and may be shared among multiple
+users of the same resources; it is not connected to the application
+script.  This separates application code from system configuration.
 The site catalog may contain definitions for multiple sites in which
-case execution will be attemted on all sites. In the presence of
+case execution will be attempted on all sites. In the presence of
 multiple sites, it is necessary to choose between the available sites.
-The Swift \emph{site selector} achivees this by maintaining a score for
+The Swift \emph{site selector} achieves this by maintaining a score for
 each site which determines the load that Swift will place on that site.
 As a site is successful in executing jobs, this score will be increased
 and as the site is unsuccessful, this score will be decreased. In
 addition to selecting between sites, this mechanism provides some
-dynamic rate limiting (as long as it is assumed that s site indicates
-overload by causing jobs to fail -- for example, like TCP\cite{TCP})
+dynamic rate limiting if sites fail due to overload~\cite{FTSH_2003}.
 
 This provides an empirically measured estimate of a site's ability to
 bear load, distinct from more static information elsewhere published.
@@ -760,10 +750,10 @@
 is not properly quantified by published information (for example, due
 to load caused by other users).
 
-\subsection{Executing reliably}
+\subsection{Reliable execution}
 \label{ExecutingReliably}
 
-  The functional/dataflow(?) nature of SwiftScript with a clearly defined
+The functional nature of SwiftScript provides a clearly defined
 interface to imperative components. In addition to allowing Swift great
 flexibility in where and when it runs component programs, this allows those
 imperative components to be treated as atomic components which can be
@@ -772,56 +762,49 @@
 the runtime that need not be exposed at the language level: \emph{retries},
 \emph{restarts} and \emph{replication}.
 
-  In the simplest form of error handling in Swift, if a component
+In the simplest form of error handling in Swift, if a component
 program fails then Swift will make a second (or subsequent) attempt to
-run the program.
+run the program. In contrast to many other systems, retry here is at
+the level of the SwiftScript procedure invocation, and includes
+completely reattempting site selection, stage in, execution and stage
+out. This provides a natural way to deal with many transient errors,
+such as temporary network loss, and with many changes in site state.
 
-  In contrast to many other systems, retry here is at the level of the
-SwiftScript procedure invocation, and includes completely reattempting
-site selection, stage in, execution and stage out.
+Some errors are more permanent in nature; for example, a component
+program may have a bug that causes it to always fail given a
+particular set of inputs. In that case, Swift's retry mechanism will
+not help; each job will be tried a number of times, and each time it
+will fail, resulting ultimately in the entire script failing.
 
-  This provides a very easy way to deal with many transient errors,
-such as temporary network loss, and with many negative changes in site
-state (such as a site going offline).
+In such a case, Swift provides a \emph{restart log} which records
+which procedure invocations have been successfully completed. After
+appropriate manual intervention, a subsequent Swift run may be started
+with this restart log; this will suppress re-execution of already
+executed invocations but otherwise allow the script to continue.
 
-  Some errors are more permanent in nature; for example, a component
-program may have a bug that causes it to always fail given a particular
-set of inputs. In that case, Swift's retry mechanism will not help;
-each job will be tried a number of times, and each time it will fail
-resulting ultimately in the entire script failing.
-
-  In such a case, Swift provides a \emph{restart log} which
-encapsulates which procedure invocations have been succesfully
-completed. After appropriate manual intervention, a subsequent Swift
-run may be started with this restart log; this will suppress
-re-execution of already executed invocations but otherwise allow the
-script to continue.
-
-  A different class of failure is when jobs are submitted to a site but
+A different class of failure is when jobs are submitted to a site but
 are then enqueued for a very long time on that site. This is a failure
-in site selection, rather than in execution. Sometimes it can be a soft
-failure, in that the job will eventually run on the chosen site - the
-site selector has improperly chosen a very heavily loaded site;
+in site selection, rather than in execution. Sometimes it can be a
+soft failure, in that the job will eventually run on the chosen site -
+the site selector has improperly chosen a very heavily loaded site;
 sometimes it can be a hard failure, in that the job will never run on
 the site because it has ceased to process its job queue - the site
 selector has improperly chosen a site which is not executing jobs.
 
-  To address this situation, Swift provides for \emph{job replication}.
-After a job has been enqueued on a site for too long, a second instance
-of the job will be submitted (again undergoing site selection, stagein,
-execution and stageout); this will continue up to a defined limit (by
-default 3?).
+To address this situation, Swift provides for \emph{job replication}.
+After a job has been enqueued on a site for too long, a second
+instance of the job will be submitted (again undergoing site
+selection, stagein, execution and stageout); this will continue up to
+a defined limit. When any of those jobs begins executing, all other
+replicas will be cancelled.
 
-  When any of those jobs begins executing, all other replicas will be
-cancelled.
-
 \subsection{Avoiding job submission penalties}
 
 In many applications, the overhead of job submission through commonly
 available mechanisms, such as through GRAM into an LRM, can dominate
 the execution time. In these situations, it is helpful to combine a
-number of Swift level component program executions into a single GRAM/LRM
-submission.
+number of Swift-level component program executions into a single
+GRAM/LRM submission.
 
 Swift offers two approaches: \emph{clustering} and \emph{coasters}.
 Clustering constructs job submissions containing a number of component program
@@ -836,17 +819,17 @@
 In practical usage, the automatic deployment and execution of these
 components is difficult on a number of sites.
 
-However, ahead-of-time clustering can be less efficient than using coasters.
-Coasters can react much more dynamically to changing numbers of available
-worker nodes.
-When clustering, some estimation of how available remote node count
-and of job duration must be made to decide on a sensible cluster size.
-Incorrectly estimating this can (in one direction) result in an insufficient
-number of worker nodes being used, with excessive serialisation; or (in
-the other direction) result in an excessive number of GRAM job submissions.
-Coaster workers can be queued and executed before all
-of the work that they will eventually execute is known, so can get more
-work done per GRAM job submission, and get it done earlier.
+However, ahead-of-time clustering can be less efficient than using
+coasters. Coasters can react much more dynamically to changing numbers
+of available worker nodes. When clustering, some estimate of the
+available remote node count and of job duration must be made to decide
+on a sensible cluster size. Incorrectly estimating this can (in one
+direction) result in an insufficient number of worker nodes being
+used, with excessive serialisation; or (in the other direction) result
+in an excessive number of GRAM job submissions. Coaster workers can be
+queued and executed before all of the work that they will eventually
+execute is known, so can get more work done per GRAM job submission,
+and get it done earlier.
 
 Job status for coasters is reported as jobs start and end; for clustered jobs,
 job completion status is only known at the end of the entire cluster. This
@@ -854,35 +837,24 @@
 jobs) is delayed (in the worst case, activity dependent on the first job
 in a cluster must wait for all of the jobs to run).
 
-TODO: graphs or citation or something here giving numbers? two sets of
-stats - the reliability of coasters vs clusters on a range of sites
-(eg a bunch of osg engage and TG sites). also could do: diagram showing
-clustering/coasters vs some plain gram submission - CNARI app with 3s
-jobs shows this in an extreme way. Either show such a graph here or in
-CNARI app section.
+%%% Move this to Future Work
 
-TODO: comment on how this relates to Falkon
+%% \subsection{Avoiding filesystem inefficiency}
 
-TODO: vocabulary in this section - talks about 'GRAM' - is there a nicer
-way to talk about the 'underlying submission system, that underlies
-coasters and clustering' ?
+%% When running a large number of jobs on a site at once, access to the
+%% shared filesystem on that site can be a bottleneck.
 
-\subsection{Avoiding filesystem inefficiency}
+%% On large systems, the shared file system is commonly provided by
+%% GPFS\cite{GPFS}. This can scale well but when used na\"ively can
+%% exhibit pathological behaviour. Early versions of Swift triggered this
+%% behaviour by targetting too much file system activity at a single
+%% working directory, so that GPFS lock contention came to dominate execution
+%% time.
 
-When running a large number of jobs on a site at once, access to the
-shared filesystem on that site can be a bottleneck.
+%% TODO more... - work done on arranging things in fs; presumably can
+%% forwardref collective IO section if that gets written, or include that
+%% entire section here?
 
-On large systems, the shared file system is commonly provided by
-GPFS\cite{GPFS}. This can scale well but when used na\"ively can
-exhibit pathological behaviour. Early versions of Swift triggered this
-behaviour by targetting too much file system activity at a single
-working directory, so that GPFS lock contention came to dominate execution
-time.
-
-TODO more... - work done on arranging things in fs; presumably can
-forwardref collective IO section if that gets written, or include that
-entire section here?
-
 \subsection{Features to support use on dynamic resources}
 
 Using Swift to submit to a large number of sites poses a number of
@@ -1521,7 +1493,7 @@
 %\end{thebibliography}
 
 \bibliographystyle{elsarticle-num}
-\bibliography{paper} % for ACM SIGS style
+\bibliography{paper,Wozniak} % for ACM SIGS style
 
 \verb|$Id$|