[Swift-commit] r3943 - text/parco10submission

Mon Jan 10 13:20:00 CST 2011

Author: wilde
Date: 2011-01-10 13:20:00 -0600 (Mon, 10 Jan 2011)
New Revision: 3943

Modified:
   text/parco10submission/paper.tex
Log:
Update for missing SEM image. Need to see if we lost text in conflict resolutions.

Modified: text/parco10submission/paper.tex
===================================================================

--- text/parco10submission/paper.tex	2011-01-10 19:19:04 UTC (rev 3942)
+++ text/parco10submission/paper.tex	2011-01-10 19:20:00 UTC (rev 3943)
@@ -1342,18 +1342,17 @@
 \section{Performance Characteristics}
 \label{Performance}
 
-We present here a few additional measurements to supplement
-those previously published. \katznote{need to site something here, maybe \cite{Swift_2007}?}
+We present here a few additional measurements of Swift performance and highlight several previously published results.
 
-First, we measured the ability of Swift to support many user tasks on a
-single system image.  We used Swift to submit up to 2,000
-tasks to Thwomp, a 16-core x86-based Linux compute server at Argonne
+\subsection{Synthetic benchmark results}
+First, we measured the ability of Swift to support many user tasks on a
+single local system.  In Test A, we used Swift to submit up to 2,000
+tasks to a 16-core x86-based Linux compute server at Argonne
 National Laboratory.  Each job in the batch was an identical, simple
 single-processor job that executed for the given duration and
 performed application input and output at 1 byte each.  The total
 execution time was measured.  This was compared to the total amount of
-core-time consumed to report a utilization ratio, which is plotted in Figure~\ref{fig:swift-performance}, case A. 
-\katznote{what knowledge should I gain from the figure? is the data good or bad?  why?}
+core-time consumed to report a utilization ratio, which is plotted. We observe that for tasks of only 5 seconds in duration, Swift can sustain 100 concurrent application executions at a CPU utilization of 90\%, and 200 concurrent executions at a utilization of 85\%.
 
 Second, we measured the ability of Swift to support many tasks on a
 large, distributed memory system without considering the effect on the
@@ -1365,8 +1364,9 @@
 concurrent job, thus, the user task had 4 cores at its disposal.  The
 total execution time was measured.  This was compared to the total
 amount of node-time consumed to report a utilization ratio, which is
-plotted in Figure~\ref{fig:swift-performance}, case B.
-\katznote{what knowledge should I gain from the figure? is the data good or bad?  why?}
+plotted. We observe that for tasks of 100-second duration, Swift achieves
+95\% CPU utilization on 2,048 compute nodes. Even for 30-second tasks,
+it sustains 80\% utilization at this level of concurrency.
 
 
 Third, we measured the ability of Swift to support many tasks on a
@@ -1383,9 +1383,10 @@
 by an equivalent shell script-based application to report an
 efficiency ratio, which is plotted in Figure~\ref{fig:swift-performance}, case C.
 \katznote{what knowledge should I gain from the figure? is the data good or bad?  why?}
+\emph{Note: this test will be refined with adequate data points before publication.}
 
 The Test C shell script was provided with all job specifications in
-advance and did not require communication from between the worker
+advance and did not require communication between the worker
 nodes and the Swift/Coasters runtime.  Thus, this test measures the
 overhead involved in the dynamic job creation and scheduling
 functionality offered by Swift.
@@ -1420,11 +1421,9 @@
 \end{figure}
 
 
-\subsection{Prior performance measures}
-\mikenote{Remove above caption}
+\subsection{Application performance measurements}
 
-Published measurements of Swift performance
-provide evidence that its parallel distributed programming model can be implemented with sufficient scalability and efficiency to make it a practical tool for large-scale parallel application scripting.
+Previously published measurements of Swift performance on several scientific applications provide evidence that its parallel distributed programming model can be implemented with sufficient scalability and efficiency to make it a practical tool for large-scale parallel application scripting.
 
 The performance of Swift submitting jobs over the wide area network from UChicago to the TeraGrid Ranger cluster at TACC is shown in Figure~\ref{SEMplots} (from \cite{CNARI_2009}), which plots an SEM workload of 131,072 jobs for 4 brain regions and two experimental conditions. This workflow completed in approximately 3 hours.  The logs from the {\tt swift\_plot\_log} utility show the high degree of concurrent overlap between job execution and input and output file staging to remote computing resources. 
 The workflows were developed on and submitted (to Ranger) from a single-core Linux workstation at UChicago running an Intel® Xeon™ 3.20 GHz CPU. Data staging was performed using the Globus GridFTP protocol and job execution was performed over the Globus GRAM 2 protocol.
@@ -1435,19 +1434,19 @@
   \begin{center}
   {\footnotesize
     \begin{tabular}{p{14 cm}}
-      \includegraphics[scale=.4]{plots/SEM_left}
+      \includegraphics[scale=.4]{plots/SEM_IO}
     \end{tabular}
     }
-    \caption{128K-job SEM fMRI application execution on the Ranger Constellation. Red=active compute jobs, blue=data stage in, green=stage out.}
+    \caption{128K-job SEM fMRI application execution on the Ranger Constellation (from~\cite{CNARI_2009}). Red=active compute jobs, blue=data stage in, green=stage out.}
     \label{SEMplots}
   \end{center}
 \end{figure}
 
 Prior work also showed Swift's ability to achieve ample task rates for local and remote submission to high performance clusters. These prior results are shown in Figure~\ref{TaskPlots} (from~\cite{PetascaleScripting_2009}).
 
-Figure~\ref{TaskPlots} left shows the PTMap application running the stage 1 processing of the E.coli K12 genome (4,127 sequences) on 2,048 Intrepid cores. The lower plot shows processor utilization as time progresses; Overall, the average per task execution time was 64 seconds, with a standard deviation of 14 seconds. These 4,127 tasks consumed a total of 73 CPU hours, in a span of 161 seconds on 2,048 processor cores, achieving 80 percent utilization.
+The left plot in Figure~\ref{TaskPlots} shows the PTMap application running the stage 1 processing of the E.~coli K12 genome (4,127 sequences) on 2,048 Intrepid cores. The lower plot shows processor utilization as time progresses. Overall, the average per-task execution time was 64 seconds, with a standard deviation of 14 seconds. These 4,127 tasks consumed a total of 73 CPU hours, in a span of 161 seconds on 2,048 processor cores, achieving 80 percent utilization.
 
-Figure~\ref{TaskPlots} right shows performance of Swift running structural equation modeling problem at large scale using on the Ranger Constellation to model neural pathway connectivity from experimental fMRI data\cite{CNARI_2009}. The left \katznote{lower?} figure shows the active jobs for a larger version of the problem type shown in Figure~\ref{SEMplots}.  This shows an  SEM script executing ~ 418,000 jobs. The red line represents job execution on Ranger; 
+The right plot in Figure~\ref{TaskPlots} shows the performance of Swift running a structural equation modeling (SEM) problem at large scale on the Ranger Constellation to model neural pathway connectivity from experimental fMRI data~\cite{CNARI_2009}. The left figure shows the active jobs for a larger version of the problem type shown in Figure~\ref{SEMplots}.  This shows a Swift script executing 418,000 structural equation modeling jobs over a 40-hour period.
 
 \begin{figure}
   \begin{center}
@@ -1461,7 +1460,7 @@
       B. SEM application on varying-size processing allocations on Ranger\\
     \end{tabular}
     }
-    \caption{Swift task rates for PTMap and SEM applications on the Blue Gene/P and Ranger}
+    \caption{Swift task rates for PTMap and SEM applications on the Blue Gene/P and Ranger (from~\cite{PetascaleScripting_2009}).}
     \label{TaskPlots}
   \end{center}
 \end{figure}
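
The PTMap utilization figure quoted in the diff above (4,127 tasks averaging 64 seconds, run in a 161-second span on 2,048 cores) can be checked from the numbers in that same paragraph. The following is a minimal sketch in Python, assuming that utilization means total task busy time divided by (cores x wall-clock span). The function and variable names are illustrative only and are not part of Swift or the paper.

```python
def utilization(task_count, mean_task_seconds, cores, span_seconds):
    """Fraction of available core-time spent running tasks.

    Assumed definition: (tasks * mean task time) / (cores * span).
    """
    busy_core_seconds = task_count * mean_task_seconds
    available_core_seconds = cores * span_seconds
    return busy_core_seconds / available_core_seconds


# Numbers taken from the PTMap paragraph in the diff.
ptmap_tasks = 4127
ptmap_mean_task_seconds = 64
ptmap_cores = 2048
ptmap_span_seconds = 161

# Total CPU time consumed, in hours.
ptmap_cpu_hours = ptmap_tasks * ptmap_mean_task_seconds / 3600
ptmap_utilization = utilization(
    ptmap_tasks, ptmap_mean_task_seconds, ptmap_cores, ptmap_span_seconds
)

print(f"CPU hours:   {ptmap_cpu_hours:.1f}")
print(f"Utilization: {ptmap_utilization:.0%}")
```

Under this assumed definition, the computed values agree with the roughly 73 CPU hours and 80 percent utilization stated in the text.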