[Swift-commit] r3944 - text/parco10submission
noreply at svn.ci.uchicago.edu
Mon Jan 10 13:31:51 CST 2011
Author: dsk
Date: 2011-01-10 13:31:51 -0600 (Mon, 10 Jan 2011)
New Revision: 3944
Modified:
text/parco10submission/paper.tex
Log:
updated
Modified: text/parco10submission/paper.tex
===================================================================
--- text/parco10submission/paper.tex 2011-01-10 19:20:00 UTC (rev 3943)
+++ text/parco10submission/paper.tex 2011-01-10 19:31:51 UTC (rev 3944)
@@ -1345,14 +1345,14 @@
We present here a few additional measurements of Swift performance, and highlight a few previously published results.
\subsection{Synthetic benchmark results}
-First, we measure the ability of Swift to support many user tasks on a
-single local system. In Test A, we used Swift to submit up to 2,000
+First, we measured the ability of Swift to support many user tasks on a
+single local system. We used Swift to submit up to 2,000
tasks to a 16-core x86-based Linux compute server at Argonne
National Laboratory. Each job in the batch was an identical, simple
single-processor job that executed for the given duration and
performed application input and output of 1 byte each. The total
execution time was measured. This was compared to the total amount of
-core-time consumed to report a utilization ratio, which is plotted. We observe that for tasks of only 5 seconds in duration, Swift can sustain 100 concurrent application executions at a CPU utilization of 90\%, and 200 concurrent executions at a utilization of 85\%.
+core-time consumed to report a utilization ratio, which is plotted in Figure~\ref{fig:swift-performance}, case A. We observe that for tasks of only 5 seconds in duration, Swift can sustain 100 concurrent application executions at a CPU utilization of 90\%, and 200 concurrent executions at a utilization of 85\%.
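As a sketch of the utilization metric described here (the precise definition is an assumption, since the text does not spell it out): with $N$ completed tasks of duration $t$ seconds each, run at concurrency level $C$ over a measured wall-clock span of $T$ seconds,
\begin{equation*}
  \mathrm{utilization} = \frac{N \cdot t}{C \cdot T} .
\end{equation*}
Under this reading, 2,000 five-second tasks at 200-way concurrency would ideally complete in 50 seconds, and the reported 85\% utilization corresponds to a wall-clock span of roughly 59 seconds.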
Second, we measured the ability of Swift to support many tasks on a
large, distributed memory system without considering the effect on the
@@ -1364,33 +1364,33 @@
concurrent job; thus, the user task had 4 cores at its disposal. The
total execution time was measured. This was compared to the total
amount of node-time consumed to report a utilization ratio, which is
-plotted. We observe that for tasks of 100 second duration, Swift achieves
+plotted in Figure~\ref{fig:swift-performance}, case B.
+We observe that for tasks of 100-second duration, Swift achieves
a 95\% CPU utilization of 2,048 compute nodes. Even for 30-second tasks,
it can sustain an 80\% utilization at this level of concurrency.
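A rough reading of these figures, assuming dispatch and staging overhead is the only source of loss: at 95\% utilization, each 100-second task costs about $100/0.95 - 100 \approx 5$ extra node-seconds, while at 80\% utilization a 30-second task costs about $30/0.8 - 30 = 7.5$ extra node-seconds.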
-Third, we measured the ability of Swift to support many tasks on a
-large, distributed memory system including application use of the
-underlying GPFS filesystem. We used Swift/Coasters to
-submit up to 10,240 tasks to Intrepid. Each job in the batch was an
-identical, simple single-processor job that executed for 30 seconds
-and performed the given amount of input and output. Coasters provider
-staging was used to distribute application data to workers, except in
-the case marked ``direct'', in which case the I/O was performed
-directly to GPFS. Each node was limited to one concurrent job, thus,
-the user task had 4 cores at its disposal. The total execution time
-was measured. This was compared to the total amount of time consumed
-by an equivalent shell script-based application to report an
-efficiency ratio, which is plotted in Figure~\ref{fig:swift-performance}, case C.
-\katznote{what knowledge should I gain from the figure? is the data good or bad? why?}
-\emph{Note: this test will be refined with adequate data points before publication.}
+%Third, we measured the ability of Swift to support many tasks on a
+%large, distributed memory system including application use of the
+%underlying GPFS filesystem. We used Swift/Coasters to
+%submit up to 10,240 tasks to Intrepid. Each job in the batch was an
+%identical, simple single-processor job that executed for 30 seconds
+%and performed the given amount of input and output. Coasters provider
+%staging was used to distribute application data to workers, except in
+%the case marked ``direct'', in which case the I/O was performed
+%directly to GPFS. Each node was limited to one concurrent job, thus,
+%the user task had 4 cores at its disposal. The total execution time
+%was measured. This was compared to the total amount of time consumed
+%by an equivalent shell script-based application to report an
+%efficiency ratio, which is plotted in Figure~\ref{fig:swift-performance}, case C.
+%\emph{Note: this test will be refined with adequate data points before publication.}
+%
+%The Test C shell script was provided with all job specifications in
+%advance and did not require communication between the worker
+%nodes and the Swift/Coasters runtime. Thus, this test measures the
+%overhead involved in the dynamic job creation and scheduling
+%functionality offered by Swift.
-The Test C shell script was provided with all job specifications in
-advance and did not require communication between the worker
-nodes and the Swift/Coasters runtime. Thus, this test measures the
-overhead involved in the dynamic job creation and scheduling
-functionality offered by Swift.
-
\newcommand{\plotscale}{0.60}
\begin{figure}
@@ -1407,13 +1407,13 @@
Application CPU utilization for 3 task durations
(in seconds) on up to 2,048 nodes of the Blue Gene/P,
at varying system size. \\
- \includegraphics[scale=\plotscale]{plots/dds}
- & \\
- Test C.
- Efficiency for a fixed number of tasks with varying data sizes.
- Input and out data was one file in each direction of the size
- indicated.
- & \\
+% \includegraphics[scale=\plotscale]{plots/dds}
+% & \\
+% Test C.
+% Efficiency for a fixed number of tasks with varying data sizes.
+% Input and output data was one file in each direction of the size
+% indicated.
+% & \\
\end{tabular}
}
\caption{Swift performance figures.\label{fig:swift-performance}}
@@ -1437,16 +1437,16 @@
\includegraphics[scale=.4]{plots/SEM_IO}
\end{tabular}
}
- \caption{128K-job SEM fMRI application execution on the Ranger Constellation (From publication \cite{CNARI_2009}). Red=active compute jobs, blue=data stage in, green=stage out. }
+ \caption{128K-job SEM fMRI application execution on the Ranger Constellation (From \cite{CNARI_2009}). Red=active compute jobs, blue=data stage in, green=stage out. }
\label{SEMplots}
\end{center}
\end{figure}
-Prior work also showed Swift's ability to achieve ample task rates for local and remote submission to high performance clusters. These prior results are shown in Figure~\ref{TaskPlots} (from~\cite{PetascaleScripting_2009}).
+Prior work also showed Swift's ability to achieve ample task rates for local and remote submission to high performance clusters. These prior results are shown in Figure~\ref{TaskPlots} (from \cite{PetascaleScripting_2009}).
-The left plot in figure \ref{TaskPlots} shows the PTMap application running the stage 1 processing of the E.coli K12 genome (4,127 sequences) on 2,048 Intrepid cores. The lower plot shows processor utilization as time progresses; Overall, the average per task execution time was 64 seconds, with a standard deviation of 14 seconds. These 4,127 tasks consumed a total of 73 CPU hours, in a span of 161 seconds on 2,048 processor cores, achieving 80 percent utilization.
+The top plot in Figure~\ref{TaskPlots}-A shows the PTMap application running the stage 1 processing of the E. coli K12 genome (4,127 sequences) on 2,048 Intrepid cores. The lower plot shows processor utilization as time progresses; overall, the average per-task execution time was 64 seconds, with a standard deviation of 14 seconds. These 4,127 tasks consumed a total of 73 CPU hours in a span of 161 seconds on 2,048 processor cores, achieving 80 percent utilization.
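The 80 percent figure is consistent with the quoted totals: 73 CPU hours is $73 \times 3600 = 262{,}800$ core-seconds, while 2,048 cores over 161 seconds supply $2{,}048 \times 161 = 329{,}728$ core-seconds, giving $262{,}800 / 329{,}728 \approx 0.80$.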
-The right plot in figure \ref{TaskPlots} shows performance of Swift running structural equation modeling problem at large scale using on the Ranger Constellation to model neural pathway connectivity from experimental fMRI data\cite{CNARI_2009}. The left figure shows the active jobs for a larger version of the problem type shown in figure \ref{SEMplots}. This shows a Swift script executing 418,000 structural equation modeling jobs over a 40 hour period.
+The top plot in Figure~\ref{TaskPlots}-B shows the performance of Swift running a structural equation modeling problem at large scale on the Ranger Constellation to model neural pathway connectivity from experimental fMRI data~\cite{CNARI_2009}. The lower plot shows the active jobs for a larger version of the problem type shown in Figure~\ref{SEMplots}. This shows a Swift script executing 418,000 structural equation modeling jobs over a 40-hour period.
\begin{figure}
\begin{center}
@@ -1460,7 +1460,7 @@
B. SEM application on varying-size processing allocations on Ranger\\
\end{tabular}
}
- \caption{Swift task rates for PTMap and SEM applications on the Blue Gene/P and Ranger. (From reference \cite{PetascaleScripting_2009} ) }
+ \caption{Swift task rates for PTMap and SEM applications on the Blue Gene/P and Ranger. (From \cite{PetascaleScripting_2009}) }
\label{TaskPlots}
\end{center}
\end{figure}