[Swift-commit] r3944 - text/parco10submission
noreply at svn.ci.uchicago.edu
Mon Jan 10 13:31:51 CST 2011
Author: dsk
Date: 2011-01-10 13:31:51 -0600 (Mon, 10 Jan 2011)
New Revision: 3944
Modified:
text/parco10submission/paper.tex
Log:
updated
Modified: text/parco10submission/paper.tex
===================================================================
--- text/parco10submission/paper.tex 2011-01-10 19:20:00 UTC (rev 3943)
+++ text/parco10submission/paper.tex 2011-01-10 19:31:51 UTC (rev 3944)
@@ -1345,14 +1345,14 @@
We present here a few additional measurements of Swift performance, and highlight a few previously published results.
\subsection{Synthetic benchmark results}
-First, we measure the ability of Swift to support many user tasks on a
-single local system. In Test A, we used Swift to submit up to 2,000
+First, we measured the ability of Swift to support many user tasks on a
+single local system. We used Swift to submit up to 2,000
tasks to a 16-core x86-based Linux compute server at Argonne
National Laboratory. Each job in the batch was an identical, simple
single-processor job that executed for the given duration and
performed application input and output of 1 byte each. The total
execution time was measured. This was compared to the total amount of
-core-time consumed to report a utilization ratio, which is plotted. We observe that for tasks of only 5 seconds in duration, Swift can sustain 100 concurrent application executions at a CPU utilization of 90\%, and 200 concurrent executions at a utilization of 85\%.
+core-time consumed to report a utilization ratio, which is plotted in Figure~\ref{fig:swift-performance}, case A. We observe that for tasks of only 5 seconds in duration, Swift can sustain 100 concurrent application executions at a CPU utilization of 90\%, and 200 concurrent executions at a utilization of 85\%.
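As a sketch of the utilization metric described here (the precise definition is an assumption, since the text does not spell it out): with $N$ completed tasks of duration $t$ seconds each, run at concurrency level $C$ over a measured wall-clock span of $T$ seconds,
\begin{equation*}
  \mathrm{utilization} = \frac{N \cdot t}{C \cdot T} .
\end{equation*}
Under this reading, 2,000 five-second tasks at 200-way concurrency would ideally complete in 50 seconds, and the reported 85\% utilization corresponds to a wall-clock span of roughly 59 seconds.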
Second, we measured the ability of Swift to support many tasks on a
large, distributed memory system without considering the effect on the
@@ -1364,33 +1364,33 @@
concurrent job; thus, the user task had 4 cores at its disposal. The
total execution time was measured. This was compared to the total
amount of node-time consumed to report a utilization ratio, which is
-plotted. We observe that for tasks of 100 second duration, Swift achieves
+plotted in Figure~\ref{fig:swift-performance}, case B.
+We observe that for tasks of 100-second duration, Swift achieves
a 95\% CPU utilization of 2,048 compute nodes. Even for 30-second tasks,
it can sustain an 80\% utilization at this level of concurrency.
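A rough reading of these figures, assuming dispatch and staging overhead is the only source of loss: at 95\% utilization, each 100-second task costs about $100/0.95 - 100 \approx 5$ extra node-seconds, while at 80\% utilization a 30-second task costs about $30/0.8 - 30 = 7.5$ extra node-seconds.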
-Third, we measured the ability of Swift to support many tasks on a
-large, distributed memory system including application use of the
-underlying GPFS filesystem. We used Swift/Coasters to
-submit up to 10,240 tasks to Intrepid. Each job in the batch was an
-identical, simple single-processor job that executed for 30 seconds
-and performed the given amount of input and output. Coasters provider
-staging was used to distribute application data to workers, except in
-the case marked ``direct'', in which case the I/O was performed
-directly to GPFS. Each node was limited to one concurrent job, thus,
-the user task had 4 cores at its disposal. The total execution time
-was measured. This was compared to the total amount of time consumed
-by an equivalent shell script-based application to report an
-efficiency ratio, which is plotted in Figure~\ref{fig:swift-performance}, case C.
-\katznote{what knowledge should I gain from the figure? is the data good or bad? why?}
-\emph{Note: this test will be refined with adequate data points before publication.}
+%Third, we measured the ability of Swift to support many tasks on a
+%large, distributed memory system including application use of the
+%underlying GPFS filesystem. We used Swift/Coasters to
+%submit up to 10,240 tasks to Intrepid. Each job in the batch was an
+%identical, simple single-processor job that executed for 30 seconds
+%and performed the given amount of input and output. Coasters provider
+%staging was used to distribute application data to workers, except in
+%the case marked ``direct'', in which case the I/O was performed
+%directly to GPFS. Each node was limited to one concurrent job, thus,
+%the user task had 4 cores at its disposal. The total execution time
+%was measured. This was compared to the total amount of time consumed
+%by an equivalent shell script-based application to report an
+%efficiency ratio, which is plotted in Figure~\ref{fig:swift-performance}, case C.
+%\emph{Note: this test will be refined with adequate data points before publication.}
+%
+%The Test C shell script was provided with all job specifications in
+%advance and did not require communication between the worker
+%nodes and the Swift/Coasters runtime. Thus, this test measures the
+%overhead involved in the dynamic job creation and scheduling
+%functionality offered by Swift.
-The Test C shell script was provided with all job specifications in
-advance and did not require communication between the worker
-nodes and the Swift/Coasters runtime. Thus, this test measures the
-overhead involved in the dynamic job creation and scheduling
-functionality offered by Swift.
-
\newcommand{\plotscale}{0.60}
\begin{figure}
@@ -1407,13 +1407,13 @@
Application CPU utilization for 3 task durations
(in seconds) on up to 2,048 nodes of the Blue Gene/P,
at varying system size. \\
- \includegraphics[scale=\plotscale]{plots/dds}
- & \\
- Test C.
- Efficiency for a fixed number of tasks with varying data sizes.
- Input and out data was one file in each direction of the size
- indicated.
- & \\
+% \includegraphics[scale=\plotscale]{plots/dds}
+% & \\
+% Test C.
+% Efficiency for a fixed number of tasks with varying data sizes.
+% Input and output data was one file in each direction of the size
+% indicated.
+% & \\
\end{tabular}
}
\caption{Swift performance figures.\label{fig:swift-performance}}
@@ -1437,16 +1437,16 @@
\includegraphics[scale=.4]{plots/SEM_IO}
\end{tabular}
}
- \caption{128K-job SEM fMRI application execution on the Ranger Constellation (From publication \cite{CNARI_2009}). Red=active compute jobs, blue=data stage in, green=stage out. }
+ \caption{128K-job SEM fMRI application execution on the Ranger Constellation (From \cite{CNARI_2009}). Red=active compute jobs, blue=data stage in, green=stage out. }
\label{SEMplots}
\end{center}
\end{figure}
-Prior work also showed Swift's ability to achieve ample task rates for local and remote submission to high performance clusters. These prior results are shown in Figure~\ref{TaskPlots} (from~\cite{PetascaleScripting_2009}).
+Prior work also showed Swift's ability to achieve ample task rates for local and remote submission to high performance clusters. These prior results are shown in Figure~\ref{TaskPlots} (from \cite{PetascaleScripting_2009}).
-The left plot in figure \ref{TaskPlots} shows the PTMap application running the stage 1 processing of the E.coli K12 genome (4,127 sequences) on 2,048 Intrepid cores. The lower plot shows processor utilization as time progresses; Overall, the average per task execution time was 64 seconds, with a standard deviation of 14 seconds. These 4,127 tasks consumed a total of 73 CPU hours, in a span of 161 seconds on 2,048 processor cores, achieving 80 percent utilization.
+The top plot in Figure~\ref{TaskPlots}-A shows the PTMap application running the stage 1 processing of the E. coli K12 genome (4,127 sequences) on 2,048 Intrepid cores. The lower plot shows processor utilization as time progresses; overall, the average per-task execution time was 64 seconds, with a standard deviation of 14 seconds. These 4,127 tasks consumed a total of 73 CPU hours in a span of 161 seconds on 2,048 processor cores, achieving 80 percent utilization.
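The 80 percent figure is consistent with the quoted totals: 73 CPU hours is $73 \times 3600 = 262{,}800$ core-seconds, while 2,048 cores over 161 seconds supply $2{,}048 \times 161 = 329{,}728$ core-seconds, giving $262{,}800 / 329{,}728 \approx 0.80$.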
-The right plot in figure \ref{TaskPlots} shows performance of Swift running structural equation modeling problem at large scale using on the Ranger Constellation to model neural pathway connectivity from experimental fMRI data\cite{CNARI_2009}. The left figure shows the active jobs for a larger version of the problem type shown in figure \ref{SEMplots}. This shows a Swift script executing 418,000 structural equation modeling jobs over a 40 hour period.
+The top plot in Figure~\ref{TaskPlots}-B shows the performance of Swift running a structural equation modeling problem at large scale on the Ranger Constellation to model neural pathway connectivity from experimental fMRI data~\cite{CNARI_2009}. The lower plot shows the active jobs for a larger version of the problem type shown in Figure~\ref{SEMplots}. This shows a Swift script executing 418,000 structural equation modeling jobs over a 40-hour period.
\begin{figure}
\begin{center}
@@ -1460,7 +1460,7 @@
B. SEM application on varying-size processing allocations on Ranger\\
\end{tabular}
}
- \caption{Swift task rates for PTMap and SEM applications on the Blue Gene/P and Ranger. (From reference \cite{PetascaleScripting_2009} ) }
+ \caption{Swift task rates for PTMap and SEM applications on the Blue Gene/P and Ranger. (From \cite{PetascaleScripting_2009}) }
\label{TaskPlots}
\end{center}
\end{figure}