[Swift-commit] r3927 - in text/parco10submission: . code

noreply at svn.ci.uchicago.edu noreply at svn.ci.uchicago.edu
Sun Jan 9 19:34:47 CST 2011


Author: wilde
Date: 2011-01-09 19:34:46 -0600 (Sun, 09 Jan 2011)
New Revision: 3927

Modified:
   text/parco10submission/code/modis.swift
   text/parco10submission/paper.tex
Log:
Addressed Dan's comments on the MODIS code, and corrected other typos in that section.

Modified: text/parco10submission/code/modis.swift
===================================================================
--- text/parco10submission/code/modis.swift	2011-01-10 01:11:44 UTC (rev 3926)
+++ text/parco10submission/code/modis.swift	2011-01-10 01:34:46 UTC (rev 3927)
@@ -58,13 +58,13 @@
 
 # Find the top N tiles (by total area of selected landuse types)
 
-file topSelected<"topselected.txt">;
-file selectedTiles<"selectedtiles.txt">;
+file topSelected <"topselected.txt">;
+file selectedTiles <"selectedtiles.txt">;
 (topSelected, selectedTiles) = analyzeLandUse(land, landType, nSelect);
 
 # Mark the top N tiles on a sinusoidal gridded map
 
-image gridMap<"markedGrid.gif">;
+image gridMap <"markedGrid.gif">;
 gridMap = markMap(topSelected);
 
 # Create multi-color images for all tiles
@@ -80,5 +80,4 @@
 # Assemble a montage of the top selected areas
 
 image montage <single_file_mapper; file=@strcat(runID,"/","map.png") >; # @arg
-montage = assemble(selectedTiles,colorImage,webDir);
-
+montage = assemble(selectedTiles,colorImage,webDir);
\ No newline at end of file

Modified: text/parco10submission/paper.tex
===================================================================
--- text/parco10submission/paper.tex	2011-01-10 01:11:44 UTC (rev 3926)
+++ text/parco10submission/paper.tex	2011-01-10 01:34:46 UTC (rev 3927)
@@ -1235,16 +1235,15 @@
 color images of those closest-matching data tiles.
 (A color rendering step is required to do this, as the input datasets are not viewable images; their pixel
 values are land-use codes.) A typical invocation of this script would be ``\emph{find the top 12 urban tiles}'' or ``\emph{find the 16 tiles with the most forest and grassland}''. As this script is used for tutorial purposes, the application programs it calls are simple shell scripts that use fast, generic image processing applications to process the MODIS data. Thus the example executes quickly while serving as a realistic tutorial script for much more compute-intensive satellite data processing applications.
-\\
-\\
+
 The script is structured as follows:
-Lines 1-3 define 3 mapped file types -- {\tt  MODISfile} for the input images, {\tt landuse} for the output of the landuse histogram calculation; and {\tt file} for any other generic file that we don't care to assign a unique type to.
+Lines 1-3 define three mapped file types -- {\tt  MODISfile} for the input images, {\tt landuse} for the output of the landuse histogram calculation, and {\tt file} for any other generic file that we don't wish to assign a unique type to.
 
 Lines 7-32 define the Swift interface functions for the application programs {\tt getLandUse}, {\tt analyzeLandUse}, {\tt colorMODIS}, {\tt assemble}, and {\tt markMap}.
 
-Lines 36-41 extract a set of science parameters from the {\tt swift} command line with which the user invokes the script.
-These indicate the number of files of the input set to select (to enable processing the first M of N files), the set of land cover types to select, the number of ``top'' tiles to select, and parameters used to locate input and output directories.
-\katznote{not sure it these syntaxes were explained in section 2 clearly - if not, they probably should be added to section 2}
+Lines 36-41 use the built-in function {\tt @arg()} to extract a set of science parameters from the {\tt swift} command line arguments with which the user invokes the script. (This is a keyword-based analog of C's {\tt argv[]} convention.)
+These parameters indicate the number of files of the input set to select (to enable processing the first M of N files), the set of land cover types to select, the number of ``top'' tiles to select, and parameters used to locate input and output directories.
+%\katznote{DONE: not sure it these syntaxes were explained in section 2 clearly - if not, they probably should be added to section 2}
 
 Lines 47-48 invoke an ``external'' mapper script {\tt modis.mapper} to map the first {\tt nFiles} MODIS data files in the directory contained in the script argument {\tt MODISdir} to the array {\tt geos}. An external mapper script is written by the Swift programmer (in any language desired, but quite often mappers are simple shell scripts). External mappers are usually co-located with the Swift script, and are invoked when Swift instantiates the associated variable. They return a two-field list of the form \emph{SwiftExpression, filename}, where \emph{SwiftExpression} is relative to the variable name being mapped.  For example, if this mapper invocation were called from the Swift script at lines 47-48:
 \begin{Verbatim}[fontsize=\scriptsize,framesep=2mm]
@@ -1259,13 +1258,17 @@
 
 At lines 52-53, the script declares the array {\tt land} which will contain the output of the {\tt getlanduse} application. This declaration uses the built-in ``structured regular expression mapper'', which will determine the names of the \emph{output} files that the array will refer to once they are computed. Swift knows from context that this is an output mapping. The mapper will use regular expressions to base the names of the output files on the filenames of the corresponding elements of the input array {\tt geos} given by the {\tt source=} argument to the mapper. The declaration for {\tt land[]} maps, for example, a file {\tt h07v08.landuse.byfreq} to an element of the {\tt land[]} array for a file {\tt h07v08.tif} in the {\tt geos[]} array.
 
-At lines 55-57 the script performs its first computation using a {\tt foreach} loop to invoke {\tt getLandUse} in parallel on each file mapped to the elements of {\tt geos[]}. As 317 files were mapped (in lines 47-48), the loop will invoke 317 instances of the application in parallel. \katznote{is this strictly true?  Do you want to say that it will enable 317 instances to be runnable in parallel, but the number that are actually run in parallel depends on the hardware available to Swift, or something like that?} The result of each computation is placed in a file mapped to the array {\tt land} and named by the regular expression translation to be based on the file names mapped to the array {\tt geos[]} (in lines \katznote{is this 52-53?}). Thus the landuse histogram for file {\tt /home/wilde/modis/2002/h00v08.tif} would be written into file {\tt h00v08.landuse.freq} and would be considered by Swift to be of type {\tt landuse}.
+At lines 55-57 the script performs its first computation using a {\tt foreach} loop to invoke {\tt getLandUse} in parallel on each file mapped to the elements of {\tt geos[]}. As 317 files were mapped (in lines 47-48), the loop will submit 317 instances of the application in parallel to the underlying execution provider. These will execute with a degree of parallelism subject to available resources. 
+%\katznote{DONE: is this strictly true?  Do you want to say that it will enable 317 instances to be runnable in parallel, but the number that are actually run in parallel depends on the hardware available to Swift, or something like that?}
+The result of each computation is placed in a file mapped to the array {\tt land} (declared at lines 52-53) and named, by the regular expression translation, after the file names mapped to the array {\tt geos[]}.
+Thus the landuse histogram for file {\tt /home/wilde/modis/2002/h00v08.tif} would be written into file {\tt h00v08.landuse.freq} and would be considered by Swift to be of type {\tt landuse}.
 
-Once all the land usage histograms have have been computed, the script can then execute {\tt analyzeLandUse} at line 63 to find the requested number of highest tiles (files) with a specific land cover combination. The Swift runtime system uses futures to ensure that this analysis function is not invoked until all of its input files have computed and transported to the computation site chosen to run the analysis program. All of these steps take place automatically, using the relatively simple and location-independent Swift expressions shown. The output files to be use to hold the result are specified in the declarations at lines 61-62. \katznote{should these lines have a space inserted before the ``<'' to match the previous lines?  Same question for 67-68... }
+Once all the land usage histograms have been computed, the script can then execute {\tt analyzeLandUse} at line 63 to find the requested number of highest tiles (files) with a specific land cover combination. The Swift runtime system uses futures to ensure that this analysis function is not invoked until all of its input files have been computed and transported to the computation site chosen to run the analysis program. All of these steps take place automatically, using the relatively simple and location-independent Swift expressions shown. The output files to be used for the result are specified in the declarations at lines 61-62.
+% \katznote{DONE: should these lines have a space inserted before the ``<'' to match the previous lines?  Same question for 67-68... }
 
-To visualize the results, the application function {\tt markMap} invoked at line 68 will generate an image of a world map using the MODIS projection system and indicate the selected tiles matching the analysis criteria. Since this statememt depends on the output of the analysis ({\tt topSelected}), it will wait for statement at line 63 to complete before commencing.
+To visualize the results, the application function {\tt markMap} invoked at line 68 will generate an image of a world map using the MODIS projection system and indicate the selected tiles matching the analysis criteria. Since this statement depends on the output of the analysis ({\tt topSelected}), it will wait for the statement at line 63 to complete before commencing.
 
-For additional visualization, the script assembles a full map of all the input tiles, placed in their proper grid location on the MODIS world map projection, and again marking the selected tiles. Since this operation needs true-color images of every input tiles these are computed---again in \katznote{potentially? as before} parallel---with 317 jobs invoked by the foreach statement at line 76-78. The power of Swift's implicit parallelization is shown vividly here: since the {\tt colorMODIS} call at line 77 depends only on the input array {\tt geos}, these 317 application invocations are executed in parallel with the initial 317 parallel executions of the {\tt getLandUse} application at line 56.  The script concludes at line 83 by assembling a montage of all the colored tiles and writing this image file to a web-accessible directory for viewing.
+For additional visualization, the script assembles a full map of all the input tiles, placed in their proper grid location on the MODIS world map projection, and again marking the selected tiles. Since this operation needs true-color images of every input tile, these are computed--again in parallel--with 317 jobs generated by the foreach statement at lines 76-78. The power of Swift's implicit parallelization is shown vividly here: since the {\tt colorMODIS} call at line 77 depends only on the input array {\tt geos}, these 317 application invocations are submitted in parallel with the initial 317 parallel executions of the {\tt getLandUse} application at line 56.  The script concludes at line 83 by assembling a montage of all the colored tiles and writing this image file to a web-accessible directory for viewing.
 
 \pagebreak
 {\bf \small Swift example 1: MODIS satellite image processing script}
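
The dataflow that the revised paper.tex text describes (two independent foreach loops over geos[], followed by an analysis step that waits on all landuse outputs) can be sketched outside Swift. The following is a minimal Python sketch under stated assumptions, not the modis.swift script and not how the Swift runtime is implemented. The function names, the ".color.png" suffix, the tile paths in the usage note and the numeric "score" are hypothetical stand-ins for the real applications. The ".landuse.byfreq" suffix is one of the two spellings that appear in the paper text.

```python
import re
from concurrent.futures import ThreadPoolExecutor

# Matches MODIS tile file names such as h07v08.tif (assumed naming pattern).
TILE_RE = re.compile(r"h(\d{2})v(\d{2})\.tif$")


def landuse_name(tile_path, suffix=".landuse.byfreq"):
    """Derive an output file name from an input tile name,
    in the spirit of the structured regexp mapper."""
    m = TILE_RE.search(tile_path)
    if m is None:
        raise ValueError(f"not a MODIS tile name: {tile_path}")
    return f"h{m.group(1)}v{m.group(2)}{suffix}"


def get_land_use(tile_path):
    """Stand-in for the getLandUse application: returns the output
    name and a made-up score derived from the tile coordinates."""
    name = landuse_name(tile_path)
    m = TILE_RE.search(tile_path)
    score = int(m.group(1)) * 100 + int(m.group(2))
    return name, score


def color_modis(tile_path):
    """Stand-in for colorMODIS: only computes the output name."""
    return landuse_name(tile_path, suffix=".color.png")


def analyze_land_use(land, n_select):
    """Stand-in for analyzeLandUse: pick the n_select highest scores."""
    ranked = sorted(land, key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:n_select]]


def run(tiles, n_select, max_workers=4):
    """Submit both per-tile loops at once; the analysis waits only
    on the landuse futures, mirroring the dependency in the script."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        land = [pool.submit(get_land_use, t) for t in tiles]
        color = [pool.submit(color_modis, t) for t in tiles]
        # result() blocks until each landuse output is available.
        top = analyze_land_use([f.result() for f in land], n_select)
        images = [f.result() for f in color]
    return top, images
```

Calling run() with a list of tile paths and n_select=2 returns the two highest-scoring landuse file names plus one color image name per tile. All tasks are submitted up front, but max_workers bounds how many execute at once, which corresponds to the distinction the revised text draws between submitting 317 instances and the degree of parallelism actually achieved.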



