[Swift-commit] r7522 - provenancedb

lgadelha at ci.uchicago.edu lgadelha at ci.uchicago.edu
Tue Jan 28 12:17:02 CST 2014


Author: lgadelha
Date: 2014-01-28 12:17:02 -0600 (Tue, 28 Jan 2014)
New Revision: 7522

Added:
   provenancedb/walkthrough.asciidoc
Modified:
   provenancedb/prov-init.sql
   provenancedb/prov-to-sql.sh
Log:
Minor fixes
Provenance DB tutorial


Modified: provenancedb/prov-init.sql
===================================================================
--- provenancedb/prov-init.sql	2014-01-27 19:30:57 UTC (rev 7521)
+++ provenancedb/prov-init.sql	2014-01-28 18:17:02 UTC (rev 7522)
@@ -120,7 +120,7 @@
 
 create view function_call as 
     select fun_call.id, fun_call.name, fun_call.type, app_fun_call.name as app_catalog_name, fun_call.run_id as script_run_id,  
-           to_timestamp(app_fun_call.start_time) as start_time, app_fun_call.duration, app_fun_call.final_state, app_fun_call.scratch
+           to_timestamp(app_fun_call.start_time) as start_time, app_fun_call.duration, app_fun_call.final_state
     from
       fun_call
     left outer join

Modified: provenancedb/prov-to-sql.sh
===================================================================
--- provenancedb/prov-to-sql.sh	2014-01-27 19:30:57 UTC (rev 7521)
+++ provenancedb/prov-to-sql.sh	2014-01-28 18:17:02 UTC (rev 7522)
@@ -170,7 +170,7 @@
 
 echo "    - Function call names."
 while read thread appname; do
-    fid=$(echo $thread | sed "s/run.../&-$CKSUM/g")
+    fid=$(echo $thread)
     echo  "UPDATE fun_call SET name='$appname' WHERE id='$fid';"  >> /tmp/$RUNID.sql
 done < invocation-procedure-names.txt
 

Added: provenancedb/walkthrough.asciidoc
===================================================================
--- provenancedb/walkthrough.asciidoc	                        (rev 0)
+++ provenancedb/walkthrough.asciidoc	2014-01-28 18:17:02 UTC (rev 7522)
@@ -0,0 +1,114 @@
+= Demonstration of Swift's Provenance Database =
+
+Swift's Provenance Database is a set of scripts, SQL functions and stored procedures, and a query interface. It extracts provenance information from Swift's log files into a relational database. The tools are downloadable through SVN with the command:
+
+--------------------------------------
+svn co https://svn.ci.uchicago.edu/svn/vdl2/provenancedb
+--------------------------------------
+
+== Database Configuration
+
+Swift Provenance Database requires PostgreSQL 9.0 or later because it uses _Common Table Expressions_ to compute transitive closures of data derivation relationships. The file +prov-init.sql+ contains the database schema, and the file +pql_functions.sql+ contains the function and stored procedure definitions. If no provenance database exists yet, one can be created with the following commands (one may need to add "+-U+ _username_" and "+-h+ _hostname_" before the database name "+provdb+", depending on the database server configuration):
+
+--------------------------------------
+createdb provdb
+psql -f prov-init.sql provdb
+psql -f pql-functions.sql provdb
+--------------------------------------
+
+== Swift Provenance Database Configuration
+
+The file +etc/provenance.config+ should be edited to define the database configuration. The location of the directory containing the log files should be defined in the variable +LOGREPO+. For instance:
+
+--------------------------------------
+export LOGREPO=~/swift-logs/
+--------------------------------------
+
+The command used for connecting to the database should be defined in the variable +SQLCMD+. For example, to connect to CI's PostgreSQL database:
+
+--------------------------------------
+export SQLCMD="psql -h db.ci.uchicago.edu -U provdb provdb"
+--------------------------------------
+
+The script +./swift-prov-import-all-logs+ imports provenance information from the log files in +$LOGREPO+ into the database. One can use +./swift-prov-import-all-logs rebuild+ to reinitialize the database before importing provenance information.
+
+== Swift Configuration
+
+To enable the generation of provenance information in Swift's log files, and to transfer wrapper logs back to the submitting machine so that runtime behavior information can be extracted, the options +provenance.log+ and +wrapperlog.always.transfer+ should be set to true in +etc/swift.properties+:
+
+--------------------------------------
+provenance.log=true
+wrapperlog.always.transfer=true
+--------------------------------------
+
+If Swift's SVN revision is 3417 or greater, the following options should be set in +etc/log4j.properties+:
+
+--------------------------------------
+log4j.logger.swift=DEBUG
+log4j.logger.org.griphyn.vdl.karajan.lib=DEBUG
+--------------------------------------
+
+== Demonstration: Image Rendering Workflow
+
+The workflow creates a number of scene files that are rendered using the +c-ray+ application. The images generated are given as input to the +convert+ application to generate a movie.  
+
+The Swift scripts can be downloaded from:
+
+--------------------------------------
+http://www.lncc.br/~lgadelha/c-ray-swift.tgz
+--------------------------------------
+
+The image rendering application source code can be downloaded from:
+
+--------------------------------------
+http://www.futuretech.blinkenlights.nl/depot/c-ray-1.1.tar.gz
+--------------------------------------
+
+The workflow also requires ImageMagick with ffmpeg support. It accepts the following arguments:
+
+- +resolution+: the image resolution in pixels, default is 800x600, 
+- +threads+: the number of execution threads of the image rendering application (c-ray), default is 1,  
+- +steps+: number of frames to be rendered, default is 10,
+- +delay+: delay between video frames in hundredths of a second, default is 0,
+- +quality+: quality of mpeg4 video, default is 75.
+
+To run the workflow one can use for instance:
+
+--------------------------------------
+$ swift -sites.file sites.xml -tc.file tc.data c-ray.swift -resolution=1366x768 -threads=1 -steps=30 -quality=95 -delay=50
+--------------------------------------
+
+To import provenance into the database and connect to the database one can use:
+
+--------------------------------------
+$ swift-prov-import-all-logs
+$ psql provdb
+--------------------------------------
+
+To see which script runs are recorded in the database, one can query the +script_run_summary+ view:
+
+--------------------------------------
+provdb=> select * from script_run_summary;
+
+           id            | swift_version | cog_version | final_state |         start_time         | duration | script_filename 
+-------------------------+---------------+-------------+-------------+----------------------------+----------+-----------------
+ c-ray-run000-2046746221 | 7447          | 3852        | SUCCESS     | 2014-01-28 10:28:47.692-06 |  348.127 | c-ray.swift
+--------------------------------------
+
+The +dataset_all+ view stores information about all datasets manipulated during a script run:
+
+--------------------------------------
+provdb=> select * from dataset_all;
+
+                 dataset_id                  | dataset_type |  dataset_value  |       dataset_filename       
+---------------------------------------------+--------------+-----------------+------------------------------
+ dataset:20140128-1028-66i4t8x1:720000000026 | mapped       |                 | file://localhost/tscene
+ dataset:20140128-1028-66i4t8x1:720000000041 | mapped       |                 | file://localhost/output.mpeg
+ dataset:20140128-1028-66i4t8x1:720000000001 | primitive    | 1               | 
+ dataset:20140128-1028-66i4t8x1:720000000002 | primitive    | .ppm            | 
+ dataset:20140128-1028-66i4t8x1:720000000003 | primitive    | 800x600         | 
+ dataset:20140128-1028-66i4t8x1:720000000004 | primitive    | resolution      | 
+ dataset:20140128-1028-66i4t8x1:720000000005 | primitive    | steps           | 
+ ...
+--------------------------------------
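
The two queries at the end of the walkthrough can also be run non-interactively through the command stored in SQLCMD. The following is only a minimal shell sketch under stated assumptions: the helper names list_runs and list_mapped_files are hypothetical and not part of provenancedb, SQLCMD falls back to the local "psql provdb" used in the walkthrough, and the column names are taken from the query output shown above. The psql options -A (unaligned output), -t (rows only) and -c (run one command) are standard.

```shell
#!/bin/sh
# Sketch only: list_runs and list_mapped_files are hypothetical helpers.
# SQLCMD is expected to hold a psql command line, as in etc/provenance.config;
# if it is unset, fall back to the local database used in the walkthrough.
: "${SQLCMD:=psql provdb}"

# Print id, final state and duration of every recorded script run.
list_runs() {
    # $SQLCMD is deliberately unquoted so that a value such as
    # "psql -h host -U user provdb" is split into separate words.
    $SQLCMD -A -t -c "select id, final_state, duration from script_run_summary;"
}

# Print the file names of all datasets whose type is 'mapped'.
list_mapped_files() {
    $SQLCMD -A -t -c "select dataset_filename from dataset_all where dataset_type = 'mapped';"
}
```

After sourcing the file, calling list_runs or list_mapped_files prints one row per line with columns separated by "|", which is convenient for piping into other shell tools. Leaving $SQLCMD unquoted is a deliberate choice here; it assumes the variable contains no arguments with embedded spaces.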
+ 



