[Swift-commit] r7522 - provenancedb
lgadelha at ci.uchicago.edu
Tue Jan 28 12:17:02 CST 2014
Author: lgadelha
Date: 2014-01-28 12:17:02 -0600 (Tue, 28 Jan 2014)
New Revision: 7522
Added:
provenancedb/walkthrough.asciidoc
Modified:
provenancedb/prov-init.sql
provenancedb/prov-to-sql.sh
Log:
Minor fixes
Provenance DB tutorial
Modified: provenancedb/prov-init.sql
===================================================================
--- provenancedb/prov-init.sql 2014-01-27 19:30:57 UTC (rev 7521)
+++ provenancedb/prov-init.sql 2014-01-28 18:17:02 UTC (rev 7522)
@@ -120,7 +120,7 @@
create view function_call as
select fun_call.id, fun_call.name, fun_call.type, app_fun_call.name as app_catalog_name, fun_call.run_id as script_run_id,
- to_timestamp(app_fun_call.start_time) as start_time, app_fun_call.duration, app_fun_call.final_state, app_fun_call.scratch
+ to_timestamp(app_fun_call.start_time) as start_time, app_fun_call.duration, app_fun_call.final_state
from
fun_call
left outer join
Modified: provenancedb/prov-to-sql.sh
===================================================================
--- provenancedb/prov-to-sql.sh 2014-01-27 19:30:57 UTC (rev 7521)
+++ provenancedb/prov-to-sql.sh 2014-01-28 18:17:02 UTC (rev 7522)
@@ -170,7 +170,7 @@
echo " - Function call names."
while read thread appname; do
- fid=$(echo $thread | sed "s/run.../&-$CKSUM/g")
+ fid=$(echo $thread)
echo "UPDATE fun_call SET name='$appname' WHERE id='$fid';" >> /tmp/$RUNID.sql
done < invocation-procedure-names.txt
Added: provenancedb/walkthrough.asciidoc
===================================================================
--- provenancedb/walkthrough.asciidoc (rev 0)
+++ provenancedb/walkthrough.asciidoc 2014-01-28 18:17:02 UTC (rev 7522)
@@ -0,0 +1,114 @@
+= Demonstration of Swift's Provenance Database =
+
+Swift's Provenance Database is a set of scripts, SQL functions and stored procedures, and a query interface. It extracts provenance information from Swift's log files into a relational database. The tools are downloadable through SVN with the command:
+
+--------------------------------------
+svn co https://svn.ci.uchicago.edu/svn/vdl2/provenancedb
+--------------------------------------
+
+== Database Configuration
+
+Swift Provenance Database depends on PostgreSQL, version 9.0 or later, because it uses _Common Table Expressions_ to compute transitive closures of data derivation relationships. The file +prov-init.sql+ contains the database schema, and the file +pql_functions.sql+ contains the function and stored procedure definitions. If the user has not created a provenance database yet, this can be done with the following commands (one may need to add "+-U+ _username_" and "+-h+ _hostname_" before the database name "+provdb+", depending on the database server configuration):
+
+--------------------------------------
+createdb provdb
+psql -f prov-init.sql provdb
+psql -f pql-functions.sql provdb
+--------------------------------------
+
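The optional connection flags mentioned above can be assembled once and reused for all three commands. The following is a minimal sketch, not part of the provenancedb scripts: the function name `provdb_setup_cmds` and its argument order are hypothetical, and the SQL file names are copied from the command block above. It only prints the commands so they can be inspected before being run.

```shell
#!/bin/sh
# Print the database setup commands, adding -U/-h only when given.
# Usage: provdb_setup_cmds DBNAME [USER] [HOST]   (hypothetical helper)
provdb_setup_cmds() {
    db=$1; user=$2; host=$3
    opts=""
    # Append each flag only if the corresponding argument is non-empty.
    [ -n "$user" ] && opts="$opts -U $user"
    [ -n "$host" ] && opts="$opts -h $host"
    echo "createdb$opts $db"
    echo "psql$opts -f prov-init.sql $db"
    echo "psql$opts -f pql-functions.sql $db"
}
```

For example, `provdb_setup_cmds provdb` prints the three commands exactly as listed above, and the output can be piped to `sh` once it looks right.
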
+== Swift Provenance Database Configuration
+
+The file +etc/provenance.config+ should be edited to define the database configuration. The location of the directory containing the log files should be defined in the variable +LOGREPO+. For instance:
+
+--------------------------------------
+export LOGREPO=~/swift-logs/
+--------------------------------------
+
+The command used for connecting to the database should be defined in the variable +SQLCMD+. For example, to connect to CI's PostgreSQL database:
+
+--------------------------------------
+export SQLCMD="psql -h db.ci.uchicago.edu -U provdb provdb"
+--------------------------------------
+
+The script +./swift-prov-import-all-logs+ imports provenance information from the log files in +$LOGREPO+ into the database. One can use +./swift-prov-import-all-logs rebuild+ to reinitialize the database before importing provenance information.
+
+== Swift Configuration
+
+To enable the generation of provenance information in Swift's log files, and to transfer wrapper logs back to the submitting machine so that runtime behavior information can be extracted, the options +provenance.log+ and +wrapperlog.always.transfer+ should be set to true in +etc/swift.properties+:
+
+--------------------------------------
+provenance.log=true
+wrapperlog.always.transfer=true
+--------------------------------------
+
+If Swift's SVN revision is 3417 or greater, the following options should be set in +etc/log4j.properties+:
+
+--------------------------------------
+log4j.logger.swift=DEBUG
+log4j.logger.org.griphyn.vdl.karajan.lib=DEBUG
+--------------------------------------
+
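Before running a workflow, it can help to confirm that the two options are actually set in the properties file. The following is a minimal sketch, assuming simple `key=value` lines; `check_swift_props` is a hypothetical helper, not a Swift command.

```shell
#!/bin/sh
# Succeed only if every required option is set to true in the given file.
# check_swift_props is a hypothetical helper, not part of Swift.
check_swift_props() {
    file=$1
    status=0
    for opt in provenance.log wrapperlog.always.transfer; do
        # Match "option=true" on a line of its own, allowing spaces around "=".
        # The dots in the option names act as regex wildcards, which is
        # harmless for this check.
        if ! grep -Eq "^[[:space:]]*$opt[[:space:]]*=[[:space:]]*true[[:space:]]*\$" "$file"; then
            echo "missing or not true: $opt" >&2
            status=1
        fi
    done
    return $status
}
```

A call such as `check_swift_props etc/swift.properties` returns a non-zero status and names the offending option on standard error if either setting is absent.
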
+== Demonstration: Image Rendering Workflow
+
+The workflow creates a number of scene files that are rendered using the +c-ray+ application. The images generated are given as input to the +convert+ application to generate a movie.
+
+The Swift scripts can be downloaded from:
+
+--------------------------------------
+http://www.lncc.br/~lgadelha/c-ray-swift.tgz
+--------------------------------------
+
+The image rendering application source code can be downloaded from:
+
+--------------------------------------
+http://www.futuretech.blinkenlights.nl/depot/c-ray-1.1.tar.gz
+--------------------------------------
+
+The workflow also requires ImageMagick with ffmpeg support. It accepts the following arguments:
+
+- +resolution+: the image resolution in pixels, default is 800x600,
+- +threads+: the number of execution threads of the image rendering application (c-ray), default is 1,
+- +steps+: number of frames to be rendered, default is 10,
+- +delay+: delay between video frames in hundredths of a second, default is 0,
+- +quality+: quality of mpeg4 video, default is 75.
+
+To run the workflow, one can use, for instance:
+
+--------------------------------------
+$ swift -sites.file sites.xml -tc.file tc.data c-ray.swift -resolution=1366x768 -threads=1 -steps=30 -quality=95 -delay=50
+--------------------------------------
+
+To import provenance into the database and connect to the database one can use:
+
+--------------------------------------
+$ swift-prov-import-all-logs
+$ psql provdb
+--------------------------------------
+
+To see which script runs are recorded in the database, one can query the +script_run_summary+ view:
+
+--------------------------------------
+provdb=> select * from script_run_summary;
+
+ id | swift_version | cog_version | final_state | start_time | duration | script_filename
+-------------------------+---------------+-------------+-------------+----------------------------+----------+-----------------
+ c-ray-run000-2046746221 | 7447 | 3852 | SUCCESS | 2014-01-28 10:28:47.692-06 | 348.127 | c-ray.swift
+--------------------------------------
+
+The +dataset_all+ view shows information about all datasets manipulated during a script run:
+
+--------------------------------------
+provdb=> select * from dataset_all;
+
+ dataset_id | dataset_type | dataset_value | dataset_filename
+---------------------------------------------+--------------+-----------------+------------------------------
+ dataset:20140128-1028-66i4t8x1:720000000026 | mapped | | file://localhost/tscene
+ dataset:20140128-1028-66i4t8x1:720000000041 | mapped | | file://localhost/output.mpeg
+ dataset:20140128-1028-66i4t8x1:720000000001 | primitive | 1 |
+ dataset:20140128-1028-66i4t8x1:720000000002 | primitive | .ppm |
+ dataset:20140128-1028-66i4t8x1:720000000003 | primitive | 800x600 |
+ dataset:20140128-1028-66i4t8x1:720000000004 | primitive | resolution |
+ dataset:20140128-1028-66i4t8x1:720000000005 | primitive | steps |
+ ...
+--------------------------------------
+
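The +dataset_all+ output above mixes mapped files with primitive values, so a filtered query is often more readable. The following is a minimal sketch: `dataset_query` is a hypothetical helper, the column names are taken from the output shown above, and the argument is not escaped, so it should only be given trusted values.

```shell
#!/bin/sh
# Print a query listing the datasets of one type from the dataset_all view.
# dataset_query is a hypothetical helper; the argument is inserted verbatim.
dataset_query() {
    dtype=$1
    echo "select dataset_id, dataset_filename from dataset_all where dataset_type = '$dtype';"
}

# Example (requires SQLCMD to be set as in etc/provenance.config):
#   dataset_query mapped | $SQLCMD
```

Printing the SQL rather than running it keeps the helper usable with whatever connection command is configured in +SQLCMD+.
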