[Swift-commit] r2692 - provenancedb
noreply at svn.ci.uchicago.edu
Mon Mar 16 08:20:34 CDT 2009
Author: benc
Date: 2009-03-16 08:20:33 -0500 (Mon, 16 Mar 2009)
New Revision: 2692
Modified:
provenancedb/provenance.xml
provenancedb/swift-about-dataset
provenancedb/swift-about-execution
provenancedb/swift-about-filename
Log:
Make the existing swift-about-* commands which are hardcoded to postgres
use configuration file to pick their database
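
For orientation, the pattern this revision introduces can be sketched as follows. The contents of etc/provenance.config are not shown in this commit, so the SQLCMD values in the comments are assumptions: the psql line reuses the flags from the removed lines, the sqlite3 line and its database file name are hypothetical, and `cat` is only a stand-in so that the sketch runs without a database. The helper name `run_query` is also hypothetical.

```shell
#!/bin/bash
# Sketch only: etc/provenance.config is not part of this diff, so the
# SQLCMD values below are assumptions, not the committed settings.
#
#   SQLCMD="psql -p 5435 -d provdb -U benc"   # flags taken from the removed lines
#   SQLCMD="sqlite3 provdb"                   # hypothetical SQLite database file
#
# Stand-in that prints the SQL it receives, so this runs without a database.
SQLCMD="cat"

# Hypothetical helper: send one SQL statement to whatever SQLCMD names.
# $SQLCMD is left unquoted on purpose so that its flags split into words.
run_query() {
  echo "$1" | $SQLCMD
}

ID="example-dataset-id"
run_query "select filename from dataset_filenames where dataset_id='$ID';"
```

With a real SQLCMD the same pipe would deliver the statement to the chosen database client on standard input, which is the mechanism the modified scripts rely on.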
Modified: provenancedb/provenance.xml
===================================================================
--- provenancedb/provenance.xml 2009-03-16 13:19:45 UTC (rev 2691)
+++ provenancedb/provenance.xml 2009-03-16 13:20:33 UTC (rev 2692)
@@ -51,69 +51,6 @@
</screen>
</para>
-<para>There are several swift-about- commands:
-</para>
-<para>swift-about-filename - returns the global dataset IDs for the specified
-filename. Several runs may have output the same filename; the provenance
-database cannot tell which run (if any) any file with that name that
-exists now came from.
-</para>
-<para>Example: this looks for information about
-<filename>001-echo.out</filename> which is the output of the first
-test in the language-behaviour test suite:
-<screen>
-$ <userinput>./swift-about-filename 001-echo.out</userinput>
-Dataset IDs for files that have name file://localhost/001-echo.out
- tag:benc@ci.uchicago.edu,2008:swift:dataset:20080114-1353-g1y3moc0:720000000001
- tag:benc@ci.uchicago.edu,2008:swift:dataset:20080107-1440-67vursv4:720000000001
- tag:benc@ci.uchicago.edu,2008:swift:dataset:20080107-2146-ja2r2z5f:720000000001
- tag:benc@ci.uchicago.edu,2008:swift:dataset:20080107-1608-itdd69l6:720000000001
- tag:benc@ci.uchicago.edu,2008:swift:dataset:20080303-1011-krz4g2y0:720000000001
- tag:benc@ci.uchicago.edu,2008:swift:dataset:20080303-1100-4in9a325:720000000001
-</screen>
-Six different datasets in the provenance database have had that filename
-(because six language behaviour test runs have been uploaded to the
-database).
-</para>
-
-<para>swift-about-dataset - returns information about a dataset, given
-that dataset's uri. Returned information includes the IDs of a containing
-dataset, datasets contained within this dataset, and IDs for executions
-that used this dataset as input or output.
-</para>
-<para>Example:
-<screen>
-$ <userinput>./swift-about-dataset tag:benc@ci.uchicago.edu,2008:swift:dataset:20080114-1353-g1y3moc0:720000000001</userinput>
-About dataset tag:benc@ci.uchicago.edu,2008:swift:dataset:20080114-1353-g1y3moc0:720000000001
-That dataset has these filename(s):
- file://localhost/001-echo.out
-
-That dataset is part of these datasets:
-
-That dataset contains these datasets:
-
-That dataset was input to the following executions (as the specified named parameter):
-
-That dataset was output from the following executions (as the specified return parameter):
- tag:benc@ci.uchicago.edu,2008:swiftlogs:execute:001-echo-20080114-1353-n7puv429:0 | t
-</screen>
-This shows that this dataset is not part of a more complicated dataset
-structure, and was produced as an output parameter t from an execution.
-</para>
-<para>swift-about-execution - gives information about an execution, given
-an execution ID
-<screen>
-$ <userinput>./swift-about-execution tag:benc@ci.uchicago.edu,2008:swiftlogs:execute:001-echo-20080114-1353-n7puv429:0</userinput>
-About execution tag:benc@ci.uchicago.edu,2008:swiftlogs:execute:001-echo-20080114-1353-n7puv429:0
- id | starttime | duration | finalstate | app | scratch
-----------------------------------------------------------------------------------------------------------------------------------+----------------+-------------------+----------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------
- tag:benc@ci.uchicago.edu,2008:swiftlogs:execute:001-echo-20080114-1353-n7puv429:0 | 1200318839.393 | 0.743000030517578 | 0 | END_SUCCESS | echo
-(1 row)
-</screen>
-This shows some basic information about the execution - the start time,
-the duration, the name of the application, the final status.
-</para>
-
</section>
<section><title>Direct database access on terminable</title>
@@ -189,6 +126,10 @@
</section>
<section><title>Querying the newly generated database</title>
<para>
+You can use <command>swift-about-*</command> commands, described in
+the <link linkend="commands">commands section</link>.
+</para>
+<para>
If you're using the SQLite database, you can get an interactive SQL
session to query your new provenance database like this:
<screen>
@@ -204,6 +145,73 @@
</section>
+<section id="commands"><title>swift-about-* commands</title>
+<para>There are several swift-about- commands:
+</para>
+<para>swift-about-filename - returns the global dataset IDs for the specified
+filename. Several runs may have output the same filename; the provenance
+database cannot tell which run (if any) any file with that name that
+exists now came from.
+</para>
+<para>Example: this looks for information about
+<filename>001-echo.out</filename> which is the output of the first
+test in the language-behaviour test suite:
+<screen>
+$ <userinput>./swift-about-filename 001-echo.out</userinput>
+Dataset IDs for files that have name file://localhost/001-echo.out
+ tag:benc@ci.uchicago.edu,2008:swift:dataset:20080114-1353-g1y3moc0:720000000001
+ tag:benc@ci.uchicago.edu,2008:swift:dataset:20080107-1440-67vursv4:720000000001
+ tag:benc@ci.uchicago.edu,2008:swift:dataset:20080107-2146-ja2r2z5f:720000000001
+ tag:benc@ci.uchicago.edu,2008:swift:dataset:20080107-1608-itdd69l6:720000000001
+ tag:benc@ci.uchicago.edu,2008:swift:dataset:20080303-1011-krz4g2y0:720000000001
+ tag:benc@ci.uchicago.edu,2008:swift:dataset:20080303-1100-4in9a325:720000000001
+</screen>
+Six different datasets in the provenance database have had that filename
+(because six language behaviour test runs have been uploaded to the
+database).
+</para>
+
+<para>swift-about-dataset - returns information about a dataset, given
+that dataset's uri. Returned information includes the IDs of a containing
+dataset, datasets contained within this dataset, and IDs for executions
+that used this dataset as input or output.
+</para>
+<para>Example:
+<screen>
+$ <userinput>./swift-about-dataset tag:benc@ci.uchicago.edu,2008:swift:dataset:20080114-1353-g1y3moc0:720000000001</userinput>
+About dataset tag:benc@ci.uchicago.edu,2008:swift:dataset:20080114-1353-g1y3moc0:720000000001
+That dataset has these filename(s):
+ file://localhost/001-echo.out
+
+That dataset is part of these datasets:
+
+That dataset contains these datasets:
+
+That dataset was input to the following executions (as the specified named parameter):
+
+That dataset was output from the following executions (as the specified return parameter):
+ tag:benc@ci.uchicago.edu,2008:swiftlogs:execute:001-echo-20080114-1353-n7puv429:0 | t
+</screen>
+This shows that this dataset is not part of a more complicated dataset
+structure, and was produced as an output parameter t from an execution.
+</para>
+<para>swift-about-execution - gives information about an execution, given
+an execution ID
+<screen>
+$ <userinput>./swift-about-execution tag:benc@ci.uchicago.edu,2008:swiftlogs:execute:001-echo-20080114-1353-n7puv429:0</userinput>
+About execution tag:benc@ci.uchicago.edu,2008:swiftlogs:execute:001-echo-20080114-1353-n7puv429:0
+ id | starttime | duration | finalstate | app | scratch
+----------------------------------------------------------------------------------------------------------------------------------+----------------+-------------------+----------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------
+ tag:benc@ci.uchicago.edu,2008:swiftlogs:execute:001-echo-20080114-1353-n7puv429:0 | 1200318839.393 | 0.743000030517578 | 0 | END_SUCCESS | echo
+(1 row)
+</screen>
+This shows some basic information about the execution - the start time,
+the duration, the name of the application, the final status.
+</para>
+
+</section>
+
+
<section><title>What this work does not address</title>
<para>This work explicitly excludes a number of uses which traditionally
Modified: provenancedb/swift-about-dataset
===================================================================
--- provenancedb/swift-about-dataset 2009-03-16 13:19:45 UTC (rev 2691)
+++ provenancedb/swift-about-dataset 2009-03-16 13:20:33 UTC (rev 2692)
@@ -1,5 +1,8 @@
#!/bin/bash
+PROVDIR=$(pwd)/$(dirname $0)/
+source $PROVDIR/etc/provenance.config
+
ID=$1
#tag:benc@ci.uchicago.edu,2008:swift:dataset:20080114-1353-g1y3moc0:720000000001
@@ -7,18 +10,21 @@
echo About dataset $ID
echo "That dataset has these filename(s):"
-psql -p 5435 -d provdb -U benc --tuples-only -c "select filename from dataset_filenames where dataset_id='$ID';"
+echo "select filename from dataset_filenames where dataset_id='$ID';" | $SQLCMD
+echo
echo "That dataset is part of these datasets:"
-psql -p 5435 -d provdb -U benc --tuples-only -c "select outer_dataset_id from dataset_containment where inner_dataset_id='$ID';"
+echo "select outer_dataset_id from dataset_containment where inner_dataset_id='$ID';" | $SQLCMD
+echo
echo "That dataset contains these datasets:"
-psql -p 5435 -d provdb -U benc --tuples-only -c "select inner_dataset_id from dataset_containment where outer_dataset_id='$ID';"
+echo "select inner_dataset_id from dataset_containment where outer_dataset_id='$ID';" | $SQLCMD
echo "That dataset was input to the following executions (as the specified named parameter):"
+echo "select execute_id, param_name from dataset_usage where dataset_id='$ID' and direction='I';" | $SQLCMD
+echo
-psql -p 5435 -d provdb -U benc --tuples-only -c "select execute_id, param_name from dataset_usage where dataset_id='$ID' and direction='I';"
-
echo "That dataset was output from the following executions (as the specified return parameter):"
-psql -p 5435 -d provdb -U benc --tuples-only -c "select execute_id, param_name from dataset_usage where dataset_id='$ID' and direction='O';"
+echo "select execute_id, param_name from dataset_usage where dataset_id='$ID' and direction='O';" | $SQLCMD
+echo
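
The queries in this script splice $ID directly into a single-quoted SQL literal, so an ID that contains a ' character would break the statement. One possible guard, offered as an assumption and not as part of this commit, is to double any single quotes before building the query. The helper name `sql_quote` is hypothetical, and the sketch covers the quote character only.

```shell
#!/bin/bash
# Hypothetical helper, not part of the commit: double every single quote so
# the value can sit inside a '...' SQL string literal.
sql_quote() {
  local q="'"
  # Replace each ' in the argument with '' (standard SQL escaping).
  printf '%s\n' "${1//$q/$q$q}"
}

ID="it's-an-example"
echo "select filename from dataset_filenames where dataset_id='$(sql_quote "$ID")';"
```

The resulting statement could then be piped to $SQLCMD in the same way as the queries in the diff.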
Modified: provenancedb/swift-about-execution
===================================================================
--- provenancedb/swift-about-execution 2009-03-16 13:19:45 UTC (rev 2691)
+++ provenancedb/swift-about-execution 2009-03-16 13:20:33 UTC (rev 2692)
@@ -1,17 +1,22 @@
#!/bin/bash
+PROVDIR=$(pwd)/$(dirname $0)/
+source $PROVDIR/etc/provenance.config
+
EXECUTEID=$1
echo About execution $EXECUTEID
-psql -p 5435 -d provdb -U benc -c "select * from executes where id='$EXECUTEID';"
+echo "select * from executes where id='$EXECUTEID';" | $SQLCMD
+echo
-
echo Name of SwiftScript procedure which invoked this:
-psql -p 5435 -d provdb -U benc --tuples-only -c "select procedure_name from invocation_procedure_names where execute_id='$EXECUTEID';"
+echo "select procedure_name from invocation_procedure_names where execute_id='$EXECUTEID';" | $SQLCMD
+echo
echo Input datasets:
-psql -p 5435 -d provdb -U benc -c "select dataset_id,param_name from dataset_usage where execute_id='$EXECUTEID' and direction='I';"
+echo "select dataset_id,param_name from dataset_usage where execute_id='$EXECUTEID' and direction='I';" | $SQLCMD
+echo
echo Output datasets:
-psql -p 5435 -d provdb -U benc -c "select dataset_id,param_name from dataset_usage where execute_id='$EXECUTEID' and direction='O';"
+echo "select dataset_id,param_name from dataset_usage where execute_id='$EXECUTEID' and direction='O';" | $SQLCMD
Modified: provenancedb/swift-about-filename
===================================================================
--- provenancedb/swift-about-filename 2009-03-16 13:19:45 UTC (rev 2691)
+++ provenancedb/swift-about-filename 2009-03-16 13:20:33 UTC (rev 2692)
@@ -1,10 +1,17 @@
#!/bin/bash
+PROVDIR=$(pwd)/$(dirname $0)/
+source $PROVDIR/etc/provenance.config
+
FILENAME="file://localhost/$1"
echo Dataset IDs for files that have name $FILENAME
-psql -p 5435 -d provdb -U benc --tuples-only -c "select dataset_id from dataset_filenames where filename='$FILENAME';"
+# when used with postgres, --tuples-only is useful here; but it doesn't
+# generalise
+echo "select dataset_id from dataset_filenames where filename='$FILENAME';" | $SQLCMD
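
A note on the PROVDIR=$(pwd)/$(dirname $0)/ line added to each script: it joins the current directory to the directory part of $0, so it yields a valid path only when the script is started through a relative path such as ./swift-about-filename. A common alternative, sketched here as an assumption rather than as part of this commit, is to cd into the script's directory and print the result of pwd. The function name `script_dir` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper, not part of the commit: print the absolute directory
# of the path given as $1, for relative and absolute paths alike.
script_dir() {
  # The subshell keeps the cd from changing the caller's working directory;
  # cd output is discarded in case CDPATH makes it print the new directory.
  (cd -- "$(dirname -- "$1")" >/dev/null && pwd)
}

# In one of the swift-about-* scripts this would replace the PROVDIR line.
PROVDIR=$(script_dir "$0")
echo "config would be read from $PROVDIR/etc/provenance.config"
```

Because the directory is resolved by cd rather than by string concatenation, the same line works whether the script is invoked as ./swift-about-filename or through an absolute path.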
+
+