[Swift-commit] r2633 - provenancedb

noreply at svn.ci.uchicago.edu noreply at svn.ci.uchicago.edu
Mon Mar 2 11:57:39 CST 2009


Author: benc
Date: 2009-03-02 11:57:38 -0600 (Mon, 02 Mar 2009)
New Revision: 2633

Added:
   provenancedb/drive-opm
   provenancedb/prov-to-opm.sh
Log:
A prototype of a log->OPM XML form converter using a variation of
Luc Moreau's XML schema.

Added: provenancedb/drive-opm
===================================================================
--- provenancedb/drive-opm	                        (rev 0)
+++ provenancedb/drive-opm	2009-03-02 17:57:38 UTC (rev 2633)
@@ -0,0 +1,13 @@
+#!/bin/bash
+
+# driver for log file -> OPM conversion
+
+filename=$1
+
+export PATH=$LOGPATH:$PATH
+
+echo log to OPM conversion for $filename
+
+./prepare-for-import "$filename"
+./prov-to-opm.sh "$filename"
+


Property changes on: provenancedb/drive-opm
___________________________________________________________________
Name: svn:executable
   + *
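
The driver above assumes that a log file name is passed as $1 and that LOGPATH is set in the environment. A minimal sketch of the same two steps with those assumptions checked up front; the function name drive_opm, the usage text and the return code 2 are hypothetical and not part of this commit.

```shell
#!/bin/bash

# Hypothetical variant of drive-opm, not part of the commit.
# Refuses to run without a log file name or without LOGPATH, and only
# runs the conversion if the preparation step succeeded.
drive_opm() {
  local filename=${1:-}

  if [ -z "$filename" ]; then
    echo "usage: drive-opm <logfile>" >&2
    return 2
  fi

  if [ -z "${LOGPATH:-}" ]; then
    echo "drive-opm: LOGPATH is not set" >&2
    return 2
  fi

  # Run in a subshell so the PATH change does not leak into the caller.
  (
    export PATH=$LOGPATH:$PATH
    echo "log to OPM conversion for $filename"
    ./prepare-for-import "$filename" && ./prov-to-opm.sh "$filename"
  )
}
```

Called as drive_opm "$1" from the script body, this stops before touching opm.xml when either precondition is missing.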

Added: provenancedb/prov-to-opm.sh
===================================================================
--- provenancedb/prov-to-opm.sh	                        (rev 0)
+++ provenancedb/prov-to-opm.sh	2009-03-02 17:57:38 UTC (rev 2633)
@@ -0,0 +1,85 @@
+#!/bin/bash
+
+echo Generating OPM for a single run
+
+rm -f opm.xml
+
+# TODO make swift-opm-ns into a proper URI
+echo "<opmGraph xmlns=\"http://openprovenance.org/model/v1.01.a\" xmlns:swift=\"swift-opm-ns\">" > opm.xml
+
+echo "<accounts><account id=\"base\" /></accounts>" >> opm.xml
+
+echo "<processes>" >> opm.xml
+
+while read time duration thread endstate app scratch; do
+
+echo "  <process id=\"$thread\">"
+echo "    <account id=\"base\" />"
+echo "    <swift:info starttime=\"$time\" duration=\"$duration\" endstate=\"$endstate\" app=\"$app\" scratch=\"$scratch\"/>"
+# TODO no value here - this is some URI into an ontology, and I don't
+# really know how it should be mapped from Swift
+echo "  </process>"
+
+done < $LOGDIR/execute.global.event >> opm.xml
+
+echo "</processes>" >> opm.xml
+
+# TODO artifacts
+
+echo "<artifacts>" >> opm.xml
+
+# we need a list of all artifacts here. for now, take everything we can
+# find in the tie-data-invocs and containment tables, uniquefied.
+# This is probably the wrong thing to do.
+
+while read outer inner; do
+  echo $outer
+  echo $inner
+done < $LOGDIR/tie-containers.txt > tmp-dshandles.txt
+
+while read t d dataset rest ; do
+  echo $dataset
+done < $LOGDIR/tie-data-invocs.txt >> tmp-dshandles.txt
+
+sort -u tmp-dshandles.txt > tmp-dshandles2.txt
+
+while read artifact ; do
+echo "  <artifact id=\"$artifact\">"
+echo "    <account id=\"base\" />"
+echo "  </artifact>"
+done < tmp-dshandles2.txt >> opm.xml
+
+echo "</artifacts>" >> opm.xml
+
+
+echo "<causalDependencies>" >> opm.xml
+
+# other stuff can do this in any order, but here we must probably do it
+# in two passes, one for each relation, in order to satisfy schema.
+# but for now do it in a single pass...
+
+while read thread direction dataset variable rest; do 
+  if [ "$direction" == "input" ] ; then
+    echo "  <used>"
+    echo "    <effect id=\"$thread\" />"
+    echo "    <role value=\"$variable\" />"
+    echo "    <cause id=\"$dataset\" />"
+    echo "    <account id=\"base\" />"
+    echo "  </used>"
+  else
+    echo "  <wasGeneratedBy>"
+    echo "    <effect id=\"$dataset\" />"
+    echo "    <role value=\"$variable\" />"
+    echo "    <cause id=\"$thread\" />"
+    echo "    <account id=\"base\" />"
+    echo "  </wasGeneratedBy>"
+  fi
+done < $LOGDIR/tie-data-invocs.txt >> opm.xml
+
+
+
+echo "</causalDependencies>" >> opm.xml
+
+echo "</opmGraph>" >> opm.xml
+echo Finished generating OPM, in opm.xml
+


Property changes on: provenancedb/prov-to-opm.sh
___________________________________________________________________
Name: svn:executable
   + *
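
The comment in prov-to-opm.sh notes that the causal dependencies probably need to be written in two passes, one per relation, to satisfy the schema, while the committed loop writes them in a single pass. A minimal sketch of the two-pass variant follows. It assumes the same whitespace-separated columns the committed loop reads from tie-data-invocs.txt (thread, direction, dataset, variable). The function names emit_relation and emit_causal_dependencies are hypothetical, and the order of the passes (used before wasGeneratedBy) is an assumption that should be checked against the schema.

```shell
#!/bin/bash

# Hypothetical helper, not part of the commit.
# Reads rows of "thread direction dataset variable ..." on stdin and emits
# only the relation named in $1 ("used" or "wasGeneratedBy").
emit_relation() {
  local relation=$1
  local thread direction dataset variable rest effect cause

  while read -r thread direction dataset variable rest; do
    if [ "$direction" = "input" ]; then
      # An input row is a <used> edge: the process depends on the dataset.
      [ "$relation" = "used" ] || continue
      effect=$thread
      cause=$dataset
    else
      # Any other row is treated as an output, as in the committed loop.
      [ "$relation" = "wasGeneratedBy" ] || continue
      effect=$dataset
      cause=$thread
    fi

    echo "  <$relation>"
    echo "    <effect id=\"$effect\" />"
    echo "    <role value=\"$variable\" />"
    echo "    <cause id=\"$cause\" />"
    echo "    <account id=\"base\" />"
    echo "  </$relation>"
  done
}

# Two passes over the file named in $1: every <used> element first,
# then every <wasGeneratedBy> element (assumed order, see above).
emit_causal_dependencies() {
  emit_relation used < "$1"
  emit_relation wasGeneratedBy < "$1"
}
```

In the script this would replace the single loop with emit_causal_dependencies "$LOGDIR/tie-data-invocs.txt" >> opm.xml, at the cost of reading the file twice.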



