[Swift-commit] r2827 - usertools/cio/science/blast

noreply at svn.ci.uchicago.edu
Sun Apr 5 00:12:44 CDT 2009


Author: aespinosa
Date: 2009-04-05 00:12:43 -0500 (Sun, 05 Apr 2009)
New Revision: 2827

Added:
   usertools/cio/science/blast/measure-runblast.sh
   usertools/cio/science/blast/readseq.rb
   usertools/cio/science/blast/runblast.sh
Log:
Initial commit of test framework

Added: usertools/cio/science/blast/measure-runblast.sh
===================================================================
--- usertools/cio/science/blast/measure-runblast.sh	                        (rev 0)
+++ usertools/cio/science/blast/measure-runblast.sh	2009-04-05 05:12:43 UTC (rev 2827)
@@ -0,0 +1 @@
+./swiftblast.sh - UNIPROT_for_blast_14.0.seq 100001 100010


Property changes on: usertools/cio/science/blast/measure-runblast.sh
___________________________________________________________________
Name: svn:executable
   + *

Added: usertools/cio/science/blast/readseq.rb
===================================================================
--- usertools/cio/science/blast/readseq.rb	                        (rev 0)
+++ usertools/cio/science/blast/readseq.rb	2009-04-05 05:12:43 UTC (rev 2827)
@@ -0,0 +1,33 @@
+#!/usr/bin/env ruby
+#
+# Script: readseq.rb
+# Description: Parses a BLAST FASTA file and dumps each sequence to a
+#              file.
+
+require 'fileutils'
+
+
+fasta_db  = File.new(ARGV[0])
+seq_start = ARGV[1].to_i
+seq_end   = ARGV[2].to_i
+prefix    = ARGV[3]            # Output dir prefix
+pad       = ARGV[4]            # Limit files per directory? "l"
+
+while true
+  x = fasta_db.readline("\n>").sub(/>$/, "")
+  if fasta_db.eof or fasta_db.lineno > seq_end
+    break
+  end
+  x =~ />(.*)\n/
+  seqname = $1
+  if seq_start <= fasta_db.lineno
+    dir = pad == "l" ? prefix +
+          sprintf("/%04d", fasta_db.lineno / 1000) : prefix
+    fname = sprintf "SEQ%07d_%s.qry", fasta_db.lineno, seqname
+    FileUtils.mkdir_p dir
+    file = File.new("#{dir}/#{fname}", "w")
+    file << x
+    file.close
+  end
+  fasta_db.ungetc ?>
+end
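
Note on readseq.rb: the loop above splits the database on the "\n>" separator, numbers each record by read count, and writes records in the requested range to SEQ<number>_<name>.qry, optionally bucketed into one directory per 1000 sequences. As written, it appears to break as soon as EOF is reached, so the final record of the file would not be written. The following is a minimal sketch of the same splitting idea that also emits the last record. It is not part of this commit. The helper names (each_fasta_record, safe_name, query_path, split_fasta) are hypothetical, and reducing the header to its first token with a restricted character set is an assumption of this sketch; the committed script uses the whole header line in the file name.

```ruby
require 'fileutils'
require 'stringio'

# Yields index (1-based), header name, and full record text for every
# record in a FASTA stream, including the final one.
def each_fasta_record(io)
  index = 0
  io.each_line("\n>") do |chunk|
    # Splitting on "\n>" leaves a trailing ">" on every chunk but the last
    # and removes the leading ">" from every chunk but the first.
    body = chunk.chomp(">")
    next if body.strip.empty?
    body = ">" + body unless body.start_with?(">")
    index += 1
    yield index, body[/\A>(.*)$/, 1], body
  end
end

# Assumption: keep only the first header token and replace characters
# that are awkward in file names.
def safe_name(name)
  name.to_s.split.first.to_s.gsub(/[^A-Za-z0-9._-]/, "_")
end

# Same layout as readseq.rb: optional "%04d" bucket of index / 1000,
# then "SEQ%07d_<name>.qry".
def query_path(prefix, index, name, bucketed)
  dir = bucketed ? format("%s/%04d", prefix, index / 1000) : prefix
  File.join(dir, format("SEQ%07d_%s.qry", index, name))
end

# Writes records first..last (inclusive) below prefix and returns the
# list of paths written.
def split_fasta(io, first, last, prefix, bucketed: false)
  written = []
  each_fasta_record(io) do |index, name, text|
    next if index < first
    break if index > last
    path = query_path(prefix, index, safe_name(name), bucketed)
    FileUtils.mkdir_p(File.dirname(path))
    File.write(path, text)
    written << path
  end
  written
end
```

A call such as File.open(db) { |f| split_fasta(f, 100001, 100010, "seqs", bucketed: true) } would mirror the range used in measure-runblast.sh, assuming db names the FASTA file.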

Added: usertools/cio/science/blast/runblast.sh
===================================================================
--- usertools/cio/science/blast/runblast.sh	                        (rev 0)
+++ usertools/cio/science/blast/runblast.sh	2009-04-05 05:12:43 UTC (rev 2827)
@@ -0,0 +1,37 @@
+#!/bin/bash
+
+# Script: swiftblast.sh [blast_db] [start_seq] [end_seq] [outdir]
+#         Invokes the swift workflow of reciprocal blast
+
+if [ $# -lt 3 ]; then
+  cat << EOF
+ERROR   :  too few arguments
+Usage   :  $0 [blast_db] [start_seq] [end_seq] [outdir]
+Example :  Reciprocal blast on the 100th to 200th sequences
+of UNIPROT.seq and dump results to run_100:
+$0 UNIPROT.seq 100 200 run_100
+EOF
+fi
+
+FALKON_ID=$1
+BLAST_DB=$2
+START_SEQ=$3
+END_SEQ=$4
+
+# Limit to 1k sequences per dir "l"
+LIMIT=$5
+
+BLASTROOT=$CIOROOT/science/blast
+
+# Extract sequences
+ruby readseq.rb $BLAST_DB $START_SEQ $END_SEQ seqs
+
+if [ $FALKON_ID != "-" ]; then # Real Falkon job
+  sleep 0 
+  # Regenerate tc.data and sites.xml @BGP@
+else
+  sleep 0 
+fi
+
+# Run swift:
+#swift -sites.file sites.xml -tc.file tc.data $BLASTROOT/blast.swift


Property changes on: usertools/cio/science/blast/runblast.sh
___________________________________________________________________
Name: svn:executable
   + *
