[Swift-user] scripts and tools to visualize swift-info timings

Allan Espinosa aespinosa at cs.uchicago.edu
Fri Dec 4 17:53:09 CST 2009


scriptname: info2csv.rb
#!/usr/bin/env ruby

require 'time'
require 'csv'

outfile = File.open("/tmp/tmp.csv", "w")
outfile.puts "JOB,LOG_START,LOG_START,CREATE_JOBDIR,CREATE_INPUTDIR,LINK_INPUTS,EXECUTE,EXECUTE_DONE,MOVING_OUTPUTS,RM_JOBDIR,END_TIME"
min = Float::MAX
Dir.glob("*/*info").each do |info|
  job = File.basename(info)
  outfile.print job
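  prev = 0  # first difference is then the absolute LOG_START time in epoch seconds
  # File.readlines closes the file once the lines are read
  File.readlines(info).grep(/Progress/).each do |field|
    field.chomp!
    time = Time.parse(field[10..43])  # timestamp sits at fixed columns 10..43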
    diff = (time - prev).to_f
    outfile.print ",#{diff}"
    prev = time
    min = [min, diff].min if field[46..-1] == "LOG_START"
  end
  outfile.print "\n"
end
outfile.close

# Do some spreadsheet flexing
# CSV.foreach yields each row of the file as an array of strings
CSV.foreach("/tmp/tmp.csv") do |row|
  row[1] = (row[1].to_f - min).to_s if row[1] != "LOG_START"
  sum = 0.0
  row[1..-1].each {|i| sum += i.to_f} if row[1] != "LOG_START"
  row << sum.to_s if row[1] != "LOG_START"
  puts row.join(",")
end
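
To make the timing arithmetic easier to follow, here is a minimal sketch of
how the per-stage durations come out of consecutive Progress lines.  The
sample lines and the helper name stage_durations are made up for
illustration; the line layout is only an assumption inferred from the fixed
offsets the script uses (timestamp at columns 10..43, stage name from column
46).  Unlike the script, the sketch reports 0.0 for the first stage instead
of the absolute start time.

```ruby
require 'time'

# Hypothetical sample lines; the layout is an assumption inferred from the
# offsets used in info2csv.rb, not copied from a real info file.
SAMPLE = [
  "Progress  2009-12-04 17:53:09.000000000-0600  LOG_START",
  "Progress  2009-12-04 17:53:09.500000000-0600  CREATE_JOBDIR",
  "Progress  2009-12-04 17:53:11.000000000-0600  EXECUTE",
]

# Return [stage, seconds since the previous Progress line] pairs.
def stage_durations(lines)
  prev = nil
  lines.grep(/Progress/).map do |line|
    time    = Time.parse(line[10..43])   # same slice as the script
    stage   = line[46..-1].strip
    elapsed = prev ? time - prev : 0.0   # Time - Time gives seconds as a Float
    prev    = time
    [stage, elapsed]
  end
end

stage_durations(SAMPLE).each { |stage, secs| puts "#{stage}: #{secs}" }
```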

Usage:
1.  Go to your $WORKDIR/info directory.
2.  Run the script: $ info2csv.rb > time.csv
3.  Upload time.csv to Google Spreadsheets.
4.  Paste the spreadsheet's sharing URL into the form at
http://www.ci.uchicago.edu/~aespinosa/swift/info_times.html.  The page
includes a sample spreadsheet from a run of a loosely coupled NCBI BLAST
workflow.

Notes: info_times.html plots only the first 100 jobs, sorted by
finish time.  You can make your own visualization by manually
editing the Google query in a personal copy of the script.
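
If you would rather do the selection before uploading, the same "first 100
by finish time" cut can be sketched locally in Ruby.  This is not the query
the page uses; it is an approximation that assumes the last column of
time.csv (the total appended by info2csv.rb, i.e. start offset plus all
stage durations) can stand in for the finish time.  The helper name
earliest_finishers is hypothetical.

```ruby
require 'csv'

# Keep the header and the `limit` rows with the smallest value in the last
# column, which info2csv.rb fills with the per-job total.
def earliest_finishers(rows, limit = 100)
  header, *jobs = rows
  [header] + jobs.sort_by { |row| row.last.to_f }.first(limit)
end

# Assumes time.csv was produced as in the usage steps and is in the
# current directory; nothing happens if the file is missing.
if File.exist?("time.csv")
  earliest_finishers(CSV.read("time.csv")).each { |row| puts row.join(",") }
end
```

Changing limit, or the column used in sort_by, gives a different slice of
jobs to upload.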

Enjoy!
-Allan

-- 
Allan M. Espinosa <http://allan.88-mph.net/blog>
PhD student, Computer Science
University of Chicago <http://people.cs.uchicago.edu/~aespinosa>


