[Swift-user] scripts and tools to visualize swift-info timings
Allan Espinosa
aespinosa at cs.uchicago.edu
Fri Dec 4 17:53:09 CST 2009
scriptname: info2csv.rb
#!/usr/bin/env ruby
require 'time'
require 'csv'
outfile = File.open("/tmp/tmp.csv", "w")
outfile.puts "JOB,LOG_START,LOG_START,CREATE_JOBDIR,CREATE_INPUTDIR,LINK_INPUTS,EXECUTE,EXECUTE_DONE,MOVING_OUTPUTS,RM_JOBDIR,END_TIME"
min = Float::MAX
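# One CSV row per info file: the job name, then the time spent in each stage
Dir.glob("*/*info").each do |info|
  job = File.basename(info)
  outfile.print job
  prev = 0
  File.readlines(info).grep(/Progress/).each do |field|
    field.chomp!
    # Columns 10..43 hold the timestamp; the stage name starts at column 46
    time = Time.parse(field[10..43])
    # prev starts at 0, so the first value is the absolute epoch time
    diff = (time - prev).to_f
    outfile.print ",#{diff}"
    prev = time
    # Track the earliest LOG_START across all jobs
    min = [min, diff].min if field[46..-1] == "LOG_START"
  end
  outfile.print "\n"
end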
outfile.close
# Do some spreadsheet flexing
CSV.foreach("/tmp/tmp.csv") do |row|
  if row[1] != "LOG_START"  # leave the header row untouched
    # Make the start time relative to the earliest LOG_START
    row[1] = (row[1].to_f - min).to_s
    # The sum of the start offset and all stage durations is the end time
    sum = 0.0
    row[1..-1].each { |i| sum += i.to_f }
    row << sum.to_s
  end
  puts row.join(",")
end
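For readers unsure about the fixed offsets, here is a minimal sketch of how one Progress line is split. The sample line is hypothetical; its layout is only inferred from the slices field[10..43] and field[46..-1] used in the script, so check it against your own info files. parse_progress is an illustrative helper name, not part of info2csv.rb.

```ruby
require 'time'

# Hypothetical sample line; the layout is inferred from the slices
# used in info2csv.rb, not taken from a real info file.
SAMPLE = "Progress  2009-12-04 17:53:09.123456789-0600  LOG_START"

# Split a Progress line into [Time, stage name] using the same fixed offsets.
def parse_progress(line)
  line = line.chomp
  [Time.parse(line[10..43]), line[46..-1]]
end

time, stage = parse_progress(SAMPLE)
puts "#{stage} at #{time.to_f}"
```

If your info files use a different timestamp width, the two slice ranges are the only places that would need to change.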
Usage:
1. Go to your $WORKDIR/info directory.
2. Run the script: $ info2csv.rb > time.csv
3. Upload time.csv to Google Spreadsheets.
4. Paste the spreadsheet's sharing URL into the form at
http://www.ci.uchicago.edu/~aespinosa/swift/info_times.html. The page
includes a sample spreadsheet from a run of a loosely coupled NCBI BLAST
workflow.
Notes: info_times.html plots only the first 100 jobs, sorted by the
time each one finished. You can make your own visualization by manually
editing the Google query in a personal copy of the script.
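To preview which jobs would make that cut before uploading, the same selection can be approximated locally. This is only a sketch: it assumes the last column of the CSV is the END_TIME value that info2csv.rb appends, and first_finished is a hypothetical helper, not something the page or the script provides.

```ruby
require 'csv'

# Return the header plus the first `limit` job rows, ordered by the last
# column (END_TIME in info2csv.rb's output).
def first_finished(rows, limit = 100)
  header, *jobs = rows
  [header] + jobs.sort_by { |r| r.last.to_f }.first(limit)
end

# Hypothetical miniature input; real input would be CSV.read("time.csv").
rows = CSV.parse("JOB,LOG_START,EXECUTE,END_TIME\nb-info,2.0,3.0,5.0\na-info,0.0,1.5,1.5\n")
first_finished(rows, 1).each { |r| puts r.join(",") }
```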
Enjoy!
-Allan
--
Allan M. Espinosa <http://allan.88-mph.net/blog>
PhD student, Computer Science
University of Chicago <http://people.cs.uchicago.edu/~aespinosa>