[Swift-devel] Formatting kickstart records

Michael Wilde wilde at mcs.anl.gov
Mon Sep 10 16:13:02 CDT 2007


Brian, Suchandra,

In the Swift workflow system we use a tool called "kickstart" (developed 
as part of the GriPhyN Virtual Data System, VDS) to launch programs 
on Grid sites and record their execution environment, exit status and 
resource consumption as an XML document. Kickstart is a simple utility 
that runs the real app as a child process and records its info; it usually 
runs under GRAM.

We're now developing some tools to flexibly format and query this 
info, as it's very useful in troubleshooting.
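
As a rough illustration of the kind of query we have in mind, here is a 
minimal Python sketch (not one of our actual tools) that reads one-line 
summaries in the key=value form shown in the quoted message below and 
reports the min, max, mean and median duration, flagging slow records. 
The function names and the "more than twice the median" outlier rule are 
assumptions made up for the example.

```python
import re
import statistics

# Matches key=value tokens such as "duration=83.993". Keys are lower-cased
# because the sample line mixes "Start" with lower-case keys.
TOKEN = re.compile(r"(\w+)=(\S+)")


def parse_line(line):
    """Turn one formatted record line into a dict of string fields."""
    return {key.lower(): value for key, value in TOKEN.findall(line)}


def summarize(lines, factor=2.0):
    """Summarize durations and list records slower than factor * median.

    The factor-times-median rule is an arbitrary illustrative threshold,
    not something taken from kickstart or Swift.
    Returns None when no line carries a duration field.
    """
    records = [parse_line(line) for line in lines if "duration=" in line]
    durations = [float(record["duration"]) for record in records]
    if not durations:
        return None
    median = statistics.median(durations)
    outliers = [r for r in records if float(r["duration"]) > factor * median]
    return {
        "count": len(durations),
        "min": min(durations),
        "max": max(durations),
        "mean": statistics.mean(durations),
        "median": median,
        "outliers": outliers,
    }
```

Given the lines printed by the formatting script, summarize(lines) 
returns a dict whose "outliers" entry holds the parsed fields (host, 
start time and so on) of each slow record.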

Ian suggests below that we should integrate this mechanism with a 
CEDPS-defined log format. I've looked at the CEDPS wiki, but have not yet 
found where this format is defined or how it relates to the myriad 
sources of Grid log info.

Could you point us to some information on this format and how you 
envision that tools like kickstart would integrate with it?

Any thoughts you have on how we should approach logging in Swift would 
be welcome as well.

For reference: Swift is described at www.ci.uchicago.edu/swift.
A sample kickstart record is at:
http://www.ci.uchicago.edu/~wilde/runs/uc.teragrid/runs/run23/kickstart/angle4-0aac5zgi-kickstart.xml
A paper on kickstart is at:
http://www.ci.uchicago.edu/swift/papers/Kickstarting2006.pdf

Thanks,

Mike


Ian Foster wrote:
> Is this format in the "standard log format" that CEDPS defined? I would 
> think it should be.
> 
> Michael Wilde wrote:
>> I started a small perl script to format kickstart records.
>>
>> For now, for each kickstart output file on the cmd line it prints 
>> one-liners of the form:
>>
>> Start=2007-09-09T00:24:43.161-05:00 duration=83.993 user=82.470 
>> sys=1.360 machine=i686 host=tg-v050.uc.teragrid.org
>>
>> but it's pretty generalizable.
>>
>> I did this to find the min, max and stats on run times, looking for 
>> outliers that are holding up the workflow.
>>
>> If anyone has a similar/better tool please point it out, otherwise 
>> I'll continue to enhance this. Suggestions for a better approach are 
>> welcome.
>>
>> Pavel (summer student) did something like this in C a while back; I 
>> thought that was added to vds/contrib at that time but I don't see it 
>> in the latest VDS release.  Need to hunt it down.
>>
>> This perl script is very simple and easy, but rather slow on 1000 
>> kickstart records (need to get timings). I'm sure it can be improved, 
>> possibly by grabbing multiple fields on each XPath call.
>>
>> - Mike
>>
>> $ cat ~/vds/kix
>> #!/usr/bin/perl -w -I/home/wilde/vds
>> #                  ^^^ How to best set the module path?
>>
>> # Print fields from a list of invocation record xml files
>>
>> use strict;
>> use XML::XPath;
>> use XML::XPath::XMLParser;
>>
>> while(@ARGV) {
>>   print_irec(shift @ARGV);
>> }
>>
>> sub print_irec
>> {
>>   my $irec = shift;
>>   my $xp = XML::XPath->new(filename => $irec);
>>
>>   my $start = $xp->findvalue('/invocation/mainjob/@start');
>>   my $utime = $xp->findvalue('/invocation/mainjob/usage/@utime');
>>   my $stime = $xp->findvalue('/invocation/mainjob/usage/@stime');
>>   my $duration = $xp->findvalue('/invocation/mainjob/@duration');
>>   my $machine = $xp->findvalue('/invocation/uname/@machine');
>>   my $host = $xp->findvalue('/invocation/@hostname');
>>
>>   # Emit one line per record; the string is split only to fit the mail width.
>>   print "Start=$start duration=$duration user=$utime sys=$stime "
>>       . "machine=$machine host=$host\n";
>> }
>>
>> _______________________________________________
>> Swift-devel mailing list
>> Swift-devel at ci.uchicago.edu
>> http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel
>>
> 
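
On the idea of grabbing several fields per call: one possible approach, 
sketched here in Python with the standard xml.etree.ElementTree module 
rather than XML::XPath, is to parse each record once and read the 
attributes straight off the mainjob, usage and uname elements. The 
element and attribute names come from the XPath expressions in the script 
above; the helper names are made up for the example, matching on local 
tag names is an assumption in case a record declares a default XML 
namespace, and whether this is actually faster would still need timing.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag, if present."""
    return tag.rsplit("}", 1)[-1]


def child(parent, name):
    """Return the first direct child whose local name matches, else None."""
    if parent is None:
        return None
    for element in parent:
        if local(element.tag) == name:
            return element
    return None


def attr(element, name):
    """Return an attribute value, or '' if the element or attribute is absent."""
    return "" if element is None else element.get(name, "")


def format_record(root):
    """Build the one-line summary from an already parsed <invocation> root."""
    mainjob = child(root, "mainjob")
    usage = child(mainjob, "usage")
    uname = child(root, "uname")
    return (
        f"Start={attr(mainjob, 'start')} duration={attr(mainjob, 'duration')} "
        f"user={attr(usage, 'utime')} sys={attr(usage, 'stime')} "
        f"machine={attr(uname, 'machine')} host={attr(root, 'hostname')}"
    )
```

For a record on disk, print(format_record(ET.parse(path).getroot())) 
should give the same kind of one-liner as the Perl script, with empty 
values for any field a record does not carry.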


