[Swift-devel] Formatting kickstart records

Brian Tierney bltierney at lbl.gov
Tue Sep 11 11:47:11 CDT 2007


Hi Mike:

The logging document is here:

http://www.cedps.net/wiki/index.php/LoggingBestPractices

and a paper describing the higher-level vision is here:

http://www.cedps.net/wiki/images/e/ec/Grid2007.pdf

It should be quite easy to convert your logs to the CEDPS format.
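
As a rough sketch of what such a conversion might look like, the
following reads one of the one-liners printed by the kix script quoted
below and re-emits it as a name=value line that leads with a timestamp
and an event name. The "ts" and "event" key names reflect my reading of
the logging document and should be checked against it; the event name
"swift.kickstart.mainjob" and the helper name to_bp_line are made up
for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Turn one kix-style one-liner into a name=value line led by ts= and event=.
# ASSUMPTION: the "ts" and "event" keys are how the target format names the
# timestamp and event fields; verify against the logging document.
sub to_bp_line {
    my ($line, $event) = @_;
    my %f = $line =~ /(\w+)=(\S+)/g;    # parse every name=value pair
    my $ts = delete $f{Start};          # kix prints the start time as Start=
    return unless defined $ts;          # skip lines without a timestamp
    my @rest = map { "$_=$f{$_}" } sort keys %f;   # remaining fields, by name
    return join ' ', "ts=$ts", "event=$event", @rest;
}

my $sample = 'Start=2007-09-09T00:24:43.161-05:00 duration=83.993 user=82.470 '
           . 'sys=1.360 machine=i686 host=tg-v050.uc.teragrid.org';

# The event name is hypothetical, chosen only for this example.
print to_bp_line($sample, 'swift.kickstart.mainjob'), "\n";
```

The remaining fields are sorted by name only to make the output
deterministic; nothing here depends on that order.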

Let's read each other's papers and then talk more.




Michael Wilde wrote:
> Brian, Suchandra,
> 
> In the Swift workflow system we use a tool (developed as part of the 
> GriPhyN Virtual Data System, VDS) called "kickstart" to launch programs 
> on Grid sites and record their execution environment, exit status and 
> resource consumption as an XML document. Kickstart is a simple utility 
> that runs the real app as a child process and records its info; it 
> usually runs under GRAM.
> 
> We're now developing some tools to flexibly format and query this 
> info, as it's very useful in troubleshooting.
> 
> Ian suggests below that we should integrate this mechanism with a 
> CEDPS-defined log format. I've looked at the CEDPS wiki, but have not yet 
> found where this format is defined and how it relates to the myriad 
> sources of grid log info.
> 
> Could you point us to some information on this format and how you 
> envision that tools like kickstart would integrate with it?
> 
> Any thoughts you have on how we should approach logging in Swift would 
> be welcome as well.
> 
> For reference: Swift is described at www.ci.uchicago.edu/swift;
> A sample kickstart record is at:
> http://www.ci.uchicago.edu/~wilde/runs/uc.teragrid/runs/run23/kickstart/angle4-0aac5zgi-kickstart.xml 
> 
> A paper on kickstart is at:
> http://www.ci.uchicago.edu/swift/papers/Kickstarting2006.pdf
> 
> Thanks,
> 
> Mike
> 
> 
> Ian Foster wrote:
>> Is this in the "standard log format" that CEDPS defined? I 
>> would think it should be.
>>
>> Michael Wilde wrote:
>>> I started a small perl script to format kickstart records.
>>>
>>> For now, for each kickstart output file on the cmd line it prints 
>>> one-liners of the form:
>>>
>>> Start=2007-09-09T00:24:43.161-05:00 duration=83.993 user=82.470 
>>> sys=1.360 machine=i686 host=tg-v050.uc.teragrid.org
>>>
>>> but it's pretty generalizable.
>>>
>>> I did this to find the min, max and stats on run times, looking for 
>>> outliers that are holding up the workflow.
>>>
>>> If anyone has a similar/better tool please point it out, otherwise 
>>> I'll continue to enhance this. Suggestions for a better approach are 
>>> welcome.
>>>
>>> Pavel (summer student) did something like this in C a while back; I 
>>> thought that was added to vds/contrib at that time but I don't see it 
>>> in the latest VDS release.  Need to hunt it down.
>>>
>>> This perl script is very simple and easy, but rather slow on 1000 
>>> kickstart records (need to get timings; I'm sure it can be improved, 
>>> possibly by grabbing multiple fields on each XPath call).
>>>
>>> - Mike
>>>
>>> $ cat ~/vds/kix
>>> #!/usr/bin/perl -w -I/home/wilde/vds
>>> #                  ^^^ How to best set the module path?
>>>
>>> # Print fields from a list of invocation record xml files
>>>
>>> use strict;
>>> use XML::XPath;
>>> use XML::XPath::XMLParser;
>>>
>>> while(@ARGV) {
>>>   print_irec(shift @ARGV);
>>> }
>>>
>>> sub print_irec
>>> {
>>>   my $irec = shift;
>>>   my $xp = XML::XPath->new(filename => $irec);
>>>
>>>   my $start = $xp->findvalue('/invocation/mainjob/@start');
>>>   my $utime = $xp->findvalue('/invocation/mainjob/usage/@utime');
>>>   my $stime = $xp->findvalue('/invocation/mainjob/usage/@stime');
>>>   my $duration = $xp->findvalue('/invocation/mainjob/@duration');
>>>   my $machine = $xp->findvalue('/invocation/uname/@machine');
>>>   my $host = $xp->findvalue('/invocation/@hostname');
>>>
>>>   print "Start=$start duration=$duration user=$utime sys=$stime "
>>>       . "machine=$machine host=$host\n";
>>> }
>>>
>>> _______________________________________________
>>> Swift-devel mailing list
>>> Swift-devel at ci.uchicago.edu
>>> http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel
>>>
>>
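
For the min/max/outlier summary mentioned in the quoted message, here
is a minimal sketch that works on the one-liners kix prints rather than
on the XML itself. It uses only core Perl. The function name summarize,
the "more than N times the median" rule for flagging slow runs, and all
sample lines after the first are assumptions made up for illustration;
only the first sample line comes from the message above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Summarize the duration= field of kix-style one-liners.
# ASSUMPTION: a run counts as "slow" when its duration exceeds
# $factor times the median; this rule is illustrative only.
sub summarize {
    my ($factor, @lines) = @_;
    my @recs;
    for my $line (@lines) {
        my %f = $line =~ /(\w+)=(\S+)/g;          # parse name=value pairs
        push @recs, \%f if defined $f{duration};  # keep lines with a duration
    }
    return unless @recs;

    my @d = sort { $a <=> $b } map { $_->{duration} } @recs;
    my $n = @d;
    my $median = $n % 2 ? $d[($n - 1) / 2]
                        : ($d[$n / 2 - 1] + $d[$n / 2]) / 2;
    my $sum = 0;
    $sum += $_ for @d;
    my @slow = grep { $_->{duration} > $factor * $median } @recs;

    return {
        n => $n, min => $d[0], max => $d[-1],
        mean => $sum / $n, median => $median, slow => \@slow,
    };
}

# First line is the example from the thread; the rest are made-up values.
my @sample = (
    'Start=2007-09-09T00:24:43.161-05:00 duration=83.993 user=82.470 '
        . 'sys=1.360 machine=i686 host=tg-v050.uc.teragrid.org',
    'duration=80.100 host=node-a.example.org',
    'duration=85.200 host=node-b.example.org',
    'duration=310.000 host=node-c.example.org',
);

my $s = summarize(2, @sample);
printf "n=%d min=%.3f max=%.3f mean=%.3f median=%.3f\n",
    @{$s}{qw(n min max mean median)};
print "slow: host=$_->{host} duration=$_->{duration}\n" for @{ $s->{slow} };
```

To run it on real data, replace @sample with lines read from the kix
output, for example by collecting <STDIN> into an array first.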

-- 
------------------------------------------------------------------------
   Brian L. Tierney,   Lawrence Berkeley National Laboratory (LBNL)
   1 Cyclotron Rd.  MS: 50B-2239,  Berkeley, CA  94720
   tel: 510-486-7381    fax: 510-495-2998   efax: 425-642-4558
   bltierney at lbl.gov   http://www-didc.lbl.gov/~tierney
------------------------------------------------------------------------


