[Swift-devel] Re: osg question: how to find sites' health

Daniel S. Katz dsk at ci.uchicago.edu
Tue May 3 09:53:50 CDT 2011


I think this would be a useful thing for Allan to work on under ExTENCI, after this week, and secondary to getting the SCEC code running and tested.

I'm also trying to get more ExTENCI funds, and a general-purpose tool might be a good deliverable for that.

Dan


On May 2, 2011, at 6:29 PM, Michael Wilde wrote:

> Ketan, let's discuss this more tomorrow and report our progress back to the list.
> 
> I think "health" is a hard term to define for grid sites. At any given time, each service on each site is either working or not.
> 
> - the sites file builder can do various checks
> - the checks need to be done under the user's cert to be meaningful
> - Swift needs to recover from what doesn't get caught by the sites file builder
> - clean reporting of errors helps OSG site admins catch and fix problems
> 
> I'd like to see Allan's tools merged/extended with a few others, packaged with Swift and documented and tested for Swift users.
> 
> Mike
> 
> 
> ----- Original Message -----
>> Hi Ketan,
>> 
>> Most of the time I just query the ReSS condor pool (condor_status
>> -pool engage-central.renci.org) and then look for the following ClassAds:
>> 
>> GlueCEInfoTotalCPUs
>> GlueCEInfo*Jobs* <= jobs running, total acceptable jobs, free cores,
>> etc.
>> 
>> The OSG monitoring webpages (Gratia, RSV) also have related
>> information.
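
A minimal sketch of the query described above, assuming the pool is reached with `condor_status -pool engage-central.renci.org -long` and that the `-long` output lists one `Attribute = value` pair per line with a blank line between ClassAds. The helper names (`parse_long_output`, `select_attributes`, `query_pool`) are hypothetical and are not part of Allan's scripts. The two attribute patterns are copied from the message as written; the exact Glue attribute names published by a given site may differ, so they are left as a parameter.

```python
import fnmatch
import subprocess

# Pool host and attribute patterns as given in the message above.
POOL = "engage-central.renci.org"
PATTERNS = ("GlueCEInfoTotalCPUs", "GlueCEInfo*Jobs*")


def parse_long_output(text):
    """Split `condor_status -long` text into one dict per ClassAd."""
    ads, current = [], {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # A blank line ends the current ClassAd.
            if current:
                ads.append(current)
                current = {}
            continue
        name, sep, value = line.partition("=")
        if not sep:
            continue  # not an "Attribute = value" line; skip it
        value = value.strip()
        # Naive unquoting: only handles a simple "quoted string" value.
        if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
            value = value[1:-1]
        current[name.strip()] = value
    if current:
        ads.append(current)
    return ads


def select_attributes(ad, patterns=PATTERNS):
    """Keep only the attributes whose names match one of the glob patterns."""
    return {
        name: value
        for name, value in ad.items()
        if any(fnmatch.fnmatchcase(name, p) for p in patterns)
    }


def query_pool(pool=POOL):
    """Run condor_status against the pool and return the parsed ClassAds."""
    result = subprocess.run(
        ["condor_status", "-pool", pool, "-long"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_long_output(result.stdout)
```

On a machine with the Condor client tools installed, something like `for ad in query_pool(): print(ad.get("Name"), select_attributes(ad))` would print the selected values per entry. All values are kept as raw strings, so any numeric comparison (free cores versus running jobs, say) would need an explicit conversion.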
>> 
>> 2011/5/2 Ketan Maheshwari <ketan at mcs.anl.gov>:
>>> Hi Allan,
>>> 
>>> I am trying to reuse the work on OSG that you did for ExTENCI, so I
>>> am using your scripts from allantools/..
>>> 
>>> A quick question about OSG: How do you find the health of
>>> participating sites?
>>> 
>>> On EGI we have the "lcg-infosites" series of commands
>>> that do this.
>>> 
>>> Thanks,
>>> Ketan
>>> 
>>> 
>>> 
>> 
>> 
>> 
>> --
>> Allan M. Espinosa <http://amespinosa.wordpress.com>
>> PhD student, Computer Science
>> University of Chicago <http://people.cs.uchicago.edu/~aespinosa>
>> _______________________________________________
>> Swift-devel mailing list
>> Swift-devel at ci.uchicago.edu
>> http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel
> 
> -- 
> Michael Wilde
> Computation Institute, University of Chicago
> Mathematics and Computer Science Division
> Argonne National Laboratory
> 

-- 
Daniel S. Katz
University of Chicago
(773) 834-7186 (voice)
(773) 834-6818 (fax)
d.katz at ieee.org or dsk at ci.uchicago.edu
http://www.ci.uchicago.edu/~dsk/






