[Swift-devel] Swift jobs on UC/ANL TG
joseph insley
insley at mcs.anl.gov
Mon Jan 28 17:15:41 CST 2008
Earlier today, tg-grid.uc.teragrid.org (the UC/ANL TG GRAM host)
became unresponsive and had to be rebooted. I am now seeing slow
response times from the gatekeeper there again. Authenticating to
the gatekeeper should take only a second or two, but it is
periodically taking up to 16 seconds:
insley at tg-viz-login1:~> time globusrun -a -r tg-grid.uc.teragrid.org
GRAM Authentication test successful
real 0m16.096s
user 0m0.060s
sys 0m0.020s
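To see how often the slow responses occur, the same check can be repeated in a loop. This is only a minimal sketch, assuming a POSIX shell; the `probe` function name is hypothetical and not part of any Globus tooling, and timing is limited to whole seconds because it relies on `date +%s`:

```shell
# Hypothetical helper (not a Globus tool): run a command N times and
# print the elapsed wall-clock time of each run in whole seconds.
probe() {
  probe_n=$1
  shift
  probe_i=0
  while [ "$probe_i" -lt "$probe_n" ]; do
    probe_start=$(date +%s)
    # Discard the command's own output; only the timing is of interest.
    "$@" >/dev/null 2>&1
    probe_end=$(date +%s)
    echo $((probe_end - probe_start))
    probe_i=$((probe_i + 1))
  done
}

# Example invocation (same authentication test as above, ten times):
#   probe 10 globusrun -a -r tg-grid.uc.teragrid.org
```

Each output line is the duration of one attempt, so occasional large values among mostly small ones would match the periodic slowness described above.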
The load on tg-grid is rather high:
top - 16:55:26 up 2:06, 1 user, load average: 89.59, 78.69, 62.92
Tasks: 398 total, 20 running, 378 sleeping, 0 stopped, 0 zombie
And there appear to be a large number of processes owned by kubal:
insley at tg-grid1:~> ps -ef | grep kubal | wc -l
380
I assume that Mike is using Swift to submit these jobs. Can the rate
at which jobs are submitted to the gatekeeper be throttled to lighten
this load? (Or has that already been done since earlier today?) The
current response times are not unacceptable, but I'm hoping to avoid
having the machine grind to a halt as it did earlier today.
Thanks,
joe.
===================================================
joseph a. insley
insley at mcs.anl.gov
mathematics & computer science division (630) 252-5649
argonne national laboratory (630) 252-5986 (fax)