[Swift-devel] Re: Swift jobs on UC/ANL TG
Ian Foster
foster at mcs.anl.gov
Sun Feb 3 21:12:08 CST 2008
Mihael:
Is there any chance you can try GRAM4, as was requested early last week?
Ian.
Mihael Hategan wrote:
> So I was trying some stuff on Friday night. I guess I've found the
> strategy on when to run the tests: when nobody else has jobs there
> (besides Buzz doing gridftp tests, Ioan having some Falkon workers
> running, and the occasional Inca tests).
>
> In any event, the machine jumps to about 100% utilization at around 130
> jobs with pre-ws gram. So Mike, please set throttle.score.job.factor to
> 1 in swift.properties.
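
For reference, a minimal sketch of that change, assuming swift.properties uses the usual Java-style key=value syntax. Only the property name and the value 1 come from the message above; the comments are illustrative.

```properties
# swift.properties (fragment)
# Value suggested above, after tg-grid1 reached about 100% utilization
# at around 130 jobs with pre-WS GRAM.
throttle.score.job.factor=1
```
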
>
> There's still more work I need to do test-wise.
>
> On Sun, 2008-02-03 at 15:34 -0600, Ti Leggett wrote:
>
>> Mike, you're killing tg-grid1 again. Can someone work with Mike to get
>> some swift settings that don't kill our server?
>>
>> On Jan 28, 2008, at 7:13 PM, Mike Kubal wrote:
>>
>>> Yes, I'm submitting molecular dynamics simulations
>>> using Swift.
>>>
>>> Is there a default wall-time limit for jobs on tg-uc?
>>>
>>> --- joseph insley <insley at mcs.anl.gov> wrote:
>>>
>>>> Actually, these numbers are now escalating...
>>>>
>>>> top - 17:18:54 up 2:29, 1 user, load average: 149.02, 123.63, 91.94
>>>> Tasks: 469 total, 4 running, 465 sleeping, 0 stopped, 0 zombie
>>>>
>>>> insley at tg-grid1:~> ps -ef | grep kubal | wc -l
>>>> 479
>>>>
>>>> insley at tg-viz-login1:~> time globusrun -a -r tg-grid.uc.teragrid.org
>>>> GRAM Authentication test successful
>>>> real 0m26.134s
>>>> user 0m0.090s
>>>> sys 0m0.010s
>>>>
>>>>
>>>> On Jan 28, 2008, at 5:15 PM, joseph insley wrote:
>>>>
>>>>
>>>>> Earlier today tg-grid.uc.teragrid.org (the UC/ANL TG GRAM host)
>>>>> became unresponsive and had to be rebooted. I am now seeing slow
>>>>> response times from the Gatekeeper there again. Authenticating to
>>>>> the gatekeeper should only take a second or two, but it is
>>>>> periodically taking up to 16 seconds:
>>>>>
>>>>> insley at tg-viz-login1:~> time globusrun -a -r tg-grid.uc.teragrid.org
>>>>> GRAM Authentication test successful
>>>>> real 0m16.096s
>>>>> user 0m0.060s
>>>>> sys 0m0.020s
>>>>>
>>>>> looking at the load on tg-grid, it is rather high:
>>>>>
>>>>> top - 16:55:26 up 2:06, 1 user, load average: 89.59, 78.69, 62.92
>>>>> Tasks: 398 total, 20 running, 378 sleeping, 0 stopped, 0 zombie
>>>>>
>>>>> And there appear to be a large number of processes owned by kubal:
>>>>>
>>>>> insley at tg-grid1:~> ps -ef | grep kubal | wc -l
>>>>> 380
>>>>>
>>>>> I assume that Mike is using swift to do the job submission. Is
>>>>> there some throttling of the rate at which jobs are submitted to
>>>>> the gatekeeper that could be done that would lighten this load
>>>>> some? (Or has that already been done since earlier today?) The
>>>>> current response times are not unacceptable, but I'm hoping to
>>>>> avoid having the machine grind to a halt as it did earlier today.
>>>>>
>>>>> Thanks,
>>>>> joe.
>>>>>
>>>>> ===================================================
>>>>> joseph a. insley                          insley at mcs.anl.gov
>>>>> mathematics & computer science division   (630) 252-5649
>>>>> argonne national laboratory               (630) 252-5986 (fax)
>>>>>
>>>> ===================================================
>>>> joseph a. insley                          insley at mcs.anl.gov
>>>> mathematics & computer science division   (630) 252-5649
>>>> argonne national laboratory               (630) 252-5986 (fax)
>>>>
>> _______________________________________________
>> Swift-devel mailing list
>> Swift-devel at ci.uchicago.edu
>> http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel
>>
>>