[Swift-devel] Re: Swift jobs on UC/ANL TG
joseph insley
insley at mcs.anl.gov
Tue Jan 29 09:20:53 CST 2008
Yes, the local project id is typically created by replacing the TG
prefix with UC in the TG project id.
I believe these jobs are in fact PBS jobs submitted to the scheduler
through GRAM. However, pre-WS GRAM forks off a jobmanager process on
the GRAM host for each job that is submitted, to keep track of its
state, etc., so the process count there grows with the number of jobs
in flight. This is a known limitation of pre-WS GRAM.
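Since each in-flight job costs one jobmanager process on the GRAM
host, capping the number of jobs in flight on the client side also
caps that process count. Below is a minimal Python sketch of the idea.
It is not Swift's actual throttling mechanism: SubmissionThrottle,
fake_job, and run_all are hypothetical names, and fake_job only sleeps
in place of a real GRAM submission.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class SubmissionThrottle:
    """Caps how many jobs may be in flight at once (hypothetical helper)."""

    def __init__(self, max_in_flight):
        self._slots = threading.BoundedSemaphore(max_in_flight)
        self._lock = threading.Lock()
        self.in_flight = 0
        self.peak = 0  # highest number of concurrent jobs observed

    def run(self, job, *args):
        # Block until a slot is free, so at most max_in_flight jobs run.
        with self._slots:
            with self._lock:
                self.in_flight += 1
                self.peak = max(self.peak, self.in_flight)
            try:
                return job(*args)
            finally:
                with self._lock:
                    self.in_flight -= 1


def fake_job(i):
    # Stand-in for submitting a job and waiting for it to finish.
    time.sleep(0.01)
    return i


def run_all(n_jobs, max_in_flight, workers=32):
    # Many client threads want to submit, but the throttle admits
    # only max_in_flight of them at a time.
    throttle = SubmissionThrottle(max_in_flight)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda i: throttle.run(fake_job, i),
                                range(n_jobs)))
    return results, throttle.peak
```

Calling run_all(50, 5) runs 50 stand-in jobs while never allowing more
than 5 at once. A suitable limit for a real gatekeeper would have to
be found by watching the load on the host.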
joe.
On Jan 29, 2008, at 9:01 AM, Mike Kubal wrote:
> Yes, I have a TG project. I should have updated my
> swift sites file after initial testing.
>
> Is it common for the TG project id to be the same as
> the local one, with just the 'TG-' and 'UC-' prefixes
> switched?
>
>
> --- Ti Leggett <leggett at mcs.anl.gov> wrote:
>
>> Also, it looks like you're using a local project and not a TG project.
>> We are not able to report this usage to NSF because it counts against
>> our discretionary usage. Do you have a TG project? If not, can you
>> request a DACC allocation to use?
>>
>> On Jan 28, 2008, at 7:13 PM, Mike Kubal wrote:
>>
>>> Yes, I'm submitting molecular dynamics simulations using Swift.
>>>
>>> Is there a default wall-time limit for jobs on tg-uc?
>>>
>>>
>>>
>>> --- joseph insley <insley at mcs.anl.gov> wrote:
>>>
>>>> Actually, these numbers are now escalating...
>>>>
>>>> top - 17:18:54 up 2:29, 1 user, load average: 149.02, 123.63, 91.94
>>>> Tasks: 469 total, 4 running, 465 sleeping, 0 stopped, 0 zombie
>>>>
>>>> insley@tg-grid1:~> ps -ef | grep kubal | wc -l
>>>> 479
>>>>
>>>> insley@tg-viz-login1:~> time globusrun -a -r tg-grid.uc.teragrid.org
>>>> GRAM Authentication test successful
>>>> real 0m26.134s
>>>> user 0m0.090s
>>>> sys 0m0.010s
>>>>
>>>>
>>>> On Jan 28, 2008, at 5:15 PM, joseph insley wrote:
>>>>
>>>>> Earlier today tg-grid.uc.teragrid.org (the UC/ANL TG GRAM host)
>>>>> became unresponsive and had to be rebooted. I am now seeing slow
>>>>> response times from the Gatekeeper there again. Authenticating to
>>>>> the gatekeeper should only take a second or two, but it is
>>>>> periodically taking up to 16 seconds:
>>>>>
>>>>> insley@tg-viz-login1:~> time globusrun -a -r tg-grid.uc.teragrid.org
>>>>> GRAM Authentication test successful
>>>>> real 0m16.096s
>>>>> user 0m0.060s
>>>>> sys 0m0.020s
>>>>>
>>>>> Looking at the load on tg-grid, it is rather high:
>>>>>
>>>>> top - 16:55:26 up 2:06, 1 user, load average: 89.59, 78.69, 62.92
>>>>> Tasks: 398 total, 20 running, 378 sleeping, 0 stopped, 0 zombie
>>>>>
>>>>> And there appear to be a large number of processes owned by kubal:
>>>>> insley@tg-grid1:~> ps -ef | grep kubal | wc -l
>>>>> 380
>>>>>
>>>>> I assume that Mike is using swift to do the job submission. Is
>>>>> there some throttling of the rate at which jobs are submitted to
>>>>> the gatekeeper that could be done that would lighten this load
>>>>> some? (Or has that already been done since earlier today?) The
>>>>> current response times are not unacceptable, but I'm hoping to
>>>>> avoid having the machine grind to a halt as it did earlier today.
>>>>>
>>>>> Thanks,
>>>>> joe.
>>>>>
>>>>>
>>>>>
>>>>> ===================================================
>>>>> joseph a. insley
>>>>> insley at mcs.anl.gov
>>>>> mathematics & computer science division (630) 252-5649
>>>>> argonne national laboratory (630) 252-5986 (fax)
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>>
>>>
>>
>
===================================================
joseph a. insley
insley at mcs.anl.gov
mathematics & computer science division (630) 252-5649
argonne national laboratory (630) 252-5986 (fax)