[Swift-devel] Re: Swift jobs on UC/ANL TG

Ti Leggett leggett at mcs.anl.gov
Tue Jan 29 10:01:00 CST 2008


I'm going to remove the local project mapping then.

On Jan 29, 2008, at 9:20 AM, joseph insley wrote:

> Yes, the local project id is typically created by swapping UC for TG  
> in the TG project id.
>
> I believe these jobs are in fact PBS jobs submitted to the scheduler  
> through gram.  However, pre-ws gram forks off a jobmanager process  
> for each job that is submitted, to keep track of state, etc.  This  
> is a known limitation of pre-ws gram.
>
> joe.
>
> On Jan 29, 2008, at 9:01 AM, Mike Kubal wrote:
>
>> Yes, I have a TG project. I should have updated my
>> swift sites file after initial testing.
>>
>> Is it common for the TG project id to be the same as
>> the local one, with just the 'TG-' and 'UC-' prefixes
>> switched?
>>
>>
>> --- Ti Leggett <leggett at mcs.anl.gov> wrote:
>>
>>> Also, it looks like you're using a local project and not a TG
>>> project. We are not able to report this usage to NSF because it
>>> counts against our discretionary usage. Do you have a TG project?
>>> If not, can you request a DACC allocation to use?
>>>
>>> On Jan 28, 2008, at 7:13 PM, Mike Kubal wrote:
>>>
>>>> Yes, I'm submitting molecular dynamics simulations
>>>> using Swift.
>>>>
>>>> Is there a default wall-time limit for jobs on tg-uc?
>>>>
>>>>
>>>>
>>>> --- joseph insley <insley at mcs.anl.gov> wrote:
>>>>
>>>>> Actually, these numbers are now escalating...
>>>>>
>>>>> top - 17:18:54 up  2:29,  1 user,  load average: 149.02, 123.63, 91.94
>>>>> Tasks: 469 total,   4 running, 465 sleeping,   0 stopped,   0 zombie
>>>>>
>>>>> insley at tg-grid1:~> ps -ef | grep kubal | wc -l
>>>>>     479
>>>>>
>>>>> insley at tg-viz-login1:~> time globusrun -a -r tg-grid.uc.teragrid.org
>>>>> GRAM Authentication test successful
>>>>> real    0m26.134s
>>>>> user    0m0.090s
>>>>> sys     0m0.010s
>>>>>
>>>>>
>>>>> On Jan 28, 2008, at 5:15 PM, joseph insley wrote:
>>>>>
>>>>>> Earlier today tg-grid.uc.teragrid.org (the UC/ANL TG GRAM host)
>>>>>> became unresponsive and had to be rebooted.  I am now seeing slow
>>>>>> response times from the Gatekeeper there again.  Authenticating to
>>>>>> the gatekeeper should only take a second or two, but it is
>>>>>> periodically taking up to 16 seconds:
>>>>>>
>>>>>> insley at tg-viz-login1:~> time globusrun -a -r tg-grid.uc.teragrid.org
>>>>>> GRAM Authentication test successful
>>>>>> real    0m16.096s
>>>>>> user    0m0.060s
>>>>>> sys     0m0.020s
>>>>>>
>>>>>> looking at the load on tg-grid, it is rather high:
>>>>>>
>>>>>> top - 16:55:26 up  2:06,  1 user,  load average: 89.59, 78.69, 62.92
>>>>>> Tasks: 398 total,  20 running, 378 sleeping,   0 stopped,   0 zombie
>>>>>>
>>>>>> And there appear to be a large number of processes owned by kubal:
>>>>>> insley at tg-grid1:~> ps -ef | grep kubal | wc -l
>>>>>>    380
>>>>>>
>>>>>> I assume that Mike is using swift to do the job submission.  Is
>>>>>> there some throttling of the rate at which jobs are submitted to
>>>>>> the gatekeeper that could be done that would lighten this load
>>>>>> some?  (Or has that already been done since earlier today?)  The
>>>>>> current response times are not unacceptable, but I'm hoping to
>>>>>> avoid having the machine grind to a halt as it did earlier today.
>>>>>>
>>>>>> Thanks,
>>>>>> joe.
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>> ===================================================
>>>>>> joseph a. insley                          insley at mcs.anl.gov
>>>>>> mathematics & computer science division   (630) 252-5649
>>>>>> argonne national laboratory               (630) 252-5986 (fax)
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> ===================================================
>>>>> joseph a. insley                          insley at mcs.anl.gov
>>>>> mathematics & computer science division   (630) 252-5649
>>>>> argonne national laboratory               (630) 252-5986 (fax)
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>>>
>>>
>>>
>>
>>
>>
>>        
>>
>
> ===================================================
> joseph a. insley                          insley at mcs.anl.gov
> mathematics & computer science division   (630) 252-5649
> argonne national laboratory               (630) 252-5986 (fax)
>
>

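For illustration only, the kind of client-side throttling Joe asks about could be sketched as below. This is not Swift's own throttle mechanism and it calls no Globus API. The class name SubmissionThrottle, its parameters, and the submit_fn callback are all hypothetical. The sketch assumes submit_fn blocks until the job has finished, so capping concurrent calls also caps the number of jobs (and therefore pre-ws gram jobmanager processes) alive on the gatekeeper at once.

```python
import threading
import time


class SubmissionThrottle:
    """Hypothetical helper: cap in-flight jobs and space out submissions."""

    def __init__(self, max_in_flight, min_interval_s):
        # At most max_in_flight calls to submit_fn may run at the same time.
        self._slots = threading.BoundedSemaphore(max_in_flight)
        # Minimum spacing between the start of two consecutive submissions.
        self._min_interval_s = min_interval_s
        self._lock = threading.Lock()
        self._next_start = 0.0

    def submit(self, submit_fn, job):
        """Run submit_fn(job) once a slot is free and the spacing allows it.

        submit_fn is an assumption: any callable that submits one job and
        blocks until that job is done.
        """
        with self._slots:
            # Reserve the next allowed start time while holding the lock,
            # then sleep outside the lock so other threads can reserve too.
            with self._lock:
                now = time.monotonic()
                start = max(now, self._next_start)
                self._next_start = start + self._min_interval_s
            delay = start - now
            if delay > 0:
                time.sleep(delay)
            return submit_fn(job)
```

A caller would create one throttle per gatekeeper, for example SubmissionThrottle(max_in_flight=20, min_interval_s=0.5), and route every submission through throttle.submit(). The values 20 and 0.5 are placeholders, not recommendations for tg-grid.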

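As a small aside on the project id convention discussed in the thread (the local id is "typically" the TG id with the prefix swapped), a minimal sketch follows. The function name local_project_id and the sample id are hypothetical, and since the convention is only described as typical, the result should be treated as a guess to verify rather than an authoritative mapping.

```python
def local_project_id(tg_project_id):
    """Guess the local (UC) project id from a TG project id.

    Assumes the convention from the thread: the same id with the
    'TG-' prefix replaced by 'UC-'. Not every project need follow it.
    """
    prefix = "TG-"
    if not tg_project_id.startswith(prefix):
        raise ValueError("not a TG project id: %r" % (tg_project_id,))
    return "UC-" + tg_project_id[len(prefix):]


# "TG-EXAMPLE123" is a made-up id used purely for illustration.
print(local_project_id("TG-EXAMPLE123"))
```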

