[Swift-devel] Re: Swift jobs on UC/ANL TG

Mike Kubal mikekubal at yahoo.com
Mon Feb 4 00:11:34 CST 2008


Sorry for killing the server. I'm pushing to get
results to guide the selection of compounds for
wet-lab testing.

I had set the throttle.score.job.factor to 1 in the
swift.properties file.
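
For reference, a minimal sketch of that line, assuming the usual
key=value properties syntax (the comment line is only illustrative):

```properties
# Throttle job submission so the GRAM host is not overloaded
throttle.score.job.factor=1
```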

I certainly appreciate everyone's efforts and
responsiveness.

Let me know what to try next, before I kill again.
 
Cheers,

Mike 



--- Mihael Hategan <hategan at mcs.anl.gov> wrote:

> So I was trying some stuff on Friday night. I guess I've found the
> strategy on when to run the tests: when nobody else has jobs there
> (besides Buzz doing gridftp tests, Ioan having some Falkon workers
> running, and the occasional Inca tests).
> 
> In any event, the machine jumps to about 100% utilization at around 130
> jobs with pre-ws gram. So Mike, please set throttle.score.job.factor to
> 1 in swift.properties.
> 
> There's still more work I need to do test-wise.
> 
> On Sun, 2008-02-03 at 15:34 -0600, Ti Leggett wrote:
> > Mike, you're killing tg-grid1 again. Can someone work with Mike to get
> > some swift settings that don't kill our server?
> > 
> > On Jan 28, 2008, at 7:13 PM, Mike Kubal wrote:
> > 
> > > Yes, I'm submitting molecular dynamics simulations
> > > using Swift.
> > >
> > > Is there a default wall-time limit for jobs on tg-uc?
> > >
> > >
> > >
> > > --- joseph insley <insley at mcs.anl.gov> wrote:
> > >
> > >> Actually, these numbers are now escalating...
> > >>
> > >> top - 17:18:54 up  2:29,  1 user,  load average: 149.02, 123.63, 91.94
> > >> Tasks: 469 total,   4 running, 465 sleeping,   0 stopped,   0 zombie
> > >>
> > >> insley at tg-grid1:~> ps -ef | grep kubal | wc -l
> > >>     479
> > >>
> > >> insley at tg-viz-login1:~> time globusrun -a -r tg-grid.uc.teragrid.org
> > >> GRAM Authentication test successful
> > >> real    0m26.134s
> > >> user    0m0.090s
> > >> sys     0m0.010s
> > >>
> > >>
> > >> On Jan 28, 2008, at 5:15 PM, joseph insley wrote:
> > >>
> > >>> Earlier today tg-grid.uc.teragrid.org (the UC/ANL TG GRAM host)
> > >>> became unresponsive and had to be rebooted.  I am now seeing slow
> > >>> response times from the Gatekeeper there again.  Authenticating to
> > >>> the gatekeeper should only take a second or two, but it is
> > >>> periodically taking up to 16 seconds:
> > >>>
> > >>> insley at tg-viz-login1:~> time globusrun -a -r tg-grid.uc.teragrid.org
> > >>> GRAM Authentication test successful
> > >>> real    0m16.096s
> > >>> user    0m0.060s
> > >>> sys     0m0.020s
> > >>>
> > >>> looking at the load on tg-grid, it is rather high:
> > >>>
> > >>> top - 16:55:26 up  2:06,  1 user,  load average: 89.59, 78.69, 62.92
> > >>> Tasks: 398 total,  20 running, 378 sleeping,   0 stopped,   0 zombie
> > >>>
> > >>> And there appear to be a large number of processes owned by kubal:
> > >>> insley at tg-grid1:~> ps -ef | grep kubal | wc -l
> > >>>    380
> > >>>
> > >>> I assume that Mike is using swift to do the job submission.  Is
> > >>> there some throttling of the rate at which jobs are submitted to
> > >>> the gatekeeper that could be done that would lighten this load
> > >>> some?  (Or has that already been done since earlier today?)  The
> > >>> current response times are not unacceptable, but I'm hoping to
> > >>> avoid having the machine grind to a halt as it did earlier today.
> > >>>
> > >>> Thanks,
> > >>> joe.
> > >>>
> > >>>
> > >>>
> > >>> ===================================================
> > >>> joseph a. insley                         insley at mcs.anl.gov
> > >>> mathematics & computer science division  (630) 252-5649
> > >>> argonne national laboratory              (630) 252-5986 (fax)
> > >>>
> > >>>
> > >>
> > >>
> > >> ===================================================
> > >> joseph a. insley                         insley at mcs.anl.gov
> > >> mathematics & computer science division  (630) 252-5649
> > >> argonne national laboratory              (630) 252-5986 (fax)
> > >>
> > >>
> > >>
> > >
> > >
> > >
> > >       
> > >
> > >
> > 
> > _______________________________________________
> > Swift-devel mailing list
> > Swift-devel at ci.uchicago.edu
> > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel
> > 
> 
> 
> 





