[Swift-user] Swift is stuck with 5K jobs

Michael Wilde wilde at mcs.anl.gov
Mon Mar 14 12:45:09 CDT 2011


Andriy, All,

On systems such as the TeraGrid hosts, where the login hosts are frequently heavily loaded, we should verify two things: that you can obtain a single interactive compute node via qsub -I on which to run the swift command (ideally under screen, to make re-attachment easy), and that from there Swift can run jobs using the Coaster-over-PBS provider configuration.

I suspect (and hope) that any cluster node on, say, abe, queenbee, and ranger can also run qsub and qstat. We should test and document that, but in the meantime, Andriy, can you try that approach? I *think* that it should be identical to running from a login host.
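
A rough sketch of that check, meant to be run on the compute node once qsub -I has granted one. The helper name missing_cmds, the resource values, and the screen session name are placeholders I made up for illustration, not anything site-specific; each site may want a different -l resource syntax.

```shell
#!/bin/sh
# Hypothetical helper: print every named command that is NOT on PATH.
# Returns 0 if all commands were found, 1 if at least one is missing.
missing_cmds() {
    rc=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "$cmd"
            rc=1
        fi
    done
    return $rc
}

# Intended sequence, left as comments because it needs a real PBS site:
#   qsub -I -l nodes=1,walltime=01:00:00   # on the login host; values are placeholders
#   missing_cmds qsub qstat                # on the compute node; no output means both exist
#   screen -S swift                        # then start swift inside the screen session
```

If missing_cmds prints nothing on the compute node, Swift should be able to submit and poll PBS jobs from there; if it prints qsub or qstat, that host cannot be used this way without further setup.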

What I want to avoid is causing too heavy a load on any login host and in the process getting Swift banned or having it associated with causing system problems.

Thanks and regards,

- Mike


----- Original Message -----
> On Mon, 2011-03-14 at 11:06 -0400, Andriy Fedorov wrote:
> > Am I hitting some limit? Is 5K jobs too much?
> 
> Shouldn't be, but if you have the coaster service running in local mode,
> that might do the trick.
> 
> >
> > How do I terminate swift now not to waste cycles of the head node?
> 
> kill -9 <pidOfJavaProcess>
> 
> 
> _______________________________________________
> Swift-user mailing list
> Swift-user at ci.uchicago.edu
> http://mail.ci.uchicago.edu/mailman/listinfo/swift-user

-- 
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory



