[Swift-user] Swift is stuck with 5K jobs

Mihael Hategan hategan at mcs.anl.gov
Mon Mar 14 12:26:53 CDT 2011


The whole log would probably help here.

Mihael

On Mon, 2011-03-14 at 11:28 -0400, Andriy Fedorov wrote:
> Thanks, Allan. Now I have a different exception:
> 
> class org.globus.cog.abstraction.impl.file.coaster.buffers.NIOChannelReadBuffer
> throws exception in doStuff. Fix it!
> java.lang.NullPointerException
> 	at org.globus.cog.abstraction.impl.file.coaster.commands.PutFileCommand.error(PutFileCommand.java:95)
> 	at org.globus.cog.abstraction.impl.file.coaster.buffers.ReadBuffer.error(ReadBuffer.java:79)
> 	at org.globus.cog.abstraction.impl.file.coaster.buffers.NIOChannelReadBuffer.doStuff(NIOChannelReadBuffer.java:42)
> 	at org.globus.cog.abstraction.impl.file.coaster.buffers.Buffers.run(Buffers.java:133)
> 
> 
> 
> On Mon, Mar 14, 2011 at 11:15, Allan Espinosa <aespinosa at cs.uchicago.edu> wrote:
> > Hello Andriy,
> >
> > The default package may have a small max heap limit.  Usually, I apply
> > this patch whenever I get a new version of Swift:
> >
> > --- old/bin/swift       2010-10-12 12:18:47.000000000 -0500
> > +++ new/bin/swift       2010-10-12 12:18:37.000000000 -0500
> > @@ -9,7 +9,7 @@
> >
> >  CYGWIN=
> >  CPDELIM=":"
> > -HEAPMAX=256M
> > +HEAPMAX=4096M
> >
> >  if echo `uname` | grep -i "cygwin"; then
> >   CYGWIN="yes"
> >
> >
> > Works well with 800K jobs.
> >
> > -Allan
> >
> > 2011/3/14 Andriy Fedorov <fedorov at bwh.harvard.edu>:
> >> Hi,
> >>
> >> I am using Swift with coasters on NCSA Abe, with the binary build of
> >> Swift 0.92. My script should generate about 5K individual jobs. When I
> >> try to run it, I get
> >>
> >> Swift svn swift-r4157 cog-r3056
> >>
> >> RunID: 20110314-0951-f3c45zja
> >> Progress:
> >> Exception in thread "Timer-0" java.lang.OutOfMemoryError: Java heap space
> >>
> >> Exception in thread "SIGINT handler"
> >> Exception in thread "SIGINT handler" Exception in thread "SIGTERM handler"
> >>
> >> After this error, I am not able to terminate the script, and
> >> apparently no jobs get scheduled to PBS.
> >>
> >> Am I hitting some limit? Are 5K jobs too many?
> >>
> >> How do I terminate Swift now so it does not waste cycles on the head node?
> >>
> >> Thanks
> >> --
> >> Andriy Fedorov, Ph.D.
> >>
> >> Research Fellow
> >> Brigham and Women's Hospital
> >> Harvard Medical School
> >> 75 Francis Street
> >> Boston, MA 02115 USA
> >> fedorov at bwh.harvard.edu
> >> (617) 525-6258 (office)
> >> _______________________________________________
> >> Swift-user mailing list
> >> Swift-user at ci.uchicago.edu
> >> http://mail.ci.uchicago.edu/mailman/listinfo/swift-user
> >>
> >>
> >
> >
> >
> > --
> > Allan M. Espinosa <http://amespinosa.wordpress.com>
> > PhD student, Computer Science
> > University of Chicago <http://people.cs.uchicago.edu/~aespinosa>
> >
> _______________________________________________
> Swift-user mailing list
> Swift-user at ci.uchicago.edu
> http://mail.ci.uchicago.edu/mailman/listinfo/swift-user
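
For reference, the HEAPMAX change from the patch quoted above can also be
made with sed instead of applying the diff. This is only a sketch: it edits
a temporary stand-in file (an assumption for illustration, not the real
bin/swift), so point it at your own copy once the result looks right.

```shell
# Hypothetical stand-in for bin/swift; only the HEAPMAX line matters here.
script="$(mktemp)"
printf 'CYGWIN=\nCPDELIM=":"\nHEAPMAX=256M\n' > "$script"

# Rewrite the HEAPMAX assignment in place, keeping a .bak copy of the original.
sed -i.bak 's/^HEAPMAX=.*/HEAPMAX=4096M/' "$script"

# Show the resulting assignment.
grep '^HEAPMAX=' "$script"
```

The .bak copy keeps the original value around in case 4096M is more than
the head node can spare.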