[Swift-devel] coasters about half the jobs

Michael Wilde wilde at mcs.anl.gov
Fri Feb 18 21:45:13 CST 2011


Just tried this on Beagle with a workload similar to the one that showed the original problem. I got:

Progress:  Stage in:2486  Submitting:14
Progress:  Stage in:1712  Submitting:787  Submitted:1
queuedsize > 0 but no job dequeued. Queued: {}
java.lang.Throwable
        at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.requeueNonFitting(BlockQueueProcessor.java:253)

Logs are in:

login1$ cat out.pdb.all.00
Swift svn swift-r4061 (swift modified locally) cog-r3052 (cog modified locally)

Output on stdout/err is below.

Thanks!

Mike

RunID: 20110218-2137-v87vupcc
Progress:
SwiftScript trace: 10gs-1
SwiftScript trace: 1a1u-1
SwiftScript trace: 1m3g-1
SwiftScript trace: 1a1x-1
SwiftScript trace: 1a1m-1
SwiftScript trace: 1a12-1
SwiftScript trace: 1m62-1
SwiftScript trace: 1a22-1
SwiftScript trace: 121p-1
SwiftScript trace: 1a4p-1
SwiftScript trace: 1m6b-1
SwiftScript trace: 1m7b-1
SwiftScript trace: 1m9i-1
SwiftScript trace: 1mi1-1
SwiftScript trace: 1m6b-2
SwiftScript trace: 1a22-2
SwiftScript trace: 1mfg-1
SwiftScript trace: 1m9j-1
SwiftScript trace: 1a1w-1
SwiftScript trace: 1mdi-1
SwiftScript trace: 1mq1-1
SwiftScript trace: 1mp1-1
SwiftScript trace: 1mq0-1
SwiftScript trace: 1mk3-1
SwiftScript trace: 1mj4-1
SwiftScript trace: 1mil-1
SwiftScript trace: 1mr1-1
SwiftScript trace: 1nbq-1
SwiftScript trace: 1mr8-1
SwiftScript trace: 1mr1-2
SwiftScript trace: 1n4m-2
SwiftScript trace: 1n83-1
SwiftScript trace: 1mm2-1
SwiftScript trace: 1nd7-1
SwiftScript trace: 1nm8-1
SwiftScript trace: 1n4m-3
SwiftScript trace: 1nfi-2
SwiftScript trace: 1nou-2
SwiftScript trace: 1nou-1
SwiftScript trace: 1nfi-1
SwiftScript trace: 1o5e-1
SwiftScript trace: 1o6u-2
SwiftScript trace: 1nty-1
SwiftScript trace: 1mx3-1
SwiftScript trace: 1n3u-2
SwiftScript trace: 1muz-1
SwiftScript trace: 1o86-1
SwiftScript trace: 1n3u-1
SwiftScript trace: 1oa8-1
SwiftScript trace: 1oc0-1
Progress:  uninitialized:3
Progress:  Initializing:1311  Selecting site:1189
Progress:  Selecting site:2499  Initializing site shared directory:1
Progress:  Selecting site:2340  Initializing site shared directory:1  Stage in:159
Progress:  Stage in:2486  Submitting:14
Progress:  Stage in:1712  Submitting:787  Submitted:1
queuedsize > 0 but no job dequeued. Queued: {}
java.lang.Throwable
        at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.requeueNonFitting(BlockQueueProcessor.java:253)
        at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.updatePlan(BlockQueueProcessor.java:521)
        at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.run(BlockQueueProcessor.java:109)
queuedsize > 0 but no job dequeued. Queued: {}
java.lang.Throwable
        at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.requeueNonFitting(BlockQueueProcessor.java:253)
        at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.updatePlan(BlockQueueProcessor.java:521)
        at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.run(BlockQueueProcessor.java:109)


Logs are on CT net in /home/wilde/mp/mp04:
   cp ftdock-20110218-2137-v87vupcc.log out.pdb.all.00 ~/mp/mp04/

- Mike



----- Original Message -----
> There was a bug in the block allocation scheme that would cause blocks
> to be kept, in the long run, at about half of what would normally be
> necessary. This included shutting down perfectly good blocks that could
> be used for jobs. The effect was more dramatic when the maximum block
> size was 1.
> 
> I committed a fix for this in the stable branch (cog r3052). If you've
> experienced the above, you could give this a try. It would also be
> helpful if you gave it a try anyway, just to check if things are going
> ok.
> 
> Mihael

-- 
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory



