[Swift-devel] coasters about half the jobs
Michael Wilde
wilde at mcs.anl.gov
Fri Feb 18 21:45:13 CST 2011
Just tried this on Beagle with a workload similar to the one that shows the original problem. I got:
Progress: Stage in:2486 Submitting:14
Progress: Stage in:1712 Submitting:787 Submitted:1
queuedsize > 0 but no job dequeued. Queued: {}
java.lang.Throwable
at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.requeueNonFitting(BlockQueueProcessor.java:253)
Logs are in:
login1$ cat out.pdb.all.00
Swift svn swift-r4061 (swift modified locally) cog-r3052 (cog modified locally)
Output on stdout/err is below.
Thanks!
Mike
RunID: 20110218-2137-v87vupcc
Progress:
SwiftScript trace: 10gs-1
SwiftScript trace: 1a1u-1
SwiftScript trace: 1m3g-1
SwiftScript trace: 1a1x-1
SwiftScript trace: 1a1m-1
SwiftScript trace: 1a12-1
SwiftScript trace: 1m62-1
SwiftScript trace: 1a22-1
SwiftScript trace: 121p-1
SwiftScript trace: 1a4p-1
SwiftScript trace: 1m6b-1
SwiftScript trace: 1m7b-1
SwiftScript trace: 1m9i-1
SwiftScript trace: 1mi1-1
SwiftScript trace: 1m6b-2
SwiftScript trace: 1a22-2
SwiftScript trace: 1mfg-1
SwiftScript trace: 1m9j-1
SwiftScript trace: 1a1w-1
SwiftScript trace: 1mdi-1
SwiftScript trace: 1mq1-1
SwiftScript trace: 1mp1-1
SwiftScript trace: 1mq0-1
SwiftScript trace: 1mk3-1
SwiftScript trace: 1mj4-1
SwiftScript trace: 1mil-1
SwiftScript trace: 1mr1-1
SwiftScript trace: 1nbq-1
SwiftScript trace: 1mr8-1
SwiftScript trace: 1mr1-2
SwiftScript trace: 1n4m-2
SwiftScript trace: 1n83-1
SwiftScript trace: 1mm2-1
SwiftScript trace: 1nd7-1
SwiftScript trace: 1nm8-1
SwiftScript trace: 1n4m-3
SwiftScript trace: 1nfi-2
SwiftScript trace: 1nou-2
SwiftScript trace: 1nou-1
SwiftScript trace: 1nfi-1
SwiftScript trace: 1o5e-1
SwiftScript trace: 1o6u-2
SwiftScript trace: 1nty-1
SwiftScript trace: 1mx3-1
SwiftScript trace: 1n3u-2
SwiftScript trace: 1muz-1
SwiftScript trace: 1o86-1
SwiftScript trace: 1n3u-1
SwiftScript trace: 1oa8-1
SwiftScript trace: 1oc0-1
Progress: uninitialized:3
Progress: Initializing:1311 Selecting site:1189
Progress: Selecting site:2499 Initializing site shared directory:1
Progress: Selecting site:2340 Initializing site shared directory:1 Stage in:159
Progress: Stage in:2486 Submitting:14
Progress: Stage in:1712 Submitting:787 Submitted:1
queuedsize > 0 but no job dequeued. Queued: {}
java.lang.Throwable
at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.requeueNonFitting(BlockQueueProcessor.java:253)
at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.updatePlan(BlockQueueProcessor.java:521)
at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.run(BlockQueueProcessor.java:109)
queuedsize > 0 but no job dequeued. Queued: {}
java.lang.Throwable
at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.requeueNonFitting(BlockQueueProcessor.java:253)
at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.updatePlan(BlockQueueProcessor.java:521)
at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.run(BlockQueueProcessor.java:109)
login1$ finger kelly
Logs are on CT net in /home/wilde/mp/mp04:
cp ftdock-20110218-2137-v87vupcc.log out.pdb.all.00 ~/mp/mp04/
- Mike
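For anyone scanning the stdout file rather than reading it by eye, here is a minimal sketch (not part of Swift or CoG) that tallies the last "Progress:" line and counts how often the "queuedsize > 0 but no job dequeued" warning appears. The function names parse_progress and summarize are made up for illustration, and the script assumes the output format is exactly what is shown above.

```python
import re
import sys

# Warning text copied from the output above.
WARNING = "queuedsize > 0 but no job dequeued"

# Matches "State name:count" pairs; state names may contain spaces,
# e.g. "Stage in:1712" or "Initializing site shared directory:1".
PAIR = re.compile(r"([A-Za-z][A-Za-z ]*?):(\d+)")


def parse_progress(line):
    """Return {state: count} for a 'Progress:' line, or None for other lines."""
    if not line.startswith("Progress:"):
        return None
    body = line[len("Progress:"):]
    return {name.strip(): int(count) for name, count in PAIR.findall(body)}


def summarize(lines):
    """Return (last non-empty progress dict, number of warning lines)."""
    last = None
    warnings = 0
    for raw in lines:
        line = raw.strip()
        if line.startswith(WARNING):
            warnings += 1
        progress = parse_progress(line)
        if progress:  # skip the bare "Progress:" line with no counts
            last = progress
    return last, warnings


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python summarize.py out.pdb.all.00
    with open(sys.argv[1]) as f:
        last, warnings = summarize(f)
    print("last progress:", last)
    print("warnings:", warnings)
```

Run against out.pdb.all.00, this reports the final job-state counts and the number of warnings, which makes it easier to compare runs before and after a fix.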
----- Original Message -----
> There was a bug in the block allocation scheme that would cause blocks
> to be kept, in the long run, at about half of what would normally be
> necessary. This included shutting down perfectly good blocks that
> could
> be used for jobs. The effect was more dramatic when the maximum block
> size was 1.
>
> I committed a fix for this in the stable branch (cog r3052). If you've
> experienced the above, you could give this a try. It would also be
> helpful if you gave it a try anyway, just to check if things are going
> ok.
>
> Mihael
>
--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory