Mihael,<br><br>Could you help me with this? I do not understand which task is being canceled, since the run has only just started submitting jobs.<br><br>Ketan<br><br><div class="gmail_quote">On Mon, Mar 28, 2011 at 12:14 PM, Mihael Hategan <span dir="ltr"><<a href="mailto:hategan@mcs.anl.gov">hategan@mcs.anl.gov</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">Good point. That behavior is pretty silly. One should be able to cancel<br>
a non-active task even if just for the reason that the coding involved<br>
in making sure that you only cancel an active task is unnecessarily<br>
complex.<br>
<div><div></div><div class="h5"><br>
On Mon, 2011-03-28 at 12:05 -0500, Ketan Maheshwari wrote:<br>
> Hi,<br>
><br>
> I checked out r3078. However, on beagle, I am getting another<br>
> exception: TaskSubmissionException, can only cancel an active task.<br>
><br>
> Attached is the logfile and following is the exception stacktrace:<br>
><br>
> ====<br>
><br>
> [ketan@login1:pbs.run]$ sh run.sh<br>
> Swift svn swift-r4225 cog-r3078<br>
><br>
> RunID: 20110328-1053-5q1fu8re<br>
> Progress: time:0<br>
> SwiftScript trace: 1y4m-1<br>
> SwiftScript trace: 2day-1<br>
> SwiftScript trace: 2e5p-1<br>
> SwiftScript trace: 1y4m-2<br>
> SwiftScript trace: 2eaq-1<br>
> SwiftScript trace: 2dhy-1<br>
> SwiftScript trace: 1wxp-1<br>
> SwiftScript trace: 1j55-1<br>
> SwiftScript trace: 1jmt-1<br>
> SwiftScript trace: 1wf0-1<br>
> Failed to shut down block: Block 0328-541001-000000 (240x99940.000s)<br>
> org.globus.cog.abstraction.impl.common.task.TaskSubmissionException: Can only cancel an active task<br>
> at org.globus.cog.abstraction.impl.scheduler.common.AbstractExecutor.cancel(AbstractExecutor.java:191)<br>
> at org.globus.cog.abstraction.impl.scheduler.common.AbstractJobSubmissionTaskHandler.cancel(AbstractJobSubmissionTaskHandler.java:85)<br>
> at org.globus.cog.abstraction.impl.common.AbstractTaskHandler.cancel(AbstractTaskHandler.java:69)<br>
> at org.globus.cog.abstraction.impl.common.task.ExecutionTaskHandler.cancel(ExecutionTaskHandler.java:102)<br>
> at org.globus.cog.abstraction.impl.common.task.ExecutionTaskHandler.cancel(ExecutionTaskHandler.java:91)<br>
> at org.globus.cog.abstraction.coaster.service.job.manager.BlockTaskSubmitter.cancel(BlockTaskSubmitter.java:45)<br>
> at org.globus.cog.abstraction.coaster.service.job.manager.Block.forceShutdown(Block.java:308)<br>
> at org.globus.cog.abstraction.coaster.service.job.manager.Block.shutdown(Block.java:288)<br>
> at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.cleanDoneBlocks(BlockQueueProcessor.java:186)<br>
> at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.updatePlan(BlockQueueProcessor.java:509)<br>
> at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.run(BlockQueueProcessor.java:109)<br>
> Failed to shut down block: Block 0328-541001-000001 (240x99940.000s)<br>
> org.globus.cog.abstraction.impl.common.task.TaskSubmissionException: Can only cancel an active task<br>
> at org.globus.cog.abstraction.impl.scheduler.common.AbstractExecutor.cancel(AbstractExecutor.java:191)<br>
> at org.globus.cog.abstraction.impl.scheduler.common.AbstractJobSubmissionTaskHandler.cancel(AbstractJobSubmissionTaskHandler.java:85)<br>
> at org.globus.cog.abstraction.impl.common.AbstractTaskHandler.cancel(AbstractTaskHandler.java:69)<br>
> at org.globus.cog.abstraction.impl.common.task.ExecutionTaskHandler.cancel(ExecutionTaskHandler.java:102)<br>
> at org.globus.cog.abstraction.impl.common.task.ExecutionTaskHandler.cancel(ExecutionTaskHandler.java:91)<br>
> at org.globus.cog.abstraction.coaster.service.job.manager.BlockTaskSubmitter.cancel(BlockTaskSubmitter.java:45)<br>
> at org.globus.cog.abstraction.coaster.service.job.manager.Block.forceShutdown(Block.java:308)<br>
> at org.globus.cog.abstraction.coaster.service.job.manager.Block.shutdown(Block.java:288)<br>
> at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.cleanDoneBlocks(BlockQueueProcessor.java:186)<br>
> at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.updatePlan(BlockQueueProcessor.java:509)<br>
> at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.run(BlockQueueProcessor.java:109)<br>
> Failed to shut down block: Block 0328-541001-000002 (240x99940.000s)<br>
> ====<br>
><br>
><br>
><br>
> On Mon, Mar 28, 2011 at 11:25 AM, Mihael Hategan <<a href="mailto:hategan@mcs.anl.gov">hategan@mcs.anl.gov</a>> wrote:<br>
> Ok. Fixed in cog 4.1.8 r3078. An info message is now logged instead of<br>
> the big nasty error.<br>
><br>
> Mihael<br>
><br>
><br>
> On Mon, 2011-03-28 at 09:21 -0700, Mihael Hategan wrote:<br>
> > Ok. So I'm gonna then go with timer thread killed during shutdown.<br>
> ><br>
> > Here's the relevant code in Timer.java:<br>
> > public void run() {<br>
> > try {<br>
> > mainLoop();<br>
> > } finally {<br>
> > // Someone killed this Thread, behave as if Timer cancelled<br>
> > synchronized(queue) {<br>
> > newTasksMayBeScheduled = false;<br>
> > ...<br>
> > private void sched(TimerTask task, long time, long period) {<br>
> > //this is scheduleImpl in the IBM jvm<br>
> > if (time < 0)<br>
> > throw new IllegalArgumentException("Illegal execution time.");<br>
> ><br>
> > synchronized(queue) {<br>
> > if (!thread.newTasksMayBeScheduled)<br>
> > throw new IllegalStateException("Timer already cancelled.");<br>
> > ...<br>
> ><br>
> > I guess the solution here is to ignore this error during shutdown and<br>
> > simply not have timeouts.<br>
> ><br>
> > Mihael<br>
> ><br>
> > On Mon, 2011-03-28 at 10:05 -0500, Michael Wilde wrote:<br>
> > > This was run on a 0.92 release modified to support Beagle. The code was built from ~wilde/swift/src/0.92<br>
> > ><br>
> > > We can/should try with plain 0.92,<br>
> > > on both Beagle and vanilla linux<br>
> > > building from both Beagle Java and Sun Java<br>
> > ><br>
> > > Those are the variables I can think of to get closer to the root cause.<br>
> > ><br>
> > > - Mike<br>
> > ><br>
> > > ----- Original Message -----<br>
> > > > Hi Mihael, Mike,<br>
> > > ><br>
> > > > I think Mike had built a local Swift with pbs+coaster capabilities for<br>
> > > > beagle. I am not sure whether a clean install from the repo has these<br>
> > > > capabilities (and if it does, I do not know as of which rev).<br>
> > > ><br>
> > > > Ketan<br>
> > > ><br>
> > > ><br>
> > > > On Mon, Mar 28, 2011 at 2:04 AM, Mihael Hategan <<a href="mailto:hategan@mcs.anl.gov">hategan@mcs.anl.gov</a>> wrote:<br>
> > > ><br>
> > > ><br>
> > > > The IBM implementation seems to do pretty much the same thing as the<br>
> > > > Sun one, which is that they never call .cancel() on a timer.<br>
> > > ><br>
> > > > So I don't understand what's happening here. I don't see any piece of<br>
> > > > code that cancels that timer. Are you all using the same swift<br>
> > > > installation? Can you try a clean install?<br>
> > > ><br>
> > > > Mihael<br>
> > > ><br>
> > > ><br>
> > > ><br>
> > > ><br>
> > > > On Sun, 2011-03-27 at 20:12 -0500, Ketan Maheshwari wrote:<br>
> > > > ><br>
> > > > ><br>
> > > > > On Sun, Mar 27, 2011 at 4:23 PM, Mihael Hategan <<a href="mailto:hategan@mcs.anl.gov">hategan@mcs.anl.gov</a>> wrote:<br>
> > > > > Actually I'm not so sure any more.<br>
> > > > ><br>
> > > > > My java Timer does not seem to have a scheduleImpl method. What version<br>
> > > > > of java is this?<br>
> > > > ><br>
> > > > > On beagle it is java 1.6.0:<br>
> > > > ><br>
> > > > > [ketan@login2:~]$ java -version<br>
> > > > > java version "1.6.0"<br>
> > > > > Java(TM) SE Runtime Environment (build pxa6460sr9-20101125_01(SR9))<br>
> > > > > IBM J9 VM (build 2.4, JRE 1.6.0 IBM J9 2.4 Linux amd64-64<br>
> > > > > jvmxa6460sr9-20101124_69295 (JIT enabled, AOT enabled)<br>
> > > > > J9VM - 20101124_069295<br>
> > > > > JIT - r9_20101028_17488ifx2<br>
> > > > > GC - 20101027_AA)<br>
> > > > > JCL - 20101119_01<br>
> > > > ><br>
> > > > > ===<br>
> > > > ><br>
> > > > > --Ketan<br>
> > > > ><br>
> > ><br>
> ><br>
> ><br>
><br>
><br>
> > _______________________________________________<br>
> > Swift-devel mailing list<br>
> > <a href="mailto:Swift-devel@ci.uchicago.edu">Swift-devel@ci.uchicago.edu</a><br>
> > <a href="http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel" target="_blank">http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel</a><br>
><br>
><br>
><br>
><br>
<br>
<br>
</div></div></blockquote></div><br>
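The Timer.java excerpt quoted above, and the suggestion to ignore the error during shutdown and go without timeouts, can be illustrated with a small sketch. This is not the actual fix committed in cog r3078. The class name `ShutdownTolerantScheduler` and its methods are hypothetical, and the sketch uses `Timer.cancel()` to put the timer into the same "no new tasks" state that the thread describes arising from a killed timer thread. Only `java.util.Timer` and `java.util.TimerTask` are real APIs here.

```java
import java.util.Timer;
import java.util.TimerTask;

// Hypothetical wrapper, for illustration only; not part of cog or Swift.
class ShutdownTolerantScheduler {
    private final Timer timer = new Timer("timeouts", true); // daemon timer thread
    private volatile boolean shuttingDown = false;

    /**
     * Tries to schedule a timeout task.
     * Returns true if it was scheduled, false if it was skipped because
     * the timer is no longer accepting tasks during shutdown.
     */
    boolean scheduleTimeout(TimerTask task, long delayMillis) {
        try {
            timer.schedule(task, delayMillis);
            return true;
        } catch (IllegalStateException e) {
            // java.util.Timer throws IllegalStateException once it has been
            // cancelled or its thread has terminated. (It throws the same
            // exception type for a task that was already scheduled or
            // cancelled; this sketch does not distinguish the two cases.)
            if (shuttingDown) {
                return false; // proceed without a timeout
            }
            throw e; // outside shutdown this is still a real error
        }
    }

    void shutdown() {
        shuttingDown = true;
        timer.cancel(); // stands in for the timer thread being killed
    }

    static TimerTask noop() {
        return new TimerTask() {
            public void run() {
                // nothing to do; the task exists only to be scheduled
            }
        };
    }

    public static void main(String[] args) {
        ShutdownTolerantScheduler s = new ShutdownTolerantScheduler();
        System.out.println(s.scheduleTimeout(noop(), 60000L)); // true
        s.shutdown();
        System.out.println(s.scheduleTimeout(noop(), 60000L)); // false
    }
}
```

The design choice in this sketch is to swallow the exception only while a shutdown flag is set, so that a timer dying at any other time still surfaces as an error.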