[Swift-devel] "Timer was cancelled" error on Beagle on Swift script termination

Mihael Hategan hategan at mcs.anl.gov
Mon Mar 28 12:14:26 CDT 2011


Good point. That behavior is pretty silly. One should be able to cancel
a non-active task, if only because the code needed to make sure that
you only ever cancel an active task is unnecessarily complex.
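
To illustrate what I mean, here is a minimal sketch of a cancel that is
safe to call in any state. The class, enum and method names are
hypothetical and made up for this example; this is not the CoG Task or
AbstractExecutor code, just the general shape of the idea.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical task type for illustration; not part of the CoG API. */
final class CancellableTask {
    enum Status { UNSUBMITTED, ACTIVE, COMPLETED, CANCELED }

    private final AtomicReference<Status> status =
            new AtomicReference<Status>(Status.UNSUBMITTED);

    /** Moves the task from UNSUBMITTED to ACTIVE; returns false otherwise. */
    boolean activate() {
        return status.compareAndSet(Status.UNSUBMITTED, Status.ACTIVE);
    }

    /** Moves the task from ACTIVE to COMPLETED; returns false otherwise. */
    boolean complete() {
        return status.compareAndSet(Status.ACTIVE, Status.COMPLETED);
    }

    /**
     * Cancels the task if it has not already finished.
     * Returns true if this call changed the state and false if there was
     * nothing left to cancel. It never throws for a non-active task.
     */
    boolean cancel() {
        while (true) {
            Status s = status.get();
            if (s == Status.COMPLETED || s == Status.CANCELED) {
                return false; // nothing to do, and not an error
            }
            if (status.compareAndSet(s, Status.CANCELED)) {
                return true;
            }
            // Another thread changed the state in between; re-check.
        }
    }

    Status status() {
        return status.get();
    }

    public static void main(String[] args) {
        CancellableTask done = new CancellableTask();
        done.activate();
        done.complete();
        // Cancelling a finished task is a harmless no-op.
        System.out.println(done.cancel() + " " + done.status());

        CancellableTask fresh = new CancellableTask();
        // Cancelling a task that was never active also works.
        System.out.println(fresh.cancel() + " " + fresh.status());
    }
}
```

With something like this, a shutdown path can call cancel() on every
task unconditionally and look at the return value only if it cares.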

On Mon, 2011-03-28 at 12:05 -0500, Ketan Maheshwari wrote:
> Hi,
> 
> I checked out r3078. However, on beagle, I am getting another
> exception: TaskSubmissionException, can only cancel an active task.
> 
> Attached is the logfile and following is the exception stacktrace:
> 
> ====
> 
> [ketan at login1:pbs.run]$ sh run.sh 
> Swift svn swift-r4225 cog-r3078
> 
> RunID: 20110328-1053-5q1fu8re
> Progress:  time:0
> SwiftScript trace: 1y4m-1
> SwiftScript trace: 2day-1
> SwiftScript trace: 2e5p-1
> SwiftScript trace: 1y4m-2
> SwiftScript trace: 2eaq-1
> SwiftScript trace: 2dhy-1
> SwiftScript trace: 1wxp-1
> SwiftScript trace: 1j55-1
> SwiftScript trace: 1jmt-1
> SwiftScript trace: 1wf0-1
> Failed to shut down block: Block 0328-541001-000000 (240x99940.000s)
> org.globus.cog.abstraction.impl.common.task.TaskSubmissionException: Can only cancel an active task
>     at org.globus.cog.abstraction.impl.scheduler.common.AbstractExecutor.cancel(AbstractExecutor.java:191)
>     at org.globus.cog.abstraction.impl.scheduler.common.AbstractJobSubmissionTaskHandler.cancel(AbstractJobSubmissionTaskHandler.java:85)
>     at org.globus.cog.abstraction.impl.common.AbstractTaskHandler.cancel(AbstractTaskHandler.java:69)
>     at org.globus.cog.abstraction.impl.common.task.ExecutionTaskHandler.cancel(ExecutionTaskHandler.java:102)
>     at org.globus.cog.abstraction.impl.common.task.ExecutionTaskHandler.cancel(ExecutionTaskHandler.java:91)
>     at org.globus.cog.abstraction.coaster.service.job.manager.BlockTaskSubmitter.cancel(BlockTaskSubmitter.java:45)
>     at org.globus.cog.abstraction.coaster.service.job.manager.Block.forceShutdown(Block.java:308)
>     at org.globus.cog.abstraction.coaster.service.job.manager.Block.shutdown(Block.java:288)
>     at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.cleanDoneBlocks(BlockQueueProcessor.java:186)
>     at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.updatePlan(BlockQueueProcessor.java:509)
>     at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.run(BlockQueueProcessor.java:109)
> Failed to shut down block: Block 0328-541001-000001 (240x99940.000s)
> org.globus.cog.abstraction.impl.common.task.TaskSubmissionException: Can only cancel an active task
>     at org.globus.cog.abstraction.impl.scheduler.common.AbstractExecutor.cancel(AbstractExecutor.java:191)
>     at org.globus.cog.abstraction.impl.scheduler.common.AbstractJobSubmissionTaskHandler.cancel(AbstractJobSubmissionTaskHandler.java:85)
>     at org.globus.cog.abstraction.impl.common.AbstractTaskHandler.cancel(AbstractTaskHandler.java:69)
>     at org.globus.cog.abstraction.impl.common.task.ExecutionTaskHandler.cancel(ExecutionTaskHandler.java:102)
>     at org.globus.cog.abstraction.impl.common.task.ExecutionTaskHandler.cancel(ExecutionTaskHandler.java:91)
>     at org.globus.cog.abstraction.coaster.service.job.manager.BlockTaskSubmitter.cancel(BlockTaskSubmitter.java:45)
>     at org.globus.cog.abstraction.coaster.service.job.manager.Block.forceShutdown(Block.java:308)
>     at org.globus.cog.abstraction.coaster.service.job.manager.Block.shutdown(Block.java:288)
>     at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.cleanDoneBlocks(BlockQueueProcessor.java:186)
>     at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.updatePlan(BlockQueueProcessor.java:509)
>     at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.run(BlockQueueProcessor.java:109)
> Failed to shut down block: Block 0328-541001-000002 (240x99940.000s)
> ====
> 
> 
> 
> On Mon, Mar 28, 2011 at 11:25 AM, Mihael Hategan <hategan at mcs.anl.gov> wrote:
>         Ok. Fixed in cog 4.1.8 r3078. An info message is now logged
>         instead of the big nasty error.
>         
>         Mihael
>         
>         
>         On Mon, 2011-03-28 at 09:21 -0700, Mihael Hategan wrote:
>         > Ok. So I'm gonna then go with timer thread killed during
>         > shutdown.
>         >
>         > Here's the relevant code in Timer.java:
>         > public void run() {
>         >         try {
>         >             mainLoop();
>         >         } finally {
>         >             // Someone killed this Thread, behave as if Timer cancelled
>         >             synchronized(queue) {
>         >                 newTasksMayBeScheduled = false;
>         > ...
>         > private void sched(TimerTask task, long time, long period) {
>         > //this is scheduleImpl in the IBM jvm
>         >         if (time < 0)
>         >             throw new IllegalArgumentException("Illegal execution time.");
>         >
>         >         synchronized(queue) {
>         >             if (!thread.newTasksMayBeScheduled)
>         >                 throw new IllegalStateException("Timer already cancelled.");
>         > ...
>         >
>         > I guess the solution here is to ignore this error during
>         > shutdown and simply not have timeouts.
>         >
>         > Mihael
>         >
>         > On Mon, 2011-03-28 at 10:05 -0500, Michael Wilde wrote:
>         > > This was run on an 0.92 release modified to support
>         > > Beagle. Code was built from ~wilde/swift/src/0.92
>         > >
>         > > We can/should try with plain 0.92,
>         > > on both Beagle and vanilla linux
>         > > building from both Beagle Java and Sun Java
>         > >
>         > > Those are the variables I can think of to get closer to
>         > > the root cause.
>         > >
>         > > - Mike
>         > >
>         > > ----- Original Message -----
>         > > > Hi Mihael, Mike,
>         > > >
>         > > > I think Mike had built a local Swift with pbs+coaster
>         > > > capabilities for beagle. I am not sure whether a clean
>         > > > install from the repo has these capabilities (and if it
>         > > > does, I do not know as of which rev).
>         > > >
>         > > > Ketan
>         > > >
>         > > >
>         > > > On Mon, Mar 28, 2011 at 2:04 AM, Mihael Hategan <hategan at mcs.anl.gov> wrote:
>         > > >
>         > > >
>         > > > The IBM implementation seems to do pretty much the same
>         > > > thing as the Sun one. Which is that they never call
>         > > > .cancel() on a timer.
>         > > >
>         > > > So I don't understand what's happening here. I don't see
>         > > > any piece of code that cancels that timer. Are you all
>         > > > using the same swift installation? Can you try a clean
>         > > > install?
>         > > >
>         > > > Mihael
>         > > >
>         > > >
>         > > >
>         > > >
>         > > > On Sun, 2011-03-27 at 20:12 -0500, Ketan Maheshwari wrote:
>         > > > >
>         > > > >
>         > > > > On Sun, Mar 27, 2011 at 4:23 PM, Mihael Hategan <hategan at mcs.anl.gov> wrote:
>         > > > > Actually I'm not so sure any more.
>         > > > >
>         > > > > My java Timer does not seem to have a scheduleImpl
>         > > > > method. What version of java is this?
>         > > > >
>         > > > > On beagle it is java 1.6.0:
>         > > > >
>         > > > > [ketan at login2:~]$ java -version
>         > > > > java version "1.6.0"
>         > > > > Java(TM) SE Runtime Environment (build pxa6460sr9-20101125_01(SR9))
>         > > > > IBM J9 VM (build 2.4, JRE 1.6.0 IBM J9 2.4 Linux amd64-64 jvmxa6460sr9-20101124_69295 (JIT enabled, AOT enabled)
>         > > > > J9VM - 20101124_069295
>         > > > > JIT - r9_20101028_17488ifx2
>         > > > > GC - 20101027_AA)
>         > > > > JCL - 20101119_01
>         > > > >
>         > > > > ===
>         > > > >
>         > > > > --Ketan
>         > > > >
>         > >
>         >
>         >
>         
>         
>         > _______________________________________________
>         > Swift-devel mailing list
>         > Swift-devel at ci.uchicago.edu
>         > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel
>         
> 
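
Regarding the Timer excerpt quoted above: once the timer thread has
stopped, or the timer has been cancelled, java.util.Timer.schedule()
throws IllegalStateException. Below is a minimal sketch of "ignore this
error during shutdown and simply not have timeouts". The class and
method names are made up for the example and this is not the actual
change that went into cog r3078.

```java
import java.util.Timer;
import java.util.TimerTask;

/** Hypothetical helper for illustration; not part of Swift or CoG. */
final class ShutdownTolerantTimer {

    /**
     * Tries to schedule a timeout task. Returns false instead of throwing
     * when the timer can no longer accept tasks, so the caller simply
     * proceeds without a timeout.
     */
    static boolean scheduleQuietly(Timer timer, TimerTask task, long delayMs) {
        try {
            timer.schedule(task, delayMs);
            return true;
        } catch (IllegalStateException e) {
            // Thrown when the timer was cancelled or its thread terminated.
            // Note it is also thrown if the task itself was already
            // scheduled or cancelled, which this sketch does not distinguish.
            return false;
        }
    }

    public static void main(String[] args) {
        Timer timer = new Timer(true); // daemon thread, so the JVM can exit
        timer.cancel();                // stands in for the thread being killed

        TimerTask timeout = new TimerTask() {
            public void run() {
                System.out.println("timed out");
            }
        };

        boolean scheduled = scheduleQuietly(timer, timeout, 1000L);
        System.out.println("scheduled: " + scheduled); // prints "scheduled: false"
    }
}
```

A real fix would probably log an info message in the catch block, and
only swallow the exception while shutdown is actually in progress.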
