[Swift-devel] "Timer was cancelled" error on Beagle on Swift script termination

Ketan Maheshwari ketancmaheshwari at gmail.com
Mon Mar 28 13:01:27 CDT 2011


Mihael,

Could you give me a hand? I do not understand which task needs to be
canceled, since job submission has only just begun.

Ketan

On Mon, Mar 28, 2011 at 12:14 PM, Mihael Hategan <hategan at mcs.anl.gov> wrote:

> Good point. That behavior is pretty silly. One should be able to cancel
> a non-active task even if just for the reason that the coding involved
> in making sure that you only cancel an active task is unnecessarily
> complex.
>
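A minimal sketch of the tolerant behavior described above, where cancelling a task that is not active is simply a no-op rather than an error. The `Task` class, its `Status` values, and the method names are hypothetical and written only for illustration; they are not the CoG abstraction API.

```java
public class TolerantCancelSketch {
    // Hypothetical task states, not the CoG Status constants.
    enum Status { UNSUBMITTED, ACTIVE, COMPLETED, CANCELED }

    // Hypothetical stand-in for a task; cancel() never throws.
    static final class Task {
        private Status status = Status.UNSUBMITTED;

        synchronized Status getStatus() {
            return status;
        }

        synchronized void activate() {
            if (status == Status.UNSUBMITTED) {
                status = Status.ACTIVE;
            }
        }

        // Returns true if this call moved the task to CANCELED,
        // false if the task had already finished or been canceled.
        synchronized boolean cancel() {
            if (status == Status.COMPLETED || status == Status.CANCELED) {
                return false;
            }
            status = Status.CANCELED;
            return true;
        }
    }

    public static void main(String[] args) {
        Task neverStarted = new Task();
        // Cancelling before the task became active is accepted.
        System.out.println(neverStarted.cancel());
        // A second cancel is a harmless no-op.
        System.out.println(neverStarted.cancel());
        System.out.println(neverStarted.getStatus());
    }
}
```

With this shape the caller (for example a block shutdown path) does not need to check the task state before cancelling.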
> On Mon, 2011-03-28 at 12:05 -0500, Ketan Maheshwari wrote:
> > Hi,
> >
> > I checked out r3078. However, on beagle, I am getting another
> > exception: TaskSubmissionException, can only cancel an active task.
> >
> > Attached is the logfile and following is the exception stacktrace:
> >
> > ====
> >
> > [ketan at login1:pbs.run]$ sh run.sh
> > Swift svn swift-r4225 cog-r3078
> >
> > RunID: 20110328-1053-5q1fu8re
> > Progress:  time:0
> > SwiftScript trace: 1y4m-1
> > SwiftScript trace: 2day-1
> > SwiftScript trace: 2e5p-1
> > SwiftScript trace: 1y4m-2
> > SwiftScript trace: 2eaq-1
> > SwiftScript trace: 2dhy-1
> > SwiftScript trace: 1wxp-1
> > SwiftScript trace: 1j55-1
> > SwiftScript trace: 1jmt-1
> > SwiftScript trace: 1wf0-1
> > Failed to shut down block: Block 0328-541001-000000 (240x99940.000s)
> > org.globus.cog.abstraction.impl.common.task.TaskSubmissionException: Can only cancel an active task
> >     at org.globus.cog.abstraction.impl.scheduler.common.AbstractExecutor.cancel(AbstractExecutor.java:191)
> >     at org.globus.cog.abstraction.impl.scheduler.common.AbstractJobSubmissionTaskHandler.cancel(AbstractJobSubmissionTaskHandler.java:85)
> >     at org.globus.cog.abstraction.impl.common.AbstractTaskHandler.cancel(AbstractTaskHandler.java:69)
> >     at org.globus.cog.abstraction.impl.common.task.ExecutionTaskHandler.cancel(ExecutionTaskHandler.java:102)
> >     at org.globus.cog.abstraction.impl.common.task.ExecutionTaskHandler.cancel(ExecutionTaskHandler.java:91)
> >     at org.globus.cog.abstraction.coaster.service.job.manager.BlockTaskSubmitter.cancel(BlockTaskSubmitter.java:45)
> >     at org.globus.cog.abstraction.coaster.service.job.manager.Block.forceShutdown(Block.java:308)
> >     at org.globus.cog.abstraction.coaster.service.job.manager.Block.shutdown(Block.java:288)
> >     at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.cleanDoneBlocks(BlockQueueProcessor.java:186)
> >     at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.updatePlan(BlockQueueProcessor.java:509)
> >     at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.run(BlockQueueProcessor.java:109)
> > Failed to shut down block: Block 0328-541001-000001 (240x99940.000s)
> > org.globus.cog.abstraction.impl.common.task.TaskSubmissionException: Can only cancel an active task
> >     at org.globus.cog.abstraction.impl.scheduler.common.AbstractExecutor.cancel(AbstractExecutor.java:191)
> >     at org.globus.cog.abstraction.impl.scheduler.common.AbstractJobSubmissionTaskHandler.cancel(AbstractJobSubmissionTaskHandler.java:85)
> >     at org.globus.cog.abstraction.impl.common.AbstractTaskHandler.cancel(AbstractTaskHandler.java:69)
> >     at org.globus.cog.abstraction.impl.common.task.ExecutionTaskHandler.cancel(ExecutionTaskHandler.java:102)
> >     at org.globus.cog.abstraction.impl.common.task.ExecutionTaskHandler.cancel(ExecutionTaskHandler.java:91)
> >     at org.globus.cog.abstraction.coaster.service.job.manager.BlockTaskSubmitter.cancel(BlockTaskSubmitter.java:45)
> >     at org.globus.cog.abstraction.coaster.service.job.manager.Block.forceShutdown(Block.java:308)
> >     at org.globus.cog.abstraction.coaster.service.job.manager.Block.shutdown(Block.java:288)
> >     at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.cleanDoneBlocks(BlockQueueProcessor.java:186)
> >     at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.updatePlan(BlockQueueProcessor.java:509)
> >     at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.run(BlockQueueProcessor.java:109)
> > Failed to shut down block: Block 0328-541001-000002 (240x99940.000s)
> > ====
> >
> >
> >
> > On Mon, Mar 28, 2011 at 11:25 AM, Mihael Hategan <hategan at mcs.anl.gov> wrote:
> >         Ok. Fixed in cog 4.1.8 r3078. An info message is now logged
> >         instead of the big nasty error.
> >
> >         Mihael
> >
> >
> >         On Mon, 2011-03-28 at 09:21 -0700, Mihael Hategan wrote:
> >         > Ok. So I'm gonna then go with timer thread killed during shutdown.
> >         >
> >         > Here's the relevant code in Timer.java:
> >         > public void run() {
> >         >         try {
> >         >             mainLoop();
> >         >         } finally {
> >         >             // Someone killed this Thread, behave as if Timer cancelled
> >         >             synchronized(queue) {
> >         >                 newTasksMayBeScheduled = false;
> >         > ...
> >         > private void sched(TimerTask task, long time, long period) {
> >         > //this is scheduleImpl in the IBM jvm
> >         >         if (time < 0)
> >         >             throw new IllegalArgumentException("Illegal execution time.");
> >         >
> >         >         synchronized(queue) {
> >         >             if (!thread.newTasksMayBeScheduled)
> >         >                 throw new IllegalStateException("Timer already cancelled.");
> >         > ...
> >         >
> >         > I guess the solution here is to ignore this error during shutdown and
> >         > simply not have timeouts.
> >         >
> >         > Mihael
> >         >
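The behavior discussed above can be reproduced with the standard `java.util.Timer`: once a timer has been cancelled (or its thread has terminated), `schedule` throws `IllegalStateException`. The sketch below shows one way to ignore that error and simply go without the timeout. The helper name `scheduleQuietly` is hypothetical, and this is not necessarily how the fix in cog r3078 was written.

```java
import java.util.Timer;
import java.util.TimerTask;

public class TimerShutdownSketch {
    // Hypothetical helper: try to schedule a timeout task, and report
    // false instead of failing if the timer can no longer accept tasks.
    static boolean scheduleQuietly(Timer timer, TimerTask task, long delayMs) {
        try {
            timer.schedule(task, delayMs);
            return true;
        } catch (IllegalStateException e) {
            // Thrown by java.util.Timer when the timer was cancelled or
            // its thread terminated; during shutdown we skip the timeout.
            return false;
        }
    }

    // A task that does nothing; a TimerTask instance can only be scheduled once.
    static TimerTask noop() {
        return new TimerTask() {
            @Override
            public void run() {
            }
        };
    }

    public static void main(String[] args) {
        Timer timer = new Timer(true); // daemon thread, so the JVM can exit
        boolean before = scheduleQuietly(timer, noop(), 60000L);
        timer.cancel(); // stands in for the timer thread dying at shutdown
        boolean after = scheduleQuietly(timer, noop(), 60000L);
        System.out.println(before + " " + after);
    }
}
```

Here `timer.cancel()` is only a stand-in for the timer thread being killed; both paths leave the timer refusing new tasks.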
> >         > On Mon, 2011-03-28 at 10:05 -0500, Michael Wilde wrote:
> >         > > This was run on an 0.92 release modified to support Beagle.
> >         > > Code was built from ~wilde/swift/src/0.92
> >         > >
> >         > > We can/should try with plain 0.92,
> >         > > on both Beagle and vanilla linux
> >         > > building from both Beagle Java and Sun Java
> >         > >
> >         > > Those are the variables I can think of to get closer to the root cause.
> >         > >
> >         > > - Mike
> >         > >
> >         > > ----- Original Message -----
> >         > > > Hi Mihael, Mike,
> >         > > >
> >         > > > I think Mike had built a local Swift with pbs+coaster capabilities for
> >         > > > beagle. I am not sure whether a clean install from the repo has these
> >         > > > capabilities (and if it does, I do not know as of which rev).
> >         > > >
> >         > > > Ketan
> >         > > >
> >         > > >
> >         > > > On Mon, Mar 28, 2011 at 2:04 AM, Mihael Hategan <hategan at mcs.anl.gov> wrote:
> >         > > >
> >         > > >
> >         > > > The IBM implementation seems to do pretty much the same thing as the
> >         > > > Sun one. Which is that they never call .cancel() on a timer.
> >         > > >
> >         > > > So I don't understand what's happening here. I don't see any piece of
> >         > > > code that cancels that timer. Are you all using the same swift
> >         > > > installation? Can you try a clean install?
> >         > > >
> >         > > > Mihael
> >         > > >
> >         > > >
> >         > > >
> >         > > >
> >         > > > On Sun, 2011-03-27 at 20:12 -0500, Ketan Maheshwari wrote:
> >         > > > >
> >         > > > >
> >         > > > > On Sun, Mar 27, 2011 at 4:23 PM, Mihael Hategan <hategan at mcs.anl.gov> wrote:
> >         > > > > Actually I'm not so sure any more.
> >         > > > >
> >         > > > > My java Timer does not seem to have a scheduleImpl method.
> >         > > > > What version of java is this?
> >         > > > >
> >         > > > > On beagle it is java 1.6.0:
> >         > > > >
> >         > > > > [ketan at login2:~]$ java -version
> >         > > > > java version "1.6.0"
> >         > > > > Java(TM) SE Runtime Environment (build pxa6460sr9-20101125_01(SR9))
> >         > > > > IBM J9 VM (build 2.4, JRE 1.6.0 IBM J9 2.4 Linux amd64-64 jvmxa6460sr9-20101124_69295 (JIT enabled, AOT enabled)
> >         > > > > J9VM - 20101124_069295
> >         > > > > JIT - r9_20101028_17488ifx2
> >         > > > > GC - 20101027_AA)
> >         > > > > JCL - 20101119_01
> >         > > > >
> >         > > > > ===
> >         > > > >
> >         > > > > --Ketan
> >         > > > >
> >         > >
> >         >
> >         >
> >
> >
> >         > _______________________________________________
> >         > Swift-devel mailing list
> >         > Swift-devel at ci.uchicago.edu
> >         > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel
> >
> >
> >
> >
>
>
>