[Swift-devel] "Timer was cancelled" error on Beagle on Swift script termination
Ketan Maheshwari
ketancmaheshwari at gmail.com
Mon Mar 28 13:01:27 CDT 2011
Mihael,
Could you help me out? I do not understand which task needs to be
canceled, since the run has only just started submitting jobs.
Ketan
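
For reference, the trace quoted below shows the cancel being issued from Block.shutdown (reached via BlockQueueProcessor.cleanDoneBlocks), and the exception message suggests it is rejected because the block's task is not in an active state. The more tolerant behavior Mihael describes, where cancelling a non-active task is allowed, could look roughly like the sketch below. All names here (TolerantCancelSketch, Task, Status) are hypothetical and made up for illustration; this is not the CoG API and not the actual fix.

```java
// Hypothetical sketch: a cancel() that accepts tasks in any state.
class TolerantCancelSketch {
    enum Status { UNSUBMITTED, ACTIVE, COMPLETED, CANCELED }

    static class Task {
        private Status status = Status.UNSUBMITTED;

        synchronized Status getStatus() {
            return status;
        }

        // Marks an unsubmitted task as running.
        synchronized void activate() {
            if (status == Status.UNSUBMITTED) {
                status = Status.ACTIVE;
            }
        }

        // Cancels the task if it has not finished yet.
        // Returns true if the state changed, false if there was nothing to do.
        // It never throws just because the task is not active.
        synchronized boolean cancel() {
            switch (status) {
                case UNSUBMITTED:
                case ACTIVE:
                    status = Status.CANCELED;
                    return true;
                default:
                    // Already completed or canceled: treat as a no-op.
                    return false;
            }
        }
    }

    public static void main(String[] args) {
        Task neverStarted = new Task();
        System.out.println("cancel before submit changed state: " + neverStarted.cancel());
        System.out.println("second cancel changed state: " + neverStarted.cancel());
    }
}
```

With this shape the caller (here, the block shutdown path) does not need to check the task state first, which is the complexity Mihael is objecting to.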
On Mon, Mar 28, 2011 at 12:14 PM, Mihael Hategan <hategan at mcs.anl.gov> wrote:
> Good point. That behavior is pretty silly. One should be able to cancel
> a non-active task even if just for the reason that the coding involved
> in making sure that you only cancel an active task is unnecessarily
> complex.
>
> On Mon, 2011-03-28 at 12:05 -0500, Ketan Maheshwari wrote:
> > Hi,
> >
> > I checked out r3078. However, on beagle, I am getting another
> > exception: TaskSubmissionException, can only cancel an active task.
> >
> > Attached is the logfile and following is the exception stacktrace:
> >
> > ====
> >
> > [ketan at login1:pbs.run]$ sh run.sh
> > Swift svn swift-r4225 cog-r3078
> >
> > RunID: 20110328-1053-5q1fu8re
> > Progress: time:0
> > SwiftScript trace: 1y4m-1
> > SwiftScript trace: 2day-1
> > SwiftScript trace: 2e5p-1
> > SwiftScript trace: 1y4m-2
> > SwiftScript trace: 2eaq-1
> > SwiftScript trace: 2dhy-1
> > SwiftScript trace: 1wxp-1
> > SwiftScript trace: 1j55-1
> > SwiftScript trace: 1jmt-1
> > SwiftScript trace: 1wf0-1
> > Failed to shut down block: Block 0328-541001-000000 (240x99940.000s)
> > org.globus.cog.abstraction.impl.common.task.TaskSubmissionException: Can only cancel an active task
> >     at org.globus.cog.abstraction.impl.scheduler.common.AbstractExecutor.cancel(AbstractExecutor.java:191)
> >     at org.globus.cog.abstraction.impl.scheduler.common.AbstractJobSubmissionTaskHandler.cancel(AbstractJobSubmissionTaskHandler.java:85)
> >     at org.globus.cog.abstraction.impl.common.AbstractTaskHandler.cancel(AbstractTaskHandler.java:69)
> >     at org.globus.cog.abstraction.impl.common.task.ExecutionTaskHandler.cancel(ExecutionTaskHandler.java:102)
> >     at org.globus.cog.abstraction.impl.common.task.ExecutionTaskHandler.cancel(ExecutionTaskHandler.java:91)
> >     at org.globus.cog.abstraction.coaster.service.job.manager.BlockTaskSubmitter.cancel(BlockTaskSubmitter.java:45)
> >     at org.globus.cog.abstraction.coaster.service.job.manager.Block.forceShutdown(Block.java:308)
> >     at org.globus.cog.abstraction.coaster.service.job.manager.Block.shutdown(Block.java:288)
> >     at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.cleanDoneBlocks(BlockQueueProcessor.java:186)
> >     at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.updatePlan(BlockQueueProcessor.java:509)
> >     at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.run(BlockQueueProcessor.java:109)
> > Failed to shut down block: Block 0328-541001-000001 (240x99940.000s)
> > org.globus.cog.abstraction.impl.common.task.TaskSubmissionException: Can only cancel an active task
> >     at org.globus.cog.abstraction.impl.scheduler.common.AbstractExecutor.cancel(AbstractExecutor.java:191)
> >     at org.globus.cog.abstraction.impl.scheduler.common.AbstractJobSubmissionTaskHandler.cancel(AbstractJobSubmissionTaskHandler.java:85)
> >     at org.globus.cog.abstraction.impl.common.AbstractTaskHandler.cancel(AbstractTaskHandler.java:69)
> >     at org.globus.cog.abstraction.impl.common.task.ExecutionTaskHandler.cancel(ExecutionTaskHandler.java:102)
> >     at org.globus.cog.abstraction.impl.common.task.ExecutionTaskHandler.cancel(ExecutionTaskHandler.java:91)
> >     at org.globus.cog.abstraction.coaster.service.job.manager.BlockTaskSubmitter.cancel(BlockTaskSubmitter.java:45)
> >     at org.globus.cog.abstraction.coaster.service.job.manager.Block.forceShutdown(Block.java:308)
> >     at org.globus.cog.abstraction.coaster.service.job.manager.Block.shutdown(Block.java:288)
> >     at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.cleanDoneBlocks(BlockQueueProcessor.java:186)
> >     at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.updatePlan(BlockQueueProcessor.java:509)
> >     at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.run(BlockQueueProcessor.java:109)
> > Failed to shut down block: Block 0328-541001-000002 (240x99940.000s)
> > ====
> >
> >
> >
> > On Mon, Mar 28, 2011 at 11:25 AM, Mihael Hategan <hategan at mcs.anl.gov> wrote:
> > Ok. Fixed in cog 4.1.8 r3078. An info message is now logged instead of
> > the big nasty error.
> >
> > Mihael
> >
> >
> > On Mon, 2011-03-28 at 09:21 -0700, Mihael Hategan wrote:
> > > Ok. So I'm gonna then go with timer thread killed during shutdown.
> > >
> > > Here's the relevant code in Timer.java:
> > >     public void run() {
> > >         try {
> > >             mainLoop();
> > >         } finally {
> > >             // Someone killed this Thread, behave as if Timer cancelled
> > >             synchronized(queue) {
> > >                 newTasksMayBeScheduled = false;
> > >     ...
> > >     private void sched(TimerTask task, long time, long period) {
> > >         //this is scheduleImpl in the IBM jvm
> > >         if (time < 0)
> > >             throw new IllegalArgumentException("Illegal execution time.");
> > >
> > >         synchronized(queue) {
> > >             if (!thread.newTasksMayBeScheduled)
> > >                 throw new IllegalStateException("Timer already cancelled.");
> > >     ...
> > >
> > > I guess the solution here is to ignore this error during shutdown and
> > > simply not have timeouts.
> > >
> > > Mihael
> > >
> > > On Mon, 2011-03-28 at 10:05 -0500, Michael Wilde wrote:
> > > > This was run on a 0.92 release modified to support Beagle. The code
> > > > was built from ~wilde/swift/src/0.92
> > > >
> > > > We can/should try with plain 0.92,
> > > > on both Beagle and vanilla linux
> > > > building from both Beagle Java and Sun Java
> > > >
> > > > Those are the variables I can think of to get closer to the root cause.
> > > >
> > > > - Mike
> > > >
> > > > ----- Original Message -----
> > > > > Hi Mihael, Mike,
> > > > >
> > > > > I think Mike had built a local Swift with pbs+coaster capabilities
> > > > > for beagle. I am not sure whether a clean install from the repo has
> > > > > these capabilities (and if it does, I do not know as of which rev).
> > > > >
> > > > > Ketan
> > > > >
> > > > >
> > > > > On Mon, Mar 28, 2011 at 2:04 AM, Mihael Hategan <hategan at mcs.anl.gov> wrote:
> > > > >
> > > > >
> > > > > The IBM implementation seems to do pretty much the same thing as the
> > > > > Sun one. Which is that they never call .cancel() on a timer.
> > > > >
> > > > > So I don't understand what's happening here. I don't see any piece of
> > > > > code that cancels that timer. Are you all using the same swift
> > > > > installation? Can you try a clean install?
> > > > >
> > > > > Mihael
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > On Sun, 2011-03-27 at 20:12 -0500, Ketan Maheshwari wrote:
> > > > > >
> > > > > >
> > > > > > On Sun, Mar 27, 2011 at 4:23 PM, Mihael Hategan <hategan at mcs.anl.gov> wrote:
> > > > > > Actually I'm not so sure any more.
> > > > > >
> > > > > > My java Timer does not seem to have a scheduleImpl method.
> > > > > > What version of java is this?
> > > > > >
> > > > > > On beagle it is java 1.6.0:
> > > > > >
> > > > > > [ketan at login2:~]$ java -version
> > > > > > java version "1.6.0"
> > > > > > Java(TM) SE Runtime Environment (build pxa6460sr9-20101125_01(SR9))
> > > > > > IBM J9 VM (build 2.4, JRE 1.6.0 IBM J9 2.4 Linux amd64-64 jvmxa6460sr9-20101124_69295 (JIT enabled, AOT enabled)
> > > > > > J9VM - 20101124_069295
> > > > > > JIT - r9_20101028_17488ifx2
> > > > > > GC - 20101027_AA)
> > > > > > JCL - 20101119_01
> > > > > >
> > > > > > ===
> > > > > >
> > > > > > --Ketan
> > > > > >
> > > >
> > >
> > >
> >
> >
> > > _______________________________________________
> > > Swift-devel mailing list
> > > Swift-devel at ci.uchicago.edu
> > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel
> >
> >
> >
> >
>
>
>
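
The Timer behavior discussed in the thread above can be reproduced in a few lines. The sketch below is only an illustration under stated assumptions: it uses Timer.cancel() to put the timer into the same "no new tasks" state that the quoted Timer.java code sets when the timer thread is killed, and the class name, the shuttingDown flag, and the scheduleTimeout helper are hypothetical, not the code that went into cog r3078. It shows one way to ignore the IllegalStateException during shutdown and log an info message instead.

```java
import java.util.Timer;
import java.util.TimerTask;

// Hypothetical sketch: tolerate a dead java.util.Timer while shutting down.
class TimerShutdownSketch {
    // Assumed flag; the application would set it when shutdown begins.
    static volatile boolean shuttingDown = false;

    // Tries to schedule a timeout task.
    // Returns true if scheduled, false if the timer is dead and we are shutting down.
    // Outside of shutdown the IllegalStateException is rethrown unchanged.
    static boolean scheduleTimeout(Timer timer, TimerTask task, long delayMs) {
        try {
            timer.schedule(task, delayMs);
            return true;
        } catch (IllegalStateException e) {
            if (shuttingDown) {
                // Timer can no longer accept tasks; go without a timeout.
                System.out.println("INFO: timer unavailable during shutdown, skipping timeout");
                return false;
            }
            throw e;
        }
    }

    // A TimerTask can only be scheduled once, so create a fresh one per call.
    static TimerTask noop() {
        return new TimerTask() {
            @Override
            public void run() {
                // intentionally empty
            }
        };
    }

    public static void main(String[] args) {
        Timer timer = new Timer(true); // daemon thread, so the JVM can exit
        System.out.println("scheduled on live timer: " + scheduleTimeout(timer, noop(), 1000));

        timer.cancel();      // stands in for the timer thread being killed
        shuttingDown = true; // pretend shutdown has started
        System.out.println("scheduled on dead timer: " + scheduleTimeout(timer, noop(), 1000));
    }
}
```

The design choice is to swallow the exception only while the shutdown flag is set, so that a dead timer during normal operation still surfaces as an error.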