[Swift-user] Getting swift to run on Fusion

Jonathan Margoliash jmargolpeople at gmail.com
Wed Sep 12 10:50:35 CDT 2012


Hello Swift support,

This is my first attempt at getting Swift to work on Fusion, and I'm getting
the following output on the terminal:

------

Warning: Function toint is deprecated, at line 10
Swift trunk swift-r5882 cog-r3434

RunID: 20120912-1032-5y7xb1ug
Progress:  time: Wed, 12 Sep 2012 10:32:51 -0500
Progress:  time: Wed, 12 Sep 2012 10:32:54 -0500  Selecting site:34  Submitted:8
Progress:  time: Wed, 12 Sep 2012 10:32:57 -0500  Selecting site:34  Submitted:8
Progress:  time: Wed, 12 Sep 2012 10:33:00 -0500  Selecting site:34  Submitted:8
...
Progress:  time: Wed, 12 Sep 2012 10:40:33 -0500  Selecting site:34  Submitted:8
Failed to shut down block: Block 0912-321051-000005 (8x60.000s)
org.globus.cog.abstraction.impl.common.task.TaskSubmissionException: Failed to cancel task. qdel returned with an exit code of 153
    at org.globus.cog.abstraction.impl.scheduler.common.AbstractExecutor.cancel(AbstractExecutor.java:205)
    at org.globus.cog.abstraction.impl.scheduler.common.AbstractJobSubmissionTaskHandler.cancel(AbstractJobSubmissionTaskHandler.java:85)
    at org.globus.cog.abstraction.impl.common.AbstractTaskHandler.cancel(AbstractTaskHandler.java:69)
    at org.globus.cog.abstraction.impl.common.task.ExecutionTaskHandler.cancel(ExecutionTaskHandler.java:102)
    at org.globus.cog.abstraction.impl.common.task.ExecutionTaskHandler.cancel(ExecutionTaskHandler.java:91)
    at org.globus.cog.abstraction.coaster.service.job.manager.BlockTaskSubmitter.cancel(BlockTaskSubmitter.java:46)
    at org.globus.cog.abstraction.coaster.service.job.manager.Block.forceShutdown(Block.java:320)
    at org.globus.cog.abstraction.coaster.service.job.manager.Node.errorReceived(Node.java:100)
    at org.globus.cog.karajan.workflow.service.commands.Command.errorReceived(Command.java:203)
    at org.globus.cog.karajan.workflow.service.channels.ChannelContext.notifyListeners(ChannelContext.java:237)
    at org.globus.cog.karajan.workflow.service.channels.ChannelContext.notifyRegisteredCommandsAndHandlers(ChannelContext.java:225)
    at org.globus.cog.karajan.workflow.service.channels.ChannelContext.channelShutDown(ChannelContext.java:318)
    at org.globus.cog.karajan.workflow.service.channels.ChannelManager.handleChannelException(ChannelManager.java:293)
    at org.globus.cog.karajan.workflow.service.channels.AbstractKarajanChannel.handleChannelException(AbstractKarajanChannel.java:552)
    at org.globus.cog.karajan.workflow.service.channels.NIOSender.run(NIOSender.java:140)
Progress:  time: Wed, 12 Sep 2012 10:40:36 -0500  Selecting site:34  Submitted:8
Progress:  time: Wed, 12 Sep 2012 10:40:39 -0500  Selecting site:34  Submitted:8
Progress:  time: Wed, 12 Sep 2012 10:40:42 -0500  Selecting site:34  Submitted:8
...
Progress:  time: Wed, 12 Sep 2012 10:41:42 -0500  Selecting site:34  Submitted:8
Progress:  time: Wed, 12 Sep 2012 10:41:45 -0500  Selecting site:34  Submitted:8
Failed to shut down block: Block 0912-321051-000006 (8x60.000s)
org.globus.cog.abstraction.impl.common.task.TaskSubmissionException: Failed to cancel task. qdel returned with an exit code of 153
    at org.globus.cog.abstraction.impl.scheduler.common.AbstractExecutor.cancel(AbstractExecutor.java:205)
    at org.globus.cog.abstraction.impl.scheduler.common.AbstractJobSubmissionTaskHandler.cancel(AbstractJobSubmissionTaskHandler.java:85)
    at org.globus.cog.abstraction.impl.common.AbstractTaskHandler.cancel(AbstractTaskHandler.java:69)
    at org.globus.cog.abstraction.impl.common.task.ExecutionTaskHandler.cancel(ExecutionTaskHandler.java:102)
    at org.globus.cog.abstraction.impl.common.task.ExecutionTaskHandler.cancel(ExecutionTaskHandler.java:91)
    at org.globus.cog.abstraction.coaster.service.job.manager.BlockTaskSubmitter.cancel(BlockTaskSubmitter.java:46)
    at org.globus.cog.abstraction.coaster.service.job.manager.Block.forceShutdown(Block.java:320)
    at org.globus.cog.abstraction.coaster.service.job.manager.Node.errorReceived(Node.java:100)
    at org.globus.cog.karajan.workflow.service.commands.Command.errorReceived(Command.java:203)
    at org.globus.cog.karajan.workflow.service.channels.ChannelContext.notifyListeners(ChannelContext.java:237)
    at org.globus.cog.karajan.workflow.service.channels.ChannelContext.notifyRegisteredCommandsAndHandlers(ChannelContext.java:225)
    at org.globus.cog.karajan.workflow.service.channels.ChannelContext.channelShutDown(ChannelContext.java:318)
    at org.globus.cog.karajan.workflow.service.channels.ChannelManager.handleChannelException(ChannelManager.java:293)
    at org.globus.cog.karajan.workflow.service.channels.AbstractKarajanChannel.handleChannelException(AbstractKarajanChannel.java:552)
    at org.globus.cog.karajan.workflow.service.channels.NIOSender.run(NIOSender.java:140)
Progress:  time: Wed, 12 Sep 2012 10:41:48 -0500  Selecting site:34  Submitted:8
Progress:  time: Wed, 12 Sep 2012 10:41:51 -0500  Selecting site:34  Submitted:8
Progress:  time: Wed, 12 Sep 2012 10:41:54 -0500  Selecting site:34  Submitted:8
...

------
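
In case it helps anyone reading a long log like the one above, here is a
minimal Python sketch for pulling out the failed blocks and the qdel exit
codes. It is not part of Swift; the function name summarize and the two
regular expressions are my own, written against the exact wording of the
messages in this output. Reading "8x60.000s" as 8 workers by 60 seconds of
walltime is an assumption on my part.

```python
import re
from collections import Counter

# Matches the block description as printed in the output above,
# e.g. "Block 0912-321051-000005 (8x60.000s)".
BLOCK_RE = re.compile(
    r"Failed to shut down block: Block (\S+) \((\d+)x([\d.]+)s\)"
)
# Matches the qdel exit status reported in the exception message.
QDEL_RE = re.compile(r"qdel returned with an exit code of (\d+)")


def summarize(log_text):
    """Return (blocks, exit_codes) found in a pasted Swift terminal log."""
    # Mail clients wrap long lines, so collapse all whitespace first.
    flat = " ".join(log_text.split())
    blocks = [
        {
            "id": m.group(1),
            # Assumption: the first number is the worker count and the
            # second is the block walltime in seconds.
            "workers": int(m.group(2)),
            "seconds": float(m.group(3)),
        }
        for m in BLOCK_RE.finditer(flat)
    ]
    exit_codes = Counter(int(m.group(1)) for m in QDEL_RE.finditer(flat))
    return blocks, exit_codes


# Excerpt copied from the output above, including the mail line wrapping.
SAMPLE = """\
Failed to shut down block: Block 0912-321051-000005 (8x60.000s)
org.globus.cog.abstraction.impl.common.task.TaskSubmissionException: Failed
to cancel task. qdel returned with an exit code of 153
Failed to shut down block: Block 0912-321051-000006 (8x60.000s)
org.globus.cog.abstraction.impl.common.task.TaskSubmissionException: Failed
to cancel task. qdel returned with an exit code of 153
"""

if __name__ == "__main__":
    blocks, exit_codes = summarize(SAMPLE)
    for block in blocks:
        print(block["id"], block["workers"], block["seconds"])
    print(dict(exit_codes))
```

On the excerpt this reports two blocks, each with a 60-second figure, which
is the same number I put in maxTime in sites.xml.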

I understand the long run of unchanging "Progress: ..." reports: the shared
queue is busy, so I am not expecting my job to be executed right away.
However, I don't understand why I'm getting these "Failed to cancel task"
errors. I gave each individual app more than enough time to run to
completion. I did set the time limit on the entire process to be much
smaller than it needs
(<profile namespace="globus" key="maxTime">60</profile> in sites.xml, when
the process could run for days), but I presumed the entire process would
simply be shut down after 60 seconds of runtime. Why is this cropping up?
Thanks,

Jonathan