[Swift-user] Getting swift to run on Fusion
David Kelly
davidk at ci.uchicago.edu
Wed Sep 12 11:20:39 CDT 2012
Jonathan,
Could you please provide a pointer to the log file that got created from this run?
Thanks,
David
----- Original Message -----
> From: "Jonathan Margoliash" <jmargolpeople at gmail.com>
> To: swift-user at ci.uchicago.edu, "Swift Language" <davidk at ci.uchicago.edu>, "Professor E. Yan" <eyan at anl.gov>
> Sent: Wednesday, September 12, 2012 10:50:35 AM
> Subject: Getting swift to run on Fusion
> Hello swift support,
>
>
> This is my first attempt getting swift to work on Fusion, and I'm
> getting the following output to the terminal:
>
>
> ------
>
>
>
> Warning: Function toint is deprecated, at line 10
> Swift trunk swift-r5882 cog-r3434
>
>
> RunID: 20120912-1032-5y7xb1ug
> Progress: time: Wed, 12 Sep 2012 10:32:51 -0500
> Progress: time: Wed, 12 Sep 2012 10:32:54 -0500 Selecting site:34 Submitted:8
> Progress: time: Wed, 12 Sep 2012 10:32:57 -0500 Selecting site:34 Submitted:8
> Progress: time: Wed, 12 Sep 2012 10:33:00 -0500 Selecting site:34 Submitted:8
> ...
> Progress: time: Wed, 12 Sep 2012 10:40:33 -0500 Selecting site:34 Submitted:8
> Failed to shut down block: Block 0912-321051-000005 (8x60.000s)
> org.globus.cog.abstraction.impl.common.task.TaskSubmissionException: Failed to cancel task. qdel returned with an exit code of 153
>     at org.globus.cog.abstraction.impl.scheduler.common.AbstractExecutor.cancel(AbstractExecutor.java:205)
>     at org.globus.cog.abstraction.impl.scheduler.common.AbstractJobSubmissionTaskHandler.cancel(AbstractJobSubmissionTaskHandler.java:85)
>     at org.globus.cog.abstraction.impl.common.AbstractTaskHandler.cancel(AbstractTaskHandler.java:69)
>     at org.globus.cog.abstraction.impl.common.task.ExecutionTaskHandler.cancel(ExecutionTaskHandler.java:102)
>     at org.globus.cog.abstraction.impl.common.task.ExecutionTaskHandler.cancel(ExecutionTaskHandler.java:91)
>     at org.globus.cog.abstraction.coaster.service.job.manager.BlockTaskSubmitter.cancel(BlockTaskSubmitter.java:46)
>     at org.globus.cog.abstraction.coaster.service.job.manager.Block.forceShutdown(Block.java:320)
>     at org.globus.cog.abstraction.coaster.service.job.manager.Node.errorReceived(Node.java:100)
>     at org.globus.cog.karajan.workflow.service.commands.Command.errorReceived(Command.java:203)
>     at org.globus.cog.karajan.workflow.service.channels.ChannelContext.notifyListeners(ChannelContext.java:237)
>     at org.globus.cog.karajan.workflow.service.channels.ChannelContext.notifyRegisteredCommandsAndHandlers(ChannelContext.java:225)
>     at org.globus.cog.karajan.workflow.service.channels.ChannelContext.channelShutDown(ChannelContext.java:318)
>     at org.globus.cog.karajan.workflow.service.channels.ChannelManager.handleChannelException(ChannelManager.java:293)
>     at org.globus.cog.karajan.workflow.service.channels.AbstractKarajanChannel.handleChannelException(AbstractKarajanChannel.java:552)
>     at org.globus.cog.karajan.workflow.service.channels.NIOSender.run(NIOSender.java:140)
> Progress: time: Wed, 12 Sep 2012 10:40:36 -0500 Selecting site:34 Submitted:8
> Progress: time: Wed, 12 Sep 2012 10:40:39 -0500 Selecting site:34 Submitted:8
> Progress: time: Wed, 12 Sep 2012 10:40:42 -0500 Selecting site:34 Submitted:8
> ...
>
> Progress: time: Wed, 12 Sep 2012 10:41:42 -0500 Selecting site:34 Submitted:8
> Progress: time: Wed, 12 Sep 2012 10:41:45 -0500 Selecting site:34 Submitted:8
> Failed to shut down block: Block 0912-321051-000006 (8x60.000s)
> org.globus.cog.abstraction.impl.common.task.TaskSubmissionException: Failed to cancel task. qdel returned with an exit code of 153
>     at org.globus.cog.abstraction.impl.scheduler.common.AbstractExecutor.cancel(AbstractExecutor.java:205)
>     at org.globus.cog.abstraction.impl.scheduler.common.AbstractJobSubmissionTaskHandler.cancel(AbstractJobSubmissionTaskHandler.java:85)
>     at org.globus.cog.abstraction.impl.common.AbstractTaskHandler.cancel(AbstractTaskHandler.java:69)
>     at org.globus.cog.abstraction.impl.common.task.ExecutionTaskHandler.cancel(ExecutionTaskHandler.java:102)
>     at org.globus.cog.abstraction.impl.common.task.ExecutionTaskHandler.cancel(ExecutionTaskHandler.java:91)
>     at org.globus.cog.abstraction.coaster.service.job.manager.BlockTaskSubmitter.cancel(BlockTaskSubmitter.java:46)
>     at org.globus.cog.abstraction.coaster.service.job.manager.Block.forceShutdown(Block.java:320)
>     at org.globus.cog.abstraction.coaster.service.job.manager.Node.errorReceived(Node.java:100)
>     at org.globus.cog.karajan.workflow.service.commands.Command.errorReceived(Command.java:203)
>     at org.globus.cog.karajan.workflow.service.channels.ChannelContext.notifyListeners(ChannelContext.java:237)
>     at org.globus.cog.karajan.workflow.service.channels.ChannelContext.notifyRegisteredCommandsAndHandlers(ChannelContext.java:225)
>     at org.globus.cog.karajan.workflow.service.channels.ChannelContext.channelShutDown(ChannelContext.java:318)
>     at org.globus.cog.karajan.workflow.service.channels.ChannelManager.handleChannelException(ChannelManager.java:293)
>     at org.globus.cog.karajan.workflow.service.channels.AbstractKarajanChannel.handleChannelException(AbstractKarajanChannel.java:552)
>     at org.globus.cog.karajan.workflow.service.channels.NIOSender.run(NIOSender.java:140)
> Progress: time: Wed, 12 Sep 2012 10:41:48 -0500 Selecting site:34 Submitted:8
> Progress: time: Wed, 12 Sep 2012 10:41:51 -0500 Selecting site:34 Submitted:8
> Progress: time: Wed, 12 Sep 2012 10:41:54 -0500 Selecting site:34 Submitted:8
> ...
>
>
> ------
>
>
> I understand the long runs of unchanging "Progress: ..." reports:
> the shared queue is busy, so I am not expecting my job to be
> executed right away. However, I don't understand why I'm getting these
> "Failed to cancel task" errors. I gave each individual app far more
> than enough time to run to completion. And while I set the
> time limit on the entire process to be much smaller than it needs
> (<profile namespace="globus" key="maxTime">60</profile> in sites.xml,
> when the process could run for days),
> I presumed the entire process would just get shut down after 60
> seconds of runtime. Why is this cropping up? Thanks,
>
>
> Jonathan
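
One way to cross-check the maxTime setting against the console output is
sketched below in Python. It reads the profile line quoted in the message
and the block label from one of the "Failed to shut down block" lines, then
compares the two. The label layout (a count, then a duration in seconds) is
an assumption inferred from the output above, not something confirmed here,
and the helper names read_max_time and parse_block are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Profile line quoted from sites.xml in the message above.
PROFILE = '<profile namespace="globus" key="maxTime">60</profile>'

# Console line copied from the output above.
LOG_LINE = "Failed to shut down block: Block 0912-321051-000005 (8x60.000s)"

# ASSUMPTION: the label layout is "Block <id> (<count>x<seconds>s)".
BLOCK_RE = re.compile(r"Block (\S+) \((\d+)x([\d.]+)s\)")


def read_max_time(profile_xml):
    """Return the maxTime value as an int from a single <profile> element."""
    elem = ET.fromstring(profile_xml)
    if elem.get("key") != "maxTime":
        raise ValueError("not a maxTime profile")
    return int(elem.text)


def parse_block(line):
    """Return (block_id, count, seconds) parsed from a block label."""
    match = BLOCK_RE.search(line)
    if match is None:
        raise ValueError("no block label found")
    return match.group(1), int(match.group(2)), float(match.group(3))


max_time = read_max_time(PROFILE)
block_id, count, seconds = parse_block(LOG_LINE)

# True here means the block's duration equals the configured maxTime value.
print(block_id, count, seconds, seconds == max_time)
```

If the assumption about the label holds, the comparison suggests the 60 in
sites.xml is what sizes each block, which would be worth confirming against
the log file requested above.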