I've attached the .0.rlog, .log, .d, and swift.log files. Which of these do you use for debugging? They are all located in the directory <div><br></div><div>/home/jmargoliash/my_SwiftSCE2_branch_matlab/runs/run-20120912-103235</div>
<div><br></div><div>on Fusion, if that's what you were asking for. Thanks!</div><div><br></div><div>Jonathan<br><br><div class="gmail_quote">On Wed, Sep 12, 2012 at 12:20 PM, David Kelly <span dir="ltr"><<a href="mailto:davidk@ci.uchicago.edu" target="_blank">davidk@ci.uchicago.edu</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Jonathan,<br>
<br>
Could you please provide a pointer to the log file that got created from this run?<br>
<br>
Thanks,<br>
David<br>
<div class="HOEnZb"><div class="h5"><br>
----- Original Message -----<br>
> From: "Jonathan Margoliash" <<a href="mailto:jmargolpeople@gmail.com">jmargolpeople@gmail.com</a>><br>
> To: <a href="mailto:swift-user@ci.uchicago.edu">swift-user@ci.uchicago.edu</a>, "Swift Language" <<a href="mailto:davidk@ci.uchicago.edu">davidk@ci.uchicago.edu</a>>, "Professor E. Yan" <<a href="mailto:eyan@anl.gov">eyan@anl.gov</a>><br>

> Sent: Wednesday, September 12, 2012 10:50:35 AM<br>
> Subject: Getting swift to run on Fusion<br>
> Hello swift support,<br>
><br>
><br>
> This is my first attempt at getting Swift to work on Fusion, and I'm<br>
> getting the following output on the terminal:<br>
><br>
><br>
> ------<br>
><br>
><br>
><br>
> Warning: Function toint is deprecated, at line 10<br>
> Swift trunk swift-r5882 cog-r3434<br>
><br>
><br>
> RunID: 20120912-1032-5y7xb1ug<br>
> Progress: time: Wed, 12 Sep 2012 10:32:51 -0500<br>
> Progress: time: Wed, 12 Sep 2012 10:32:54 -0500 Selecting site:34<br>
> Submitted:8<br>
> Progress: time: Wed, 12 Sep 2012 10:32:57 -0500 Selecting site:34<br>
> Submitted:8<br>
> Progress: time: Wed, 12 Sep 2012 10:33:00 -0500 Selecting site:34<br>
> Submitted:8<br>
> ...<br>
> Progress: time: Wed, 12 Sep 2012 10:40:33 -0500 Selecting site:34<br>
> Submitted:8<br>
> Failed to shut down block: Block 0912-321051-000005 (8x60.000s)<br>
> org.globus.cog.abstraction.impl.common.task.TaskSubmissionException:<br>
> Failed to cancel task. qdel returned with an exit code of 153<br>
> at<br>
> org.globus.cog.abstraction.impl.scheduler.common.AbstractExecutor.cancel(AbstractExecutor.java:205)<br>
> at<br>
> org.globus.cog.abstraction.impl.scheduler.common.AbstractJobSubmissionTaskHandler.cancel(AbstractJobSubmissionTaskHandler.java:85)<br>
> at<br>
> org.globus.cog.abstraction.impl.common.AbstractTaskHandler.cancel(AbstractTaskHandler.java:69)<br>
> at<br>
> org.globus.cog.abstraction.impl.common.task.ExecutionTaskHandler.cancel(ExecutionTaskHandler.java:102)<br>
> at<br>
> org.globus.cog.abstraction.impl.common.task.ExecutionTaskHandler.cancel(ExecutionTaskHandler.java:91)<br>
> at<br>
> org.globus.cog.abstraction.coaster.service.job.manager.BlockTaskSubmitter.cancel(BlockTaskSubmitter.java:46)<br>
> at<br>
> org.globus.cog.abstraction.coaster.service.job.manager.Block.forceShutdown(Block.java:320)<br>
> at<br>
> org.globus.cog.abstraction.coaster.service.job.manager.Node.errorReceived(Node.java:100)<br>
> at<br>
> org.globus.cog.karajan.workflow.service.commands.Command.errorReceived(Command.java:203)<br>
> at<br>
> org.globus.cog.karajan.workflow.service.channels.ChannelContext.notifyListeners(ChannelContext.java:237)<br>
> at<br>
> org.globus.cog.karajan.workflow.service.channels.ChannelContext.notifyRegisteredCommandsAndHandlers(ChannelContext.java:225)<br>
> at<br>
> org.globus.cog.karajan.workflow.service.channels.ChannelContext.channelShutDown(ChannelContext.java:318)<br>
> at<br>
> org.globus.cog.karajan.workflow.service.channels.ChannelManager.handleChannelException(ChannelManager.java:293)<br>
> at<br>
> org.globus.cog.karajan.workflow.service.channels.AbstractKarajanChannel.handleChannelException(AbstractKarajanChannel.java:552)<br>
> at<br>
> org.globus.cog.karajan.workflow.service.channels.NIOSender.run(NIOSender.java:140)<br>
> Progress: time: Wed, 12 Sep 2012 10:40:36 -0500 Selecting site:34<br>
> Submitted:8<br>
> Progress: time: Wed, 12 Sep 2012 10:40:39 -0500 Selecting site:34<br>
> Submitted:8<br>
> Progress: time: Wed, 12 Sep 2012 10:40:42 -0500 Selecting site:34<br>
> Submitted:8<br>
> ...<br>
><br>
> Progress: time: Wed, 12 Sep 2012 10:41:42 -0500 Selecting site:34<br>
> Submitted:8<br>
> Progress: time: Wed, 12 Sep 2012 10:41:45 -0500 Selecting site:34<br>
> Submitted:8<br>
> Failed to shut down block: Block 0912-321051-000006 (8x60.000s)<br>
> org.globus.cog.abstraction.impl.common.task.TaskSubmissionException:<br>
> Failed to cancel task. qdel returned with an exit code of 153<br>
> at<br>
> org.globus.cog.abstraction.impl.scheduler.common.AbstractExecutor.cancel(AbstractExecutor.java:205)<br>
> at<br>
> org.globus.cog.abstraction.impl.scheduler.common.AbstractJobSubmissionTaskHandler.cancel(AbstractJobSubmissionTaskHandler.java:85)<br>
> at<br>
> org.globus.cog.abstraction.impl.common.AbstractTaskHandler.cancel(AbstractTaskHandler.java:69)<br>
> at<br>
> org.globus.cog.abstraction.impl.common.task.ExecutionTaskHandler.cancel(ExecutionTaskHandler.java:102)<br>
> at<br>
> org.globus.cog.abstraction.impl.common.task.ExecutionTaskHandler.cancel(ExecutionTaskHandler.java:91)<br>
> at<br>
> org.globus.cog.abstraction.coaster.service.job.manager.BlockTaskSubmitter.cancel(BlockTaskSubmitter.java:46)<br>
> at<br>
> org.globus.cog.abstraction.coaster.service.job.manager.Block.forceShutdown(Block.java:320)<br>
> at<br>
> org.globus.cog.abstraction.coaster.service.job.manager.Node.errorReceived(Node.java:100)<br>
> at<br>
> org.globus.cog.karajan.workflow.service.commands.Command.errorReceived(Command.java:203)<br>
> at<br>
> org.globus.cog.karajan.workflow.service.channels.ChannelContext.notifyListeners(ChannelContext.java:237)<br>
> at<br>
> org.globus.cog.karajan.workflow.service.channels.ChannelContext.notifyRegisteredCommandsAndHandlers(ChannelContext.java:225)<br>
> at<br>
> org.globus.cog.karajan.workflow.service.channels.ChannelContext.channelShutDown(ChannelContext.java:318)<br>
> at<br>
> org.globus.cog.karajan.workflow.service.channels.ChannelManager.handleChannelException(ChannelManager.java:293)<br>
> at<br>
> org.globus.cog.karajan.workflow.service.channels.AbstractKarajanChannel.handleChannelException(AbstractKarajanChannel.java:552)<br>
> at<br>
> org.globus.cog.karajan.workflow.service.channels.NIOSender.run(NIOSender.java:140)<br>
> Progress: time: Wed, 12 Sep 2012 10:41:48 -0500 Selecting site:34<br>
> Submitted:8<br>
> Progress: time: Wed, 12 Sep 2012 10:41:51 -0500 Selecting site:34<br>
> Submitted:8<br>
> Progress: time: Wed, 12 Sep 2012 10:41:54 -0500 Selecting site:34<br>
> Submitted:8<br>
> ...<br>
><br>
><br>
> ------<br>
><br>
><br>
> I understand the long runs of unchanging "Progress: ..." reports:<br>
> the shared queue is busy, so I am not expecting my job to be<br>
> executed right away. However, I don't understand why I'm getting these<br>
> "Failed to cancel task" errors. I gave each individual app far more<br>
> than enough time to run to completion. And while I set the<br>
> time limit on the entire process to be much smaller than it needs<br>
> (<profile namespace="globus" key="maxTime">60</profile> in sites.xml,<br>
> when the process could run for days),<br>
> I presumed the entire process would just get shut down after 60<br>
> seconds of runtime. Why are these errors cropping up? Thanks,<br>
><br>
><br>
> Jonathan<br>
</div></div></blockquote></div><br></div>