Some updates on this:<br><br>I tried the same workflow with the same settings on 0.92.1 and got the "variable already in cache" message which, according to the list archive, is caused by remapping a variable that is already actively mapped. That error went away when I used concurrent_mapper instead of single_file_mapper.<br>
<br>- //offset_file file <single_file_mapper; file=@strcat("LGU/offset-",_size)>;<br>+ offset_file file <concurrent_mapper; location="LGU", prefix="offset-", suffix=_size>;<br><br>
With the above change, I ran the same workflow with Swift trunk, and the Java FileNotFoundException seems to be gone. It appears to have been a manifestation of the same "variable already in cache" bug seen in 0.92.<br><br>While that issue no longer appears, the workflow still halts abruptly with a java.lang.NullPointerException. The complete stack trace is as follows:<br>
<br>Progress: time: Tue, 12 Jul 2011 17:30:10 -0500 Selecting site:402 Stage in:12 Active:5 Stage out:1 Finished successfully:19<br>Execution failed:<br> java.lang.NullPointerException<br> at org.griphyn.vdl.mapping.AbstractDataNode.getValue(AbstractDataNode.java:333)<br>
at org.griphyn.vdl.karajan.lib.SetFieldValue.log(SetFieldValue.java:71)<br> at org.griphyn.vdl.karajan.lib.SetFieldValue.function(SetFieldValue.java:38)<br> at org.griphyn.vdl.karajan.lib.VDLFunction.post(VDLFunction.java:62)<br>
at org.globus.cog.karajan.workflow.nodes.AbstractSequentialWithArguments.completed(AbstractSequentialWithArguments.java:194)<br> at org.globus.cog.karajan.workflow.nodes.FlowNode.complete(FlowNode.java:214)<br> at org.globus.cog.karajan.workflow.nodes.FlowContainer.post(FlowContainer.java:58)<br>
at org.globus.cog.karajan.workflow.nodes.functions.Argument.post(Argument.java:48)<br> at org.globus.cog.karajan.workflow.nodes.AbstractSequentialWithArguments.completed(AbstractSequentialWithArguments.java:194)<br>
at org.globus.cog.karajan.workflow.nodes.FlowNode.complete(FlowNode.java:214)<br> at org.globus.cog.karajan.workflow.nodes.FlowContainer.post(FlowContainer.java:58)<br> at org.griphyn.vdl.karajan.lib.VDLFunction.post(VDLFunction.java:66)<br>
at org.globus.cog.karajan.workflow.nodes.AbstractSequentialWithArguments.completed(AbstractSequentialWithArguments.java:194)<br> at org.globus.cog.karajan.workflow.nodes.FlowNode.complete(FlowNode.java:214)<br> at org.globus.cog.karajan.workflow.nodes.FlowContainer.post(FlowContainer.java:58)<br>
at org.globus.cog.karajan.workflow.nodes.functions.Argument.post(Argument.java:48)<br> at org.globus.cog.karajan.workflow.nodes.AbstractSequentialWithArguments.completed(AbstractSequentialWithArguments.java:194)<br>
at org.globus.cog.karajan.workflow.nodes.FlowNode.complete(FlowNode.java:214)<br> at org.globus.cog.karajan.workflow.nodes.FlowContainer.post(FlowContainer.java:58)<br> at org.globus.cog.karajan.workflow.nodes.functions.AbstractFunction.post(AbstractFunction.java:28)<br>
at org.globus.cog.karajan.workflow.nodes.Sequential.startNext(Sequential.java:29)<br> at org.globus.cog.karajan.workflow.nodes.Sequential.executeChildren(Sequential.java:20)<br> at org.globus.cog.karajan.workflow.nodes.FlowContainer.execute(FlowContainer.java:63)<br>
at org.globus.cog.karajan.workflow.nodes.FlowNode.restart(FlowNode.java:139)<br> at org.globus.cog.karajan.workflow.nodes.FlowNode.start(FlowNode.java:197)<br> at org.globus.cog.karajan.workflow.events.EventBus.start(EventBus.java:104)<br>
at org.globus.cog.karajan.workflow.events.EventTargetPair.run(EventTargetPair.java:40)<br> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)<br> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)<br>
at java.util.concurrent.FutureTask.run(FutureTask.java:138)<br> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)<br> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)<br>
at java.lang.Thread.run(Thread.java:619)<br><br><br>I see the same exception in the log, but I could not find any indication of what is causing it.<br><br>The log for this run can be found here: <a href="http://www.ci.uchicago.edu/~ketan/files/postproc-20110712-1723-3uhgk3i6.log">http://www.ci.uchicago.edu/~ketan/files/postproc-20110712-1723-3uhgk3i6.log</a><br>
<br>--<br>Ketan<br><br><div class="gmail_quote">On Tue, Jul 12, 2011 at 2:16 PM, Ketan Maheshwari <span dir="ltr"><<a href="mailto:ketancmaheshwari@gmail.com">ketancmaheshwari@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
Mihael,<br><br>I investigated the issue further, and from the logs it seems that Swift is trying to execute the mkoffset app before creating a jobs/ directory in the workdir. Could this be an ordering issue? For instance, I see the following line:<br>
<br>2011-07-12 13:50:49,559-0500 DEBUG vdl:execute2 JOB_START jobid=mkoffset-mex8fvck tr=mkoffset arguments=[200.0, 60.0] tmpdir=postproc-20110712-1343-eczky6ob/jobs/m/mkoffset-mex8fvck host=localhost<br><br>but I do not see a createdir entry corresponding to it.<br>
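<br>A minimal sketch of how this check could be scripted over the whole log, rather than by eye. It parses JOB_START entries in the format shown above and reports jobs for which no earlier line mentions both a directory-creation marker and the job id. The function name, the "createdir" marker string, and the idea that such a line would carry the job id are all assumptions for illustration; the real log format may differ.<br>

```python
import re

# Matches JOB_START entries in the format of the log line quoted above.
JOB_START = re.compile(
    r"JOB_START jobid=(?P<jobid>\S+) tr=(?P<tr>\S+) .*?tmpdir=(?P<tmpdir>\S+)"
)


def jobs_without_createdir(lines, marker="createdir"):
    """Return (jobid, tmpdir) for JOB_START entries with no earlier marker line.

    ASSUMPTION: directory creation is logged on a line that contains
    `marker` (case-insensitive) together with the job id. Adjust both
    to whatever the actual log uses.
    """
    marker_lines = []  # earlier lines that mention the marker
    missing = []
    for line in lines:
        match = JOB_START.search(line)
        if match:
            jobid = match.group("jobid")
            if not any(jobid in prev for prev in marker_lines):
                missing.append((jobid, match.group("tmpdir")))
        elif marker.lower() in line.lower():
            marker_lines.append(line)
    return missing
```

It could be run as `jobs_without_createdir(open(path))`, where `path` is the log file of the run. An empty result would suggest the ordering is fine, at least under the assumptions above.<br>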
<br>I have run this workflow successfully with 0.92.1, so I am fairly sure it behaves correctly as far as order of execution is concerned.<br><br>Thanks for any further insight into this.<br><br>Regards,<br>Ketan<div><div>
</div><div class="h5"><br><br>
<div class="gmail_quote">---------- Forwarded message ----------<br>From: <b class="gmail_sendername">Ketan Maheshwari</b> <span dir="ltr"><<a href="mailto:ketancmaheshwari@gmail.com" target="_blank">ketancmaheshwari@gmail.com</a>></span><br>
Date: Mon, Jul 11, 2011 at 10:31 PM<br>Subject: trunk FileNotFoundException<br>To: <a href="mailto:swift-user@ci.uchicago.edu" target="_blank">swift-user@ci.uchicago.edu</a><br><br><br>Hello,<br><br>Using Swift trunk, I am running the SCEC workflow from Communicado using ranger, localhost and OSG resources.<br>
<br>One particular app, 'mkoffset', which is meant to run on localhost, is failing with a FileNotFoundException. <br clear="all">
<br>The log does give information on its mapping and when it gets 'cleared'. <br><br>The config, tc, sites and log files for the run can be found here: <a href="http://www.mcs.anl.gov/%7Eketan/files/bundle.tgz" target="_blank">http://www.mcs.anl.gov/~ketan/files/bundle.tgz</a> (log is 90M, upload size exceeded!)<br>
<br>The error stack that I am getting on stdout is:<br><br>Progress: time: Mon, 11 Jul 2011 22:16:38 -0500 Selecting site:390 Stage in:16 Active:9 Checking status:1 Finished successfully:36 Failed but can retry:3<br>
org.globus.cog.abstraction.impl.file.FileNotFoundException: File not found: /var/tmp/postproc-20110711-2209-bx2qm0nb/jobs/e/mkoffset-ea7xcuck/stderr.txt<br> at org.globus.cog.abstraction.impl.file.local.FileResourceImpl.getFile(FileResourceImpl.java:225)<br>
at org.globus.cog.abstraction.impl.file.local.FileResourceImpl.putFile(FileResourceImpl.java:268)<br> at org.globus.cog.abstraction.impl.file.AbstractFileResource.putFile(AbstractFileResource.java:158)<br> at org.globus.cog.abstraction.impl.fileTransfer.DelegatedFileTransferHandler.doDestination(DelegatedFileTransferHandler.java:314)<br>
at org.globus.cog.abstraction.impl.fileTransfer.CachingDelegatedFileTransferHandler.doDestination(CachingDelegatedFileTransferHandler.java:46)<br> at org.globus.cog.abstraction.impl.fileTransfer.DelegatedFileTransferHandler.run(DelegatedFileTransferHandler.java:487)<br>
at java.lang.Thread.run(Thread.java:619)<br>org.globus.cog.abstraction.impl.file.FileNotFoundException: File not found: /var/tmp/postproc-20110711-2209-bx2qm0nb/jobs/e/mkoffset-ea7xcuck/LGU/offset-128<br> at org.globus.cog.abstraction.impl.file.local.FileResourceImpl.getFile(FileResourceImpl.java:225)<br>
at org.globus.cog.abstraction.impl.file.local.FileResourceImpl.putFile(FileResourceImpl.java:268)<br> at org.globus.cog.abstraction.impl.file.AbstractFileResource.putFile(AbstractFileResource.java:158)<br> at org.globus.cog.abstraction.impl.fileTransfer.DelegatedFileTransferHandler.doDestination(DelegatedFileTransferHandler.java:314)<br>
at org.globus.cog.abstraction.impl.fileTransfer.CachingDelegatedFileTransferHandler.doDestination(CachingDelegatedFileTransferHandler.java:46)<br> at org.globus.cog.abstraction.impl.fileTransfer.DelegatedFileTransferHandler.run(DelegatedFileTransferHandler.java:487)<br>
at java.lang.Thread.run(Thread.java:619)<br>Progress: time: Mon, 11 Jul 2011 22:16:39 -0500 Selecting site:389 Stage in:16 Active:9 Checking status:1 Finished successfully:38 Failed but can retry:4<br>Execution failed:<br>
java.lang.NullPointerException<br> at org.griphyn.vdl.mapping.AbstractDataNode.getValue(AbstractDataNode.java:333)<br> at org.griphyn.vdl.karajan.lib.SetFieldValue.log(SetFieldValue.java:71)<br> at org.griphyn.vdl.karajan.lib.SetFieldValue.function(SetFieldValue.java:38)<br>
at org.griphyn.vdl.karajan.lib.VDLFunction.post(VDLFunction.java:62)<br> at org.globus.cog.karajan.workflow.nodes.AbstractSequentialWithArguments.completed(AbstractSequentialWithArguments.java:194)<br> at org.globus.cog.karajan.workflow.nodes.FlowNode.complete(FlowNode.java:214)<br>
at org.globus.cog.karajan.workflow.nodes.FlowContainer.post(FlowContainer.java:58)<br> at org.globus.cog.karajan.workflow.nodes.functions.Argument.post(Argument.java:48)<br> at org.globus.cog.karajan.workflow.nodes.AbstractSequentialWithArguments.completed(AbstractSequentialWithArguments.java:194)<br>
at org.globus.cog.karajan.workflow.nodes.FlowNode.complete(FlowNode.java:214)<br> at org.globus.cog.karajan.workflow.nodes.FlowContainer.post(FlowContainer.java:58)<br> at org.griphyn.vdl.karajan.lib.VDLFunction.post(VDLFunction.java:66)<br>
at org.globus.cog.karajan.workflow.nodes.AbstractSequentialWithArguments.completed(AbstractSequentialWithArguments.java:194)<br> at org.globus.cog.karajan.workflow.nodes.FlowNode.complete(FlowNode.java:214)<br> at org.globus.cog.karajan.workflow.nodes.FlowContainer.post(FlowContainer.java:58)<br>
<br><br>Any clues?<br><br>Thanks,<br>-- <br><font color="#888888">Ketan<br><br><br>
</font></div><br><br clear="all"><br></div></div>-- <br><font color="#888888">Ketan<br><br><br>
</font></blockquote></div><br><br clear="all"><br>-- <br>Ketan<br><br><br>