[Swift-devel] trunk FileNotFoundException

Ketan Maheshwari ketancmaheshwari at gmail.com
Tue Jul 12 17:47:56 CDT 2011


Some updates on this:

I tried the same workflow with the same settings on 0.92.1 and got the
"variable already in cache" message which, according to the archive, is
caused by remapping an actively mapped variable. That error went away when I
used concurrent_mapper instead of single_file_mapper.

-  //offset_file file <single_file_mapper; file=@strcat("LGU/offset-",_size)>;
+ offset_file file <concurrent_mapper; location="LGU", prefix="offset-", suffix=_size>;

With the above mod, I ran the same workflow with Swift trunk and the Java
FileNotFoundException seems to be gone. It appears to have been a
manifestation of the same "variable already in cache" bug seen in 0.92.
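
As a rough illustration of why the mapper change might matter, here is a toy
Python model. It is not Swift's actual implementation. It assumes that a path
can only be registered once, and that the concurrent mapper puts a unique
sequence into each generated file name. The FileCache class, the helper names
and the exact file name layout are made up for this sketch.

```python
import itertools


class FileCache:
    """Toy stand-in for a cache that refuses to register one path twice."""

    def __init__(self):
        self._paths = set()

    def add(self, path):
        if path in self._paths:
            raise ValueError("variable already in cache: " + path)
        self._paths.add(path)
        return path


def single_file_path(size):
    # Mirrors file=@strcat("LGU/offset-", _size): the path depends only on size.
    return "LGU/offset-{}".format(size)


# Hypothetical per-run counter used to keep generated names distinct.
_sequence = itertools.count()


def concurrent_path(size, location="LGU", prefix="offset-"):
    # Assumed naming scheme: location/prefix + sequence number + suffix.
    return "{}/{}{:04d}-{}".format(location, prefix, next(_sequence), size)


def map_all(path_for, sizes):
    """Register one path per size; return (registered paths, error or None)."""
    cache = FileCache()
    paths = []
    try:
        for size in sizes:
            paths.append(cache.add(path_for(size)))
    except ValueError as err:
        return paths, str(err)
    return paths, None
```

In this model, map_all(single_file_path, [128, 128]) stops at the second
variable because both map to LGU/offset-128, while
map_all(concurrent_path, [128, 128]) registers two distinct paths. Whether two
live variables really share a _size in this workflow is a guess on my part.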

While the above issue no longer appears, the workflow still halts abruptly
with a java.lang.NullPointerException. The complete stack trace is as
follows:

Progress:  time: Tue, 12 Jul 2011 17:30:10 -0500  Selecting site:402  Stage in:12  Active:5  Stage out:1  Finished successfully:19
Execution failed:
    java.lang.NullPointerException
    at org.griphyn.vdl.mapping.AbstractDataNode.getValue(AbstractDataNode.java:333)
    at org.griphyn.vdl.karajan.lib.SetFieldValue.log(SetFieldValue.java:71)
    at org.griphyn.vdl.karajan.lib.SetFieldValue.function(SetFieldValue.java:38)
    at org.griphyn.vdl.karajan.lib.VDLFunction.post(VDLFunction.java:62)
    at org.globus.cog.karajan.workflow.nodes.AbstractSequentialWithArguments.completed(AbstractSequentialWithArguments.java:194)
    at org.globus.cog.karajan.workflow.nodes.FlowNode.complete(FlowNode.java:214)
    at org.globus.cog.karajan.workflow.nodes.FlowContainer.post(FlowContainer.java:58)
    at org.globus.cog.karajan.workflow.nodes.functions.Argument.post(Argument.java:48)
    at org.globus.cog.karajan.workflow.nodes.AbstractSequentialWithArguments.completed(AbstractSequentialWithArguments.java:194)
    at org.globus.cog.karajan.workflow.nodes.FlowNode.complete(FlowNode.java:214)
    at org.globus.cog.karajan.workflow.nodes.FlowContainer.post(FlowContainer.java:58)
    at org.griphyn.vdl.karajan.lib.VDLFunction.post(VDLFunction.java:66)
    at org.globus.cog.karajan.workflow.nodes.AbstractSequentialWithArguments.completed(AbstractSequentialWithArguments.java:194)
    at org.globus.cog.karajan.workflow.nodes.FlowNode.complete(FlowNode.java:214)
    at org.globus.cog.karajan.workflow.nodes.FlowContainer.post(FlowContainer.java:58)
    at org.globus.cog.karajan.workflow.nodes.functions.Argument.post(Argument.java:48)
    at org.globus.cog.karajan.workflow.nodes.AbstractSequentialWithArguments.completed(AbstractSequentialWithArguments.java:194)
    at org.globus.cog.karajan.workflow.nodes.FlowNode.complete(FlowNode.java:214)
    at org.globus.cog.karajan.workflow.nodes.FlowContainer.post(FlowContainer.java:58)
    at org.globus.cog.karajan.workflow.nodes.functions.AbstractFunction.post(AbstractFunction.java:28)
    at org.globus.cog.karajan.workflow.nodes.Sequential.startNext(Sequential.java:29)
    at org.globus.cog.karajan.workflow.nodes.Sequential.executeChildren(Sequential.java:20)
    at org.globus.cog.karajan.workflow.nodes.FlowContainer.execute(FlowContainer.java:63)
    at org.globus.cog.karajan.workflow.nodes.FlowNode.restart(FlowNode.java:139)
    at org.globus.cog.karajan.workflow.nodes.FlowNode.start(FlowNode.java:197)
    at org.globus.cog.karajan.workflow.events.EventBus.start(EventBus.java:104)
    at org.globus.cog.karajan.workflow.events.EventTargetPair.run(EventTargetPair.java:40)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)


I see the same exception in the log, but I could not find any indication of
what is causing it.

The log for this run can be found here:
http://www.ci.uchicago.edu/~ketan/files/postproc-20110712-1723-3uhgk3i6.log
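
For anyone who wants to look at the log lines leading up to the exception, a
small Python helper along these lines might be a starting point. It is only a
sketch: the function names are made up here, and it assumes the log is plain
text in which the exception name appears literally.

```python
from collections import deque


def context_before(lines, needle, before=5):
    """Return up to `before` lines preceding the first line that contains
    `needle`, followed by that line. Returns [] if `needle` never appears."""
    window = deque(maxlen=before)
    for line in lines:
        if needle in line:
            return list(window) + [line]
        window.append(line)
    return []


def scan_log(path, needle="NullPointerException", before=5):
    """Apply context_before to a log file on disk, reading it line by line."""
    with open(path, errors="replace") as handle:
        stripped = (line.rstrip("\n") for line in handle)
        return context_before(stripped, needle, before)
```

For example, scan_log("postproc-20110712-1723-3uhgk3i6.log", before=20) would
return the 20 lines before the first NullPointerException, which may show
which variable was being set at the time.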

--
Ketan

On Tue, Jul 12, 2011 at 2:16 PM, Ketan Maheshwari <
ketancmaheshwari at gmail.com> wrote:

> Mihael,
>
> I investigated further, and from the logs it seems that Swift is trying
> to execute the mkoffset app before creating a jobs/ directory in workdir.
> Could this be an ordering issue? For instance, I see the following line:
>
> 2011-07-12 13:50:49,559-0500 DEBUG vdl:execute2 JOB_START jobid=mkoffset-mex8fvck tr=mkoffset arguments=[200.0, 60.0] tmpdir=postproc-20110712-1343-eczky6ob/jobs/m/mkoffset-mex8fvck host=localhost
>
> but I do not see a corresponding createdir entry.
>
> I have run this workflow successfully with 0.92.1, so I am fairly sure
> that it works correctly as far as order of execution is concerned.
>
> Thanks for any more insights into this.
>
> Regards,
> Ketan
>
>
> ---------- Forwarded message ----------
> From: Ketan Maheshwari <ketancmaheshwari at gmail.com>
> Date: Mon, Jul 11, 2011 at 10:31 PM
> Subject: trunk FileNotFoundException
> To: swift-user at ci.uchicago.edu
>
>
> Hello,
>
> Using Swift trunk, I am running the SCEC workflow from Communicado using
> Ranger, localhost and OSG resources.
>
> One particular app, 'mkoffset', which is meant to run on localhost, is
> failing with FileNotFoundException.
>
> The log does give information on its mapping and when it gets 'cleared'.
>
> The config, tc, sites and log files for the run can be found here:
> http://www.mcs.anl.gov/~ketan/files/bundle.tgz (log is 90M, upload size
> exceeded!)
>
> The error stack that I am getting on stdout is:
>
> Progress:  time: Mon, 11 Jul 2011 22:16:38 -0500  Selecting site:390  Stage in:16  Active:9  Checking status:1  Finished successfully:36 Failed but can retry:3
> org.globus.cog.abstraction.impl.file.FileNotFoundException: File not found: /var/tmp/postproc-20110711-2209-bx2qm0nb/jobs/e/mkoffset-ea7xcuck/stderr.txt
>     at org.globus.cog.abstraction.impl.file.local.FileResourceImpl.getFile(FileResourceImpl.java:225)
>     at org.globus.cog.abstraction.impl.file.local.FileResourceImpl.putFile(FileResourceImpl.java:268)
>     at org.globus.cog.abstraction.impl.file.AbstractFileResource.putFile(AbstractFileResource.java:158)
>     at org.globus.cog.abstraction.impl.fileTransfer.DelegatedFileTransferHandler.doDestination(DelegatedFileTransferHandler.java:314)
>     at org.globus.cog.abstraction.impl.fileTransfer.CachingDelegatedFileTransferHandler.doDestination(CachingDelegatedFileTransferHandler.java:46)
>     at org.globus.cog.abstraction.impl.fileTransfer.DelegatedFileTransferHandler.run(DelegatedFileTransferHandler.java:487)
>     at java.lang.Thread.run(Thread.java:619)
> org.globus.cog.abstraction.impl.file.FileNotFoundException: File not found: /var/tmp/postproc-20110711-2209-bx2qm0nb/jobs/e/mkoffset-ea7xcuck/LGU/offset-128
>     at org.globus.cog.abstraction.impl.file.local.FileResourceImpl.getFile(FileResourceImpl.java:225)
>     at org.globus.cog.abstraction.impl.file.local.FileResourceImpl.putFile(FileResourceImpl.java:268)
>     at org.globus.cog.abstraction.impl.file.AbstractFileResource.putFile(AbstractFileResource.java:158)
>     at org.globus.cog.abstraction.impl.fileTransfer.DelegatedFileTransferHandler.doDestination(DelegatedFileTransferHandler.java:314)
>     at org.globus.cog.abstraction.impl.fileTransfer.CachingDelegatedFileTransferHandler.doDestination(CachingDelegatedFileTransferHandler.java:46)
>     at org.globus.cog.abstraction.impl.fileTransfer.DelegatedFileTransferHandler.run(DelegatedFileTransferHandler.java:487)
>     at java.lang.Thread.run(Thread.java:619)
> Progress:  time: Mon, 11 Jul 2011 22:16:39 -0500  Selecting site:389  Stage in:16  Active:9  Checking status:1  Finished successfully:38 Failed but can retry:4
> Execution failed:
>     java.lang.NullPointerException
>     at org.griphyn.vdl.mapping.AbstractDataNode.getValue(AbstractDataNode.java:333)
>     at org.griphyn.vdl.karajan.lib.SetFieldValue.log(SetFieldValue.java:71)
>     at org.griphyn.vdl.karajan.lib.SetFieldValue.function(SetFieldValue.java:38)
>     at org.griphyn.vdl.karajan.lib.VDLFunction.post(VDLFunction.java:62)
>     at org.globus.cog.karajan.workflow.nodes.AbstractSequentialWithArguments.completed(AbstractSequentialWithArguments.java:194)
>     at org.globus.cog.karajan.workflow.nodes.FlowNode.complete(FlowNode.java:214)
>     at org.globus.cog.karajan.workflow.nodes.FlowContainer.post(FlowContainer.java:58)
>     at org.globus.cog.karajan.workflow.nodes.functions.Argument.post(Argument.java:48)
>     at org.globus.cog.karajan.workflow.nodes.AbstractSequentialWithArguments.completed(AbstractSequentialWithArguments.java:194)
>     at org.globus.cog.karajan.workflow.nodes.FlowNode.complete(FlowNode.java:214)
>     at org.globus.cog.karajan.workflow.nodes.FlowContainer.post(FlowContainer.java:58)
>     at org.griphyn.vdl.karajan.lib.VDLFunction.post(VDLFunction.java:66)
>     at org.globus.cog.karajan.workflow.nodes.AbstractSequentialWithArguments.completed(AbstractSequentialWithArguments.java:194)
>     at org.globus.cog.karajan.workflow.nodes.FlowNode.complete(FlowNode.java:214)
>     at org.globus.cog.karajan.workflow.nodes.FlowContainer.post(FlowContainer.java:58)
>
>
> Any clues?
>
> Thanks,
> --
> Ketan

-- 
Ketan