[Swift-devel] cache already contains error

Ketan Maheshwari ketancmaheshwari at gmail.com
Mon Apr 1 21:02:24 CDT 2013


Thanks Mike, that fixed the cache issue. However, I am now seeing some
unusual behavior from my Swift run:

The ampl run crashes after completing a fixed number of jobs (35 to be
precise).

Some diagnostics:

-- It eventually runs to completion when I do Swift resumes: on the first
resume, once again only the next 35 jobs complete successfully; on the next
resume, the rest of them complete.

-- The same ampl run works outside of Swift, in a bash for-loop using the
same parameters as in the Swift script.

-- A catsn script with similar parameters runs to completion without any
failures. So, nothing seems to be wrong with the OS parameters.

I am using a single MCS workstation, no provider staging, no coasters.

The error message is:

Caused by: File not found:
/nfs2/ketan/powergridapps/swiftscripts/swift.work/inference-20130401-2043-78i7o5m3/shared/outdir/out_l0000_0000.0010.out

This is reflected in the logs as well as in the workdir's info files.

Has anyone seen this kind of behavior? Any remedial suggestions?

Thanks,
Ketan


On Mon, Apr 1, 2013 at 5:59 PM, Michael Wilde <wilde at mcs.anl.gov> wrote:

> I think you need to make the "out" array 2-dimensional.
>
> Your script is going to evaluate "out[j] = cat(data)" for both i=0 and i=1.
>
> The second of those evaluations is probably encountering the "cache
> already contains" for j=0.
>
> If it didn't hit that (i.e., if you used the concurrent mapper), you'd likely
> then get an error that out[0] is already set.
>
> - Mike
>
>
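
For reference, a minimal sketch of the 2-dimensional variant suggested
above. It assumes simple_mapper accepts a nested array declaration and
derives a distinct file name from each (i, j) index pair; the exact
generated file names depend on the mapper and are not shown here.

```
type file;

// Same app as in the original script: copy the input file to the output.
app (file o) cat (file i) {
  cat @i stdout=@o;
}

// Assumption: simple_mapper can map a nested array, producing one
// distinct file per (i, j) index pair under outdir.
file out[][] <simple_mapper; location="outdir", prefix="f.", suffix=".out">;

foreach i in [0:1] {
  foreach j in [0:1] {
    file data<"data.txt">;
    // Each (i, j) pair writes its own element, so nothing is assigned twice.
    out[i][j] = cat(data);
  }
}
```

With out[i][j], each of the four iterations assigns a different array
element, whereas out[j] is assigned once for i=0 and again for i=1.
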
> ----- Original Message -----
> > From: "Ketan Maheshwari" <ketancmaheshwari at gmail.com>
> > To: "Swift Devel" <swift-devel at ci.uchicago.edu>
> > Sent: Monday, April 1, 2013 5:49:36 PM
> > Subject: [Swift-devel] cache already contains error
> >
> >
> >
> >
> > Hi,
> >
> > I am running into the "cache already contains" error when using a
> > nested loop with file mappers. Here is a simple reproduction of the
> > issue with a nested loop variant of catsn.swift:
> >
> >
> >
> > type file;
> > app (file o) cat (file i) {
> >   cat @i stdout=@o;
> > }
> >
> >
> > #file out[];
> > #file out[]<concurrent_mapper; location="outdir",
> > prefix="f.",suffix=".out">;
> > file out[]<simple_mapper; location="outdir",
> > prefix="f.",suffix=".out">;
> >
> >
> > foreach i in [0:1] {
> >   foreach j in [0:1] {
> >     file data<"data.txt">;
> >     out[j] = cat(data);
> >   }
> > }
> >
> >
> > It runs into the cache error after completing a few tasks successfully:
> >
> > $ swift catsn.swift
> > Swift trunk swift-r6410 cog-r3648
> >
> >
> > RunID: 20130401-1745-7khkyrqc
> > Progress: time: Mon, 01 Apr 2013 17:45:59 -0500
> > Progress: time: Mon, 01 Apr 2013 17:46:00 -0500 Selecting site:1
> > Active:1 Finished successfully:2
> > Execution failed:
> > Exception in cat:
> > Arguments: [data.txt]
> > Host: localhost
> > Directory: catsn-20130401-1745-7khkyrqc/jobs/y/cat-yzf9fg7l
> > Caused by:
> > The cache already contains
> > localhost:catsn-20130401-1745-7khkyrqc/shared/outdir/f.0000.out.
> > cat, catsn.swift, line 14
> >
> >
> > The cause, I think, is that the nested loop triggers the same series
> > of random sequences in the mappers' code, which collide. Both the simple
> > and the concurrent mappers fail with the same message.
> >
> >
> > Does anyone know of a workaround?
> >
> >
> > Thanks,
> > --
> > Ketan
> >
> >
> > _______________________________________________
> > Swift-devel mailing list
> > Swift-devel at ci.uchicago.edu
> > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel
> >
>



-- 
Ketan