[Swift-devel] duplicated job submission in swift-0.92?

Michael Wilde wilde at mcs.anl.gov
Wed Mar 30 23:32:08 CDT 2011


It turns out that foreach loops in 0.92 based on either an array constant like [0:9] or on an array returned by readData() work fine, which explains why I didn't see the problem in my large modftdock tests. My outer loop was based on readData and my inner loop was based on an array constant.

It would be interesting to learn (and fix) how this eluded the language tests.
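
A rough sketch of the kind of check a language test could apply to this case (Python, not part of the existing test suite; the function name repeated_iterations and its tag parameter are made up for illustration). It assumes trace lines of the form "SwiftScript trace: for, 1", as in the runs quoted below, and reports any loop value that was traced more than once:

```python
import re
from collections import Counter


def repeated_iterations(output, tag="for"):
    """Return {value: count} for loop values traced more than once.

    Assumes each iteration emits a line ending in
    "SwiftScript trace: <tag>, <value>", as in the quoted runs.
    Any prefix (mail quoting, "[Misc] WARN ... - ") is tolerated.
    """
    pattern = re.compile(r"SwiftScript trace: " + re.escape(tag) + r", (\S+)\s*$")
    counts = Counter()
    for line in output.splitlines():
        match = pattern.search(line)
        if match:
            counts[match.group(1)] += 1
    # Keep only the values that showed up more than once.
    return {value: n for value, n in counts.items() if n > 1}
```

Fed the zz3.swift output quoted below, this returns {"2": 2, "1": 2}; fed the r3526 output it returns an empty dict. It only catches repeated values, so a hang or a missing iteration would need a separate check against the expected set of values.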

- Mike


----- Original Message -----
> Yeah, I'm done for now. Except that I'm a little baffled as to why this
> is turning up now, as I thought I was running much more complex
> scripts (generating hundreds of thousands of files) with no sign of
> behavior like this, on what I think is the same revision. And what I'm
> running is (I *think*) from before the merge you are talking about.
> Unless I'm misunderstanding what you discovered in svn.
> 
> - Mike
> 
> 
> ----- Original Message -----
> > I think at this point we should stop testing 0.92 until we figure out
> > the reason for the merge.
> >
> > Trunk contained a pretty dramatic change to the karajan engine and I
> > would expect badness like that on a merge back to a stable branch. The
> > previous behaviour (double iterations) alone is a sign of badness, and
> > so this new thing doesn't surprise me.
> >
> > Mihael
> >
> > On Wed, 2011-03-30 at 21:54 -0500, Michael Wilde wrote:
> > > There seems to be something non-deterministic about this script:
> > >
> > > com$ cat zz5.swift
> > > int arr[];
> > > int brr[];
> > >
> > > arr[0]=1;
> > > arr[1]=2;
> > >
> > > brr = [1:2];
> > >
> > > trace("arr",arr);
> > > trace("brr",brr);
> > >
> > > foreach a in arr {
> > >   trace("for", a);
> > > }
> > >
> > > com$
> > >
> > > (By the way, I'm seeing the same error on communicado)
> > >
> > > The script above sometimes prints 2, 3, or 4 instances of the
> > > trace() inside the foreach. And sometimes it hangs on one of the two
> > > trace statements outside the loop. In most cases, it prints all 6
> > > traces, as in the original failing case.
> > >
> > > com$ swift zz5.swift
> > > Swift svn swift-r4087 (swift modified locally) cog-r3051
> > >
> > > RunID: 20110330-2148-qf5anxr6
> > > Progress: time:2
> > > SwiftScript trace: arr, arr.$[]/2
> > > SwiftScript trace: brr, brr.$[]/2
> > > SwiftScript trace: for, 1
> > > SwiftScript trace: for, 2
> > > Final status: time:12
> > > Time: 1.163, rate: 14087 j/s
> > > com$ swift zz5.swift
> > > Swift svn swift-r4087 (swift modified locally) cog-r3051
> > >
> > > RunID: 20110330-2148-kouc9zq3
> > > Progress: time:3
> > > SwiftScript trace: arr, arr.$[]/2
> > > SwiftScript trace: brr, brr.$[]/2
> > > SwiftScript trace: for, 2
> > > SwiftScript trace: for, 2
> > > SwiftScript trace: for, 1
> > > SwiftScript trace: for, 1
> > > Final status: time:16
> > > Time: 1.214, rate: 13495 j/s
> > > com$ swift zz5.swift
> > > Swift svn swift-r4087 (swift modified locally) cog-r3051
> > >
> > > RunID: 20110330-2148-lksn2a17
> > > Progress: time:2
> > > SwiftScript trace: arr, arr.$[]/2
> > > SwiftScript trace: brr, brr.$[]/2
> > > SwiftScript trace: for, 1
> > > SwiftScript trace: for, 2
> > > SwiftScript trace: for, 2
> > > SwiftScript trace: for, 1
> > > Final status: time:17
> > > Time: 1.227, rate: 13352 j/s
> > > com$ swift zz5.swift
> > > Swift svn swift-r4087 (swift modified locally) cog-r3051
> > >
> > > RunID: 20110330-2148-tl2xtxx6
> > > Progress: time:1
> > > SwiftScript trace: arr, arr.$[]/2
> > > SwiftScript trace: for, 1
> > > SwiftScript trace: for, 1
> > > SwiftScript trace: brr, brr.$[]/2
> > > SwiftScript trace: for, 2
> > > SwiftScript trace: for, 2
> > > Final status: time:14
> > > Time: 1.224, rate: 13385 j/s
> > > com$ swift zz5.swift
> > > Swift svn swift-r4087 (swift modified locally) cog-r3051
> > >
> > > RunID: 20110330-2148-mk5aypbg
> > > Progress: time:6
> > > SwiftScript trace: arr, arr.$[]/2
> > > SwiftScript trace: brr, brr.$[]/2
> > > SwiftScript trace: for, 2
> > > SwiftScript trace: for, 1
> > > SwiftScript trace: for, 1
> > > SwiftScript trace: for, 2
> > > Final status: time:17
> > > Time: 1.191, rate: 13756 j/s
> > > com$ swift zz5.swift
> > > Swift svn swift-r4087 (swift modified locally) cog-r3051
> > >
> > > RunID: 20110330-2148-hgcbaxga
> > > Progress: time:2
> > > SwiftScript trace: arr, arr.$[]/2
> > > SwiftScript trace: for, 1
> > > SwiftScript trace: for, 2
> > > SwiftScript trace: for, 2
> > > SwiftScript trace: for, 1
> > > com$ swift zz5.swift
> > > Swift svn swift-r4087 (swift modified locally) cog-r3051
> > >
> > > RunID: 20110330-2149-oaa0kuy8
> > > Progress:SwiftScript trace: arr, arr.$[]/2
> > >   time:9
> > > SwiftScript trace: for, 2
> > > SwiftScript trace: brr, brr.$[]/2
> > > SwiftScript trace: for, 2
> > > SwiftScript trace: for, 1
> > > Final status: time:17
> > > Time: 1.241, rate: 13202 j/s
> > > com$
> > >
> > >
> > > ----- Original Message -----
> > > > On Wed, 2011-03-30 at 20:37 -0500, Michael Wilde wrote:
> > > > > OK, what am I missing?
> > > >
> > > > Nothing. That shouldn't be happening.
> > > >
> > > > Here's what (approximately) you should get:
> > > > Swift svn swift-r3526 (swift modified locally) cog-r656 (cog modified locally)
> > > >
> > > > RunID: 20110330-1839-y71gls4a
> > > > Progress: time:0
> > > > [Misc] WARN pool-1-thread-4 - SwiftScript trace: for, 2
> > > > [Misc] WARN pool-1-thread-1 - SwiftScript trace: for, 1
> > > > Final status: time:54
> > > >
> > > > >
> > > > > int arr[];
> > > > >
> > > > > arr[0]=1;
> > > > > arr[1]=2;
> > > > >
> > > > > foreach a in arr {
> > > > >   trace("for", a);
> > > > > }
> > > > >
> > > > > login1$ swift zz3.swift
> > > > > Swift svn swift-r4157 cog-r3056
> > > > >
> > > > > RunID: 20110331-0134-bfzhkgaa
> > > > > Progress:
> > > > > SwiftScript trace: for, 2
> > > > > SwiftScript trace: for, 1
> > > > > SwiftScript trace: for, 1
> > > > > SwiftScript trace: for, 2
> > > > > Final status:
> > > > >
> > > > > When did the foreach loop become the twice-each loop?
> > > > >
> > > > > I need to try some other revision and hosts with this.
> > > > >
> > > > > - Mike
> > > > >
> > > > >
> > > > > ----- Original Message -----
> > > > > > Wow, I didn't know we can do that! I treat the docs too
> > > > > > canonically :P
> > > > > >
> > > > > > 2011/3/30 Michael Wilde <wilde at mcs.anl.gov>:
> > > > > >
> > > > > > > login1$ cat zz2.swift
> > > > > > >
> > > > > > > foreach a in [0:3] {
> > > > > > >  trace("for", a);
> > > > > > > }
> > > > > > >
> > > > > > > login1$ swift zz2.swift
> > > > > > > Swift svn swift-r4157 cog-r3056
> > > > > > >
> > > > > > > RunID: 20110331-0057-huo8jei0
> > > > > > > Progress:
> > > > > > > SwiftScript trace: for, 1
> > > > > > > SwiftScript trace: for, 3
> > > > > > > SwiftScript trace: for, 2
> > > > > > > SwiftScript trace: for, 0
> > > > > > > Final status:
> > > > > > > login1$
> > > > > > >
> > > > > > > I suspect we need to make this more clear in the user guide
> > > > > > > and tutorials :)
> > > > > >
> > > > > > I agree.
> > > > > >
> > > > > > >
> > > > > > > - Mike
> > > > > > >
> > > > > > >
> > > > > > > ----- Original Message -----
> > > > > > > >> Or just use the concurrent mapper to let swift handle the output
> > > > > > > >> naming itself. The resume files can't persist through multiple
> > > > > > > >> sessions though.
> > > > > > >>
> > > > > > >> 2011/3/30 Michael Wilde <wilde at mcs.anl.gov>:
> > > > > > > >> > The most common case for this error occurs when two iterations
> > > > > > > >> > within a foreach loop map an output file to the same physical
> > > > > > > >> > file name. When swift runs and tries to put the output object
> > > > > > > >> > into its site cache, it sees that a file of the same name is
> > > > > > > >> > already in the cache, and its semantics do not allow that.
> > > > > > > >> >
> > > > > > > >> > I have not yet stared at this code long enough to see if this
> > > > > > > >> > explains what is happening here.
> > > > > > > >> >
> > > > > > > >> > I also don't know why it might work under one version and fail
> > > > > > > >> > under 0.92. If the above situation is occurring, perhaps there
> > > > > > > >> > is some randomness involved: loop iteration ordering, filename
> > > > > > > >> > generation randomness or difference, etc.
> > > > > > > >> >
> > > > > > > >> > But I would debug with that in mind: make sure that all *output*
> > > > > > > >> > file names mapped by the script are unique. Ideally, one should
> > > > > > > >> > be able to find the culprit by grepping the swift log for all
> > > > > > > >> > the mapped file names and looking for duplicates.
> > > > > > >> >
> > > > > > >> > - Mike
> > > > > > >> >
> > > > > > >> >
> > > > > > >> > ----- Original Message -----
> > > > > > > >> >> Or maybe local variables are static? Maybe they mapped to
> > > > > > > >> >> different files but to the same cache object? But I have been
> > > > > > > >> >> doing local variables in my own workflows though.
> > > > > > >> >>
> > > > > > >> >> 2011/3/30 Jonathan Monette <jon.monette at gmail.com>:
> > > > > > > >> >> > Ok. I understand this error better. But shouldn't that be a
> > > > > > > >> >> > different error then? Like a and b are mapped to the same file?
> > > > > > > >> >> > I don't know if Swift can know this but looking at the
> > > > > > > >> >> > explanation and error it should unless this cache message has a
> > > > > > > >> >> > deeper meaning.
> > > > > > >> >> >
> > > > > > > >> >> > On Wed, Mar 30, 2011 at 6:21 PM, Allan Espinosa
> > > > > > > >> >> > <aespinosa at cs.uchicago.edu> wrote:
> > > > > > > >> >> >>
> > > > > > > >> >> >> I had this error before when two output mapper objects mapped to
> > > > > > > >> >> >> the same file.
> > > > > > >> >> >>
> > > > > > >> >> >> $ swift bug_same.swift
> > > > > > >> >> >> Swift svn swift-r4208 cog-r3073
> > > > > > >> >> >>
> > > > > > >> >> >> RunID: 20110330-1818-ygec7ppa
> > > > > > >> >> >> Progress: time:0
> > > > > > >> >> >> The cache already contains
> > > > > > >> >> >> localhost:bug_same-20110330-1818-ygec7ppa/shared/foo.
> > > > > > >> >> >>
> > > > > > >> >> >> The cache already contains
> > > > > > >> >> >> localhost:bug_same-20110330-1818-ygec7ppa/shared/foo.
> > > > > > >> >> >>
> > > > > > > >> >> >> Progress: time:1960 Stage in:1 Finished successfully:1
> > > > > > >> >> >> The cache already contains
> > > > > > >> >> >> localhost:bug_same-20110330-1818-ygec7ppa/shared/foo.
> > > > > > >> >> >>
> > > > > > >> >> >> [aespinosa at communicado testing]$
> > > > > > >> >> >> [aespinosa at communicado testing]$ cat bug_same.swift
> > > > > > >> >> >> type file;
> > > > > > >> >> >>
> > > > > > >> >> >> app (file out) echo(string input) {
> > > > > > >> >> >>  echo input stdout=@filename(out);
> > > > > > >> >> >> }
> > > > > > >> >> >>
> > > > > > >> >> >> file a <"foo">;
> > > > > > >> >> >> file b <"foo">;
> > > > > > >> >> >>
> > > > > > >> >> >> a = echo("hello world");
> > > > > > >> >> >> b = echo("foo bar");
> > > > > > >> >> >>
> > > > > > > >> >> >> But I think you should be using other Swift mappers that do
> > > > > > > >> >> >> auto-numbering of files by default.
> > > > > > >> >> >>
> > > > > > >> >> >> -Allan
> > > > > > >> >> >>
> > > > > > >> >> >> 2011/3/30 Zhao Zhang <zhaozhang at uchicago.edu>:
> > > > > > >> >> >> > Hi guys,
> > > > > > >> >> >> >
> > > > > > > >> >> >> > I am seeing something weird in swift-0.92. Any idea about this?
> > > > > > > >> >> >> > The swift script is very simple:
> > > > > > > >> >> >> >
> > > > > > > >> >> >> > zzhang at sandbox:~/workplace/Andrey> cat movies.swift
> > > > > > >> >> >> > type Pickle {}
> > > > > > >> >> >> > type History {}
> > > > > > >> >> >> > type Image {}
> > > > > > >> >> >> >
> > > > > > > >> >> >> > app (History historyout) movie_graph (int rerun, int epochs, Pickle picklefile)
> > > > > > > >> >> >> > {
> > > > > > > >> >> >> >   movie_graph rerun epochs;
> > > > > > > >> >> >> > }
> > > > > > >> >> >> >
> > > > > > >> >> >> > int arr[];
> > > > > > >> >> >> > iterate i
> > > > > > >> >> >> > {
> > > > > > >> >> >> >  arr[i] = i+1;
> > > > > > >> >> >> > }until(i == 1);
> > > > > > >> >> >> >
> > > > > > >> >> >> > int epochs;
> > > > > > >> >> >> > epochs = 3;
> > > > > > > >> >> >> > Pickle picklefile <single_file_mapper; file="for_movies.pickled">;
> > > > > > > >> >> >> > foreach a in arr{
> > > > > > > >> >> >> >  History historyout <single_file_mapper;
> > > > > > > >> >> >> >    file=@strcat("output/rerun", a, "/histories.pickled-", a)>;
> > > > > > > >> >> >> >  historyout = movie_graph(a, epochs, picklefile);
> > > > > > > >> >> >> > }
> > > > > > >> >> >> >
> > > > > > >> >> >> >
> > > > > > >> >> >> >
> > > > > > > >> >> >> > I ran the script with the latest 0.92 version, which is loaded as a
> > > > > > > >> >> >> > module on beagle. Then I saw this:
> > > > > > > >> >> >> > zzhang at sandbox:~/workplace/Andrey> swift -tc.file ./tc.data movies.swift
> > > > > > > >> >> >> > Variable epochs defined in scope 99878388 shadows variable of same name in scope 1813605401
> > > > > > > >> >> >> > Variable picklefile defined in scope 99878388 shadows variable of same name in scope 1813605401
> > > > > > >> >> >> > Swift svn swift-r4157 cog-r3056
> > > > > > >> >> >> >
> > > > > > >> >> >> > RunID: 20110330-1636-ev8vm8gb
> > > > > > >> >> >> > Progress:
> > > > > > >> >> >> > Progress: Selecting site:3 Active:1
> > > > > > >> >> >> > Progress: Selecting site:3 Checking status:1
> > > > > > > >> >> >> > Progress: Selecting site:2 Stage in:1 Finished successfully:1
> > > > > > > >> >> >> > Progress: Selecting site:2 Active:1 Finished successfully:1
> > > > > > > >> >> >> > Progress: Selecting site:2 Active:1 Finished successfully:1
> > > > > > > >> >> >> > Progress: Selecting site:1 Stage in:1 Finished successfully:2
> > > > > > > >> >> >> > Progress: Selecting site:1 Active:1 Finished successfully:2
> > > > > > > >> >> >> > Progress: Selecting site:1 Checking status:1 Finished successfully:2
> > > > > > >> >> >> > The cache already contains
> > > > > > >> >> >> >
> > > > > > >> >> >> > localhost:movies-20110330-1636-ev8vm8gb/shared/output/rerun1/histories.pickled-1.
> > > > > > >> >> >> >
> > > > > > >> >> >> > Execution failed:
> > > > > > >> >> >> >        The cache already contains
> > > > > > >> >> >> >
> > > > > > >> >> >> > localhost:movies-20110330-1636-ev8vm8gb/shared/output/rerun1/histories.pickled-1.
> > > > > > >> >> >> >
> > > > > > >> >> >> >
> > > > > > > >> >> >> > Then I switched to an older version, and it worked well.
> > > > > > > >> >> >> > zzhang at sandbox:~/workplace/Andrey> swift -tc.file ./tc.data movies.swift
> > > > > > > >> >> >> > Variable epochs defined in scope 212602028 shadows variable of same name in scope 1538939834
> > > > > > > >> >> >> > Variable picklefile defined in scope 212602028 shadows variable of same name in scope 1538939834
> > > > > > > >> >> >> > Swift svn swift-r3291 (swift modified locally) cog-r2750 (cog modified locally)
> > > > > > >> >> >> >
> > > > > > >> >> >> > RunID: 20110330-1639-gmbyz1qa
> > > > > > >> >> >> > Progress:
> > > > > > >> >> >> > Progress: Active:2
> > > > > > >> >> >> > Progress: Active:1 Checking status:1
> > > > > > >> >> >> > Final status: Finished successfully:2
> 
> --
> Michael Wilde
> Computation Institute, University of Chicago
> Mathematics and Computer Science Division
> Argonne National Laboratory
> 

-- 
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory



