[Swift-devel] duplicated job submission in swift-0.92?

Mihael Hategan hategan at mcs.anl.gov
Wed Mar 30 22:04:29 CDT 2011


I think at this point we should stop testing 0.92 until we figure out
the reason for the merge.

Trunk contained a pretty dramatic change to the karajan engine and I
would expect badness like that on a merge back to a stable branch. The
previous behaviour (double iterations) alone is a sign of badness, and
so this new thing doesn't surprise me.

Mihael

On Wed, 2011-03-30 at 21:54 -0500, Michael Wilde wrote:
> There seems to be something non-deterministic about this script:
> 
> com$ cat zz5.swift
> int arr[];
> int brr[];
> 
> arr[0]=1;
> arr[1]=2;
> 
> brr = [1:2];
> 
> trace("arr",arr);
> trace("brr",brr);
> 
> foreach a in arr {
>   trace("for", a);
> }
> 
> com$ 
> 
> (By the way, I'm seeing the same error on communicado)
> 
> The script above sometimes prints 2, 3, or 4 instances of the trace()
> inside the foreach, and sometimes it hangs on one of the two trace
> statements outside the loop. In most cases it prints all 6 traces (4 from
> the loop, 2 outside it), as in the original failing case.
> 
> com$ swift zz5.swift
> Swift svn swift-r4087 (swift modified locally) cog-r3051
> 
> RunID: 20110330-2148-qf5anxr6
> Progress:  time:2
> SwiftScript trace: arr, arr.$[]/2
> SwiftScript trace: brr, brr.$[]/2
> SwiftScript trace: for, 1
> SwiftScript trace: for, 2
> Final status:  time:12
> Time: 1.163, rate: 14087 j/s
> com$ swift zz5.swift
> Swift svn swift-r4087 (swift modified locally) cog-r3051
> 
> RunID: 20110330-2148-kouc9zq3
> Progress:  time:3
> SwiftScript trace: arr, arr.$[]/2
> SwiftScript trace: brr, brr.$[]/2
> SwiftScript trace: for, 2
> SwiftScript trace: for, 2
> SwiftScript trace: for, 1
> SwiftScript trace: for, 1
> Final status:  time:16
> Time: 1.214, rate: 13495 j/s
> com$ swift zz5.swift
> Swift svn swift-r4087 (swift modified locally) cog-r3051
> 
> RunID: 20110330-2148-lksn2a17
> Progress:  time:2
> SwiftScript trace: arr, arr.$[]/2
> SwiftScript trace: brr, brr.$[]/2
> SwiftScript trace: for, 1
> SwiftScript trace: for, 2
> SwiftScript trace: for, 2
> SwiftScript trace: for, 1
> Final status:  time:17
> Time: 1.227, rate: 13352 j/s
> com$ swift zz5.swift
> Swift svn swift-r4087 (swift modified locally) cog-r3051
> 
> RunID: 20110330-2148-tl2xtxx6
> Progress:  time:1
> SwiftScript trace: arr, arr.$[]/2
> SwiftScript trace: for, 1
> SwiftScript trace: for, 1
> SwiftScript trace: brr, brr.$[]/2
> SwiftScript trace: for, 2
> SwiftScript trace: for, 2
> Final status:  time:14
> Time: 1.224, rate: 13385 j/s
> com$ swift zz5.swift
> Swift svn swift-r4087 (swift modified locally) cog-r3051
> 
> RunID: 20110330-2148-mk5aypbg
> Progress:  time:6
> SwiftScript trace: arr, arr.$[]/2
> SwiftScript trace: brr, brr.$[]/2
> SwiftScript trace: for, 2
> SwiftScript trace: for, 1
> SwiftScript trace: for, 1
> SwiftScript trace: for, 2
> Final status:  time:17
> Time: 1.191, rate: 13756 j/s
> com$ swift zz5.swift
> Swift svn swift-r4087 (swift modified locally) cog-r3051
> 
> RunID: 20110330-2148-hgcbaxga
> Progress:  time:2
> SwiftScript trace: arr, arr.$[]/2
> SwiftScript trace: for, 1
> SwiftScript trace: for, 2
> SwiftScript trace: for, 2
> SwiftScript trace: for, 1
> com$ swift zz5.swift
> Swift svn swift-r4087 (swift modified locally) cog-r3051
> 
> RunID: 20110330-2149-oaa0kuy8
> Progress:SwiftScript trace: arr, arr.$[]/2
>   time:9
> SwiftScript trace: for, 2
> SwiftScript trace: brr, brr.$[]/2
> SwiftScript trace: for, 2
> SwiftScript trace: for, 1
> Final status:  time:17
> Time: 1.241, rate: 13202 j/s
> com$ 
> 
> 
> ----- Original Message -----
> > On Wed, 2011-03-30 at 20:37 -0500, Michael Wilde wrote:
> > > OK, what am I missing?
> > 
> > Nothing. That shouldn't be happening.
> > 
> > Here's what (approximately) you should get:
> > Swift svn swift-r3526 (swift modified locally) cog-r656 (cog modified
> > locally)
> > 
> > RunID: 20110330-1839-y71gls4a
> > Progress: time:0
> > [Misc] WARN pool-1-thread-4 - SwiftScript trace: for, 2
> > [Misc] WARN pool-1-thread-1 - SwiftScript trace: for, 1
> > Final status: time:54
> > 
> > >
> > > int arr[];
> > >
> > > arr[0]=1;
> > > arr[1]=2;
> > >
> > > foreach a in arr {
> > >   trace("for", a);
> > > }
> > >
> > > login1$ swift zz3.swift
> > > Swift svn swift-r4157 cog-r3056
> > >
> > > RunID: 20110331-0134-bfzhkgaa
> > > Progress:
> > > SwiftScript trace: for, 2
> > > SwiftScript trace: for, 1
> > > SwiftScript trace: for, 1
> > > SwiftScript trace: for, 2
> > > Final status:
> > >
> > > When did the foreach loop become the twice-each loop?
> > >
> > > > I need to try some other revisions and hosts with this.
> > >
> > > - Mike
> > >
> > >
> > > ----- Original Message -----
> > > > Wow, I didn't know we can do that! I treat the docs too
> > > > canonically :P
> > > >
> > > > 2011/3/30 Michael Wilde <wilde at mcs.anl.gov>:
> > > >
> > > > > login1$ cat zz2.swift
> > > > >
> > > > > foreach a in [0:3] {
> > > > >  trace("for", a);
> > > > > }
> > > > >
> > > > > login1$ swift zz2.swift
> > > > > Swift svn swift-r4157 cog-r3056
> > > > >
> > > > > RunID: 20110331-0057-huo8jei0
> > > > > Progress:
> > > > > SwiftScript trace: for, 1
> > > > > SwiftScript trace: for, 3
> > > > > SwiftScript trace: for, 2
> > > > > SwiftScript trace: for, 0
> > > > > Final status:
> > > > > login1$
> > > > >
> > > > > I suspect we need to make this more clear in the user guide and
> > > > > tutorials :)
> > > >
> > > > I agree.
> > > >
> > > > >
> > > > > - Mike
> > > > >
> > > > >
> > > > > ----- Original Message -----
> > > > > >> Or just use the concurrent mapper to let swift handle the output
> > > > > >> naming itself. The resume files can't persist through multiple
> > > > > >> sessions though.
> > > > >>
> > > > >> 2011/3/30 Michael Wilde <wilde at mcs.anl.gov>:
> > > > > >> > The most common case for this error occurs when two iterations
> > > > > >> > within a foreach loop map an output file to the same physical file
> > > > > >> > name. When swift runs and tries to put the output object into its
> > > > > >> > site cache, it sees that a file of the same name is already in the
> > > > > >> > cache, and its semantics do not allow that.
> > > > >> >
> > > > >> > I have not yet stared at this code long enough to see if this
> > > > >> > explains what is happening here.
> > > > >> >
> > > > > >> > I also don't know why it might work under one version and fail under
> > > > > >> > 0.92. If the above situation is occurring, perhaps there is some
> > > > > >> > randomness involved: loop iteration ordering; filename generation
> > > > > >> > randomness or difference, etc.
> > > > >> >
> > > > > >> > But I would debug with that in mind: make sure that all *output* file
> > > > > >> > names mapped by the script are unique. Ideally, one should be able
> > > > > >> > to find the culprit by grepping the swift log for all the mapped
> > > > > >> > file names and looking for duplicates.
> > > > >> >
> > > > >> > - Mike
> > > > >> >
> > > > >> >
> > > > >> > ----- Original Message -----
> > > > > >> >> Or maybe local variables are static? Maybe they mapped to different
> > > > > >> >> files but to the same cache object? But I have been doing local
> > > > > >> >> variables in my own workflows though.
> > > > >> >>
> > > > >> >> 2011/3/30 Jonathan Monette <jon.monette at gmail.com>:
> > > > > >> >> > Ok. I understand this error better. But shouldn't that be a different
> > > > > >> >> > error then? Like a and b are mapped to the same file? I don't know
> > > > > >> >> > if Swift can know this but looking at the explanation and error it
> > > > > >> >> > should unless this cache message has a deeper meaning.
> > > > >> >> >
> > > > >> >> > On Wed, Mar 30, 2011 at 6:21 PM, Allan Espinosa
> > > > >> >> > <aespinosa at cs.uchicago.edu>
> > > > >> >> > wrote:
> > > > >> >> >>
> > > > > >> >> >> I had this error before when two output mapper objects mapped to
> > > > > >> >> >> the same file.
> > > > >> >> >>
> > > > >> >> >> $ swift bug_same.swift
> > > > >> >> >> Swift svn swift-r4208 cog-r3073
> > > > >> >> >>
> > > > >> >> >> RunID: 20110330-1818-ygec7ppa
> > > > >> >> >> Progress: time:0
> > > > >> >> >> The cache already contains
> > > > >> >> >> localhost:bug_same-20110330-1818-ygec7ppa/shared/foo.
> > > > >> >> >>
> > > > >> >> >> The cache already contains
> > > > >> >> >> localhost:bug_same-20110330-1818-ygec7ppa/shared/foo.
> > > > >> >> >>
> > > > >> >> >> Progress: time:1960 Stage in:1 Finished successfully:1
> > > > >> >> >> The cache already contains
> > > > >> >> >> localhost:bug_same-20110330-1818-ygec7ppa/shared/foo.
> > > > >> >> >>
> > > > >> >> >> [aespinosa at communicado testing]$
> > > > >> >> >> [aespinosa at communicado testing]$ cat bug_same.swift
> > > > >> >> >> type file;
> > > > >> >> >>
> > > > >> >> >> app (file out) echo(string input) {
> > > > >> >> >>  echo input stdout=@filename(out);
> > > > >> >> >> }
> > > > >> >> >>
> > > > >> >> >> file a <"foo">;
> > > > >> >> >> file b <"foo">;
> > > > >> >> >>
> > > > >> >> >> a = echo("hello world");
> > > > >> >> >> b = echo("foo bar");
> > > > >> >> >>
> > > > > >> >> >> But I think you should be using other Swift mappers that do
> > > > > >> >> >> auto-numbering of files by default.
> > > > >> >> >>
> > > > >> >> >> -Allan
> > > > >> >> >>
> > > > >> >> >> 2011/3/30 Zhao Zhang <zhaozhang at uchicago.edu>:
> > > > >> >> >> > Hi guys,
> > > > >> >> >> >
> > > > > >> >> >> > I am seeing something weird in swift-0.92. Any idea about this?
> > > > >> >> >> > The swift script is very simple:
> > > > >> >> >> >
> > > > >> >> >> > zzhang at sandbox:~/workplace/Andrey> cat movies.swift
> > > > >> >> >> > type Pickle {}
> > > > >> >> >> > type History {}
> > > > >> >> >> > type Image {}
> > > > >> >> >> >
> > > > >> >> >> > app (History historyout) movie_graph (int rerun, int
> > > > >> >> >> > epochs,
> > > > >> >> >> > Pickle
> > > > >> >> >> > picklefile)
> > > > >> >> >> > {
> > > > >> >> >> >   movie_graph rerun epochs;
> > > > >> >> >> > }
> > > > >> >> >> >
> > > > >> >> >> > int arr[];
> > > > >> >> >> > iterate i
> > > > >> >> >> > {
> > > > >> >> >> >  arr[i] = i+1;
> > > > >> >> >> > }until(i == 1);
> > > > >> >> >> >
> > > > >> >> >> > int epochs;
> > > > >> >> >> > epochs = 3;
> > > > >> >> >> > Pickle picklefile <single_file_mapper;
> > > > >> >> >> > file="for_movies.pickled">;
> > > > >> >> >> > foreach a in arr{
> > > > >> >> >> >  History historyout <single_file_mapper;
> > > > >> >> >> >  file=@strcat("output/rerun", a,
> > > > >> >> >> > "/histories.pickled-", a)>;
> > > > >> >> >> >  historyout = movie_graph(a, epochs, picklefile);
> > > > >> >> >> > }
> > > > >> >> >> >
> > > > >> >> >> >
> > > > >> >> >> >
> > > > > >> >> >> > I ran the script with the latest 0.92 version, which is loaded as a
> > > > > >> >> >> > module on beagle. Then I saw this:
> > > > > >> >> >> > zzhang at sandbox:~/workplace/Andrey> swift -tc.file ./tc.data movies.swift
> > > > > >> >> >> > Variable epochs defined in scope 99878388 shadows variable of same name in scope 1813605401
> > > > > >> >> >> > Variable picklefile defined in scope 99878388 shadows variable of same name in scope 1813605401
> > > > >> >> >> > Swift svn swift-r4157 cog-r3056
> > > > >> >> >> >
> > > > >> >> >> > RunID: 20110330-1636-ev8vm8gb
> > > > >> >> >> > Progress:
> > > > >> >> >> > Progress: Selecting site:3 Active:1
> > > > >> >> >> > Progress: Selecting site:3 Checking status:1
> > > > > >> >> >> > Progress: Selecting site:2 Stage in:1 Finished successfully:1
> > > > > >> >> >> > Progress: Selecting site:2 Active:1 Finished successfully:1
> > > > > >> >> >> > Progress: Selecting site:2 Active:1 Finished successfully:1
> > > > > >> >> >> > Progress: Selecting site:1 Stage in:1 Finished successfully:2
> > > > > >> >> >> > Progress: Selecting site:1 Active:1 Finished successfully:2
> > > > > >> >> >> > Progress: Selecting site:1 Checking status:1 Finished successfully:2
> > > > >> >> >> > The cache already contains
> > > > >> >> >> >
> > > > >> >> >> > localhost:movies-20110330-1636-ev8vm8gb/shared/output/rerun1/histories.pickled-1.
> > > > >> >> >> >
> > > > >> >> >> > Execution failed:
> > > > >> >> >> >        The cache already contains
> > > > >> >> >> >
> > > > >> >> >> > localhost:movies-20110330-1636-ev8vm8gb/shared/output/rerun1/histories.pickled-1.
> > > > >> >> >> >
> > > > >> >> >> >
> > > > > >> >> >> > Then I switched to an older version, and it worked well.
> > > > > >> >> >> > zzhang at sandbox:~/workplace/Andrey> swift -tc.file ./tc.data movies.swift
> > > > > >> >> >> > Variable epochs defined in scope 212602028 shadows variable of same name in scope 1538939834
> > > > > >> >> >> > Variable picklefile defined in scope 212602028 shadows variable of same name in scope 1538939834
> > > > > >> >> >> > Swift svn swift-r3291 (swift modified locally) cog-r2750 (cog modified locally)
> > > > >> >> >> >
> > > > >> >> >> > RunID: 20110330-1639-gmbyz1qa
> > > > >> >> >> > Progress:
> > > > >> >> >> > Progress: Active:2
> > > > >> >> >> > Progress: Active:1 Checking status:1
> > > > >> >> >> > Final status: Finished successfully:2
> 
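
A minimal sketch, not part of the original thread, for checking captured
swift output for foreach iterations that ran more than once. It assumes only
the "SwiftScript trace: for, N" line format shown in the quoted runs; the
function name repeated_iterations is made up for illustration, and the
script reads saved output rather than invoking swift itself.

```python
import re
from collections import Counter

# Matches the loop trace lines as they appear in the quoted runs,
# e.g. "SwiftScript trace: for, 2". Quoting markers or logger prefixes
# such as "[Misc] WARN pool-1-thread-4 - " are skipped by the search.
TRACE_RE = re.compile(r"SwiftScript trace: for, (\d+)")


def repeated_iterations(output: str) -> dict:
    """Return {value: count} for every loop value traced more than once."""
    counts = Counter(int(m.group(1)) for m in TRACE_RE.finditer(output))
    return {value: n for value, n in counts.items() if n > 1}


if __name__ == "__main__":
    # Sample copied from one of the failing zz5.swift runs quoted above.
    sample = """\
SwiftScript trace: arr, arr.$[]/2
SwiftScript trace: brr, brr.$[]/2
SwiftScript trace: for, 2
SwiftScript trace: for, 2
SwiftScript trace: for, 1
SwiftScript trace: for, 1
"""
    print(repeated_iterations(sample))
```

For a run in which each element is visited once, the function returns an
empty dict; any non-empty result points at the double-iteration behaviour
discussed in this thread. Because the runs are non-deterministic, a single
clean result does not rule the problem out, so it is worth checking several
runs.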




