[Swift-devel] Iterative PageRank in Swift

Michael Wilde wilde at mcs.anl.gov
Sun Jun 2 15:29:04 CDT 2013


I should clarify that by "Is the partition() app coded to do this, including making the parent directories ./_concurrent/fn-139240b8-81cc-4b22-8088-aa5aedd98afe--array/ ?" I meant:

Is the partition() app coded to *return these exact 16 files*, including making the parent directories ./_concurrent/fn-139240b8-81cc-4b22-8088-aa5aedd98afe--array/ ?

I.e., partition() needs to look at its first command-line arg $1 and create $2 (in this case, 16) files named just like $1, including directories, relative to the current working dir $PWD.

Is it coded to do that?
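
For illustration only, here is a minimal sketch of that behavior. It assumes the outputs are named by appending -0 through -(N-1) to $1, matching the elt-3-0 ... elt-3-15 names in the error. The function name make_partition_outputs and the empty placeholder files are hypothetical; this is not your actual partition code, which would write real data into each file.

```shell
#!/bin/sh
# Hypothetical sketch, NOT the real partition app.
# Usage: make_partition_outputs BASE COUNT
# Creates COUNT empty files named BASE-0 .. BASE-(COUNT-1), relative to $PWD,
# creating any missing parent directories first.
make_partition_outputs() {
    base=$1
    count=$2

    # Create the parent directory chain of BASE (e.g. ./_concurrent/.../).
    mkdir -p "$(dirname "$base")" || return 1

    i=0
    while [ "$i" -lt "$count" ]; do
        # Placeholder: a real partitioner would write partition i's data here.
        : > "$base-$i" || return 1
        i=$((i + 1))
    done
}
```

An app script could call this as make_partition_outputs "$1" "$2" before it writes any data, so that every file Swift expects exists when the job finishes.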

- Mike

----- Original Message -----
> From: "Michael Wilde" <wilde at mcs.anl.gov>
> To: "ZHAO ZHANG" <zhaozhang at uchicago.edu>
> Cc: "swift-devel" <swift-devel at ci.uchicago.edu>
> Sent: Sunday, June 2, 2013 3:21:24 PM
> Subject: Re: [Swift-devel] Iterative PageRank in Swift
> 
> Zhao,
> 
> The immediate failure in the run below seems to be due to partition()
> not creating the following 16 files below its *work* directory:
> 
> _concurrent/fn-139240b8-81cc-4b22-8088-aa5aedd98afe--array//elt-3-0
> _concurrent/fn-139240b8-81cc-4b22-8088-aa5aedd98afe--array//elt-3-1
> ...
> _concurrent/fn-139240b8-81cc-4b22-8088-aa5aedd98afe--array//elt-3-15
> 
> I'm assuming mapper.sh gets the name of an element of fn[], which has
> been mapped by the concurrent mapper, and returns an array mapping
> of the original name suffixed by -0 through -15?
> 
> Is that the expected behavior?
> 
> Is the partition() app coded to do this, including making the parent
> directories
> ./_concurrent/fn-139240b8-81cc-4b22-8088-aa5aedd98afe--array/ ?
> 
> I can't yet explain why/how this worked when you serialized the main
> (open code) function, but there's enough going on in this code that
> I'd carefully check the behavior of each stage.
> 
> - Mike
> 
> ----- Original Message -----
> > From: "Michael Wilde" <wilde at mcs.anl.gov>
> > To: "ZHAO ZHANG" <zhaozhang at uchicago.edu>
> > Cc: "swift-devel" <swift-devel at ci.uchicago.edu>
> > Sent: Sunday, June 2, 2013 2:48:58 PM
> > Subject: Re: [Swift-devel] Iterative PageRank in Swift
> > 
> > Zhao, I'm studying this.  Can you post a copy of mapper.sh?
> > 
> > Can you put a copy on a local machine here (both failing and
> > working versions) that I can experiment with?
> > 
> > Thanks,
> > 
> > - Mike
> > 
> > 
> > ----- Original Message -----
> > > From: "ZHAO ZHANG" <zhaozhang at uchicago.edu>
> > > To: "Swift Devel" <swift-devel at ci.uchicago.edu>
> > > Sent: Sunday, June 2, 2013 2:12:30 PM
> > > Subject: [Swift-devel] Iterative PageRank in Swift
> > > 
> > > Dear all,
> > > 
> > > I have been working with my cousin on an iterative PageRank
> > > implementation with Swift for his graduation project. We have
> > > run into a problem: we try to use "file fn[]" as intermediate
> > > data between two stages, but it does not work well.
> > > 
> > > The app and stage definitions look like this:
> > > zhaozhang at bigben:/var/tmp/workplace$ cat PageRank-new.swift
> > > type file;
> > > 
> > > app (file t) distribution (file f, file s) {
> > >     distribution @filename(f) @filename(t) @filename(s);
> > > }
> > > 
> > > app (file t[]) partition (file f) {
> > >     partition @filename(f) "16";
> > > }
> > > 
> > > app (file t) aggregation (file f[]){
> > >     aggregation @filename(t) @filenames(f);
> > > }
> > > 
> > > app (file t) cat (file f[]){
> > >     cat @filenames(f) stdout=@filename(t);
> > > }
> > > 
> > > app (file t) sort (file f){
> > >     sort "-nrk 2" @filename(f) stdout=@filename(t);
> > > }
> > > 
> > > (file fn[])map(file input[], file score){
> > >    foreach f,i in input {
> > >       file c<regexp_mapper;
> > >          source=@f,
> > >          match="input/(.*)",
> > >          transform="temp/\\1">;
> > >       c = distribution(f, score);
> > >       fn[i] = c;
> > >    }
> > > }
> > > 
> > > (file matrix[][])shuffle(file fn[]){
> > >    foreach c, j in fn{
> > > 	file output[] <ext; exec="bin/mapper.sh", source=@filename(c),
> > > 	scale=16>;
> > > 	output = partition(c);
> > > 	foreach f, k in output{
> > > 		matrix[k][j] = output[k];
> > > 	}
> > >    }
> > > }
> > > 
> > > (file final)reduce(file matrix[][]){
> > >    file result[];
> > >    foreach fl, k in matrix{
> > >       file output <single_file_mapper;
> > >       file=@strcat("result/result-",
> > >       @toString(k))>;
> > >       output = aggregation(fl);
> > >       result[k] = output;
> > >    }
> > > 
> > >    final = cat(result);
> > > }
> > > 
> > > 
> > > If I write the main function as below, it does not work: it seems
> > > the intermediate files are not mapped to the expected file names.
> > > 
> > > //below are main function
> > > file input[] <filesys_mapper; location="input", prefix="links-">;
> > > file matrix[][];
> > > file fn[];
> > > 
> > > int loop=0;
> > > file score <single_file_mapper; file=@strcat("score.txt.",
> > > @toString(loop))>;
> > > file final <single_file_mapper;file=@strcat("score.txt.",
> > > @toString(loop+1))>;
> > > file sorted <single_file_mapper;file=@strcat("score.txt.",
> > > @toString(loop+1), ".sorted")>;
> > > 
> > > fn = map(input, score);
> > > matrix = shuffle(fn);
> > > final = reduce(matrix);
> > > sorted = sort(final);
> > > 
> > > The execution failed with the following message:
> > > Swift 0.94 swift-r6492 cog-r3658
> > > 
> > > RunID: 20130602-1348-yresjj56
> > > Progress:  time: Sun, 02 Jun 2013 13:48:49 -0500
> > > Progress:  time: Sun, 02 Jun 2013 13:48:51 -0500  Selecting site:3  Checking status:1
> > > Progress:  time: Sun, 02 Jun 2013 13:48:52 -0500  Selecting site:3  Checking status:1  Finished successfully:2
> > > Progress:  time: Sun, 02 Jun 2013 13:48:53 -0500  Selecting site:3  Checking status:1  Finished successfully:4
> > > Execution failed:
> > > 	Exception in partition:
> > >     Arguments: [temp/links-part-0001, 16]
> > >     Host: localhost
> > >     Directory: PageRank-new-20130602-1348-yresjj56/jobs/i/partition-ishh5dal
> > >     stderr.txt:
> > >     stdout.txt:
> > > Caused by:
> > > 	The following output files were not created by the application:
> > > 	_concurrent/fn-139240b8-81cc-4b22-8088-aa5aedd98afe--array//elt-3-0,
> > > 	_concurrent/fn-139240b8-81cc-4b22-8088-aa5aedd98afe--array//elt-3-1,
> > > 	_concurrent/fn-139240b8-81cc-4b22-8088-aa5aedd98afe--array//elt-3-2,
> > > 	_concurrent/fn-139240b8-81cc-4b22-8088-aa5aedd98afe--array//elt-3-3,
> > > 	_concurrent/fn-139240b8-81cc-4b22-8088-aa5aedd98afe--array//elt-3-4,
> > > 	_concurrent/fn-139240b8-81cc-4b22-8088-aa5aedd98afe--array//elt-3-5,
> > > 	_concurrent/fn-139240b8-81cc-4b22-8088-aa5aedd98afe--array//elt-3-6,
> > > 	_concurrent/fn-139240b8-81cc-4b22-8088-aa5aedd98afe--array//elt-3-7,
> > > 	_concurrent/fn-139240b8-81cc-4b22-8088-aa5aedd98afe--array//elt-3-8,
> > > 	_concurrent/fn-139240b8-81cc-4b22-8088-aa5aedd98afe--array//elt-3-9,
> > > 	_concurrent/fn-139240b8-81cc-4b22-8088-aa5aedd98afe--array//elt-3-10,
> > > 	_concurrent/fn-139240b8-81cc-4b22-8088-aa5aedd98afe--array//elt-3-11,
> > > 	_concurrent/fn-139240b8-81cc-4b22-8088-aa5aedd98afe--array//elt-3-12,
> > > 	_concurrent/fn-139240b8-81cc-4b22-8088-aa5aedd98afe--array//elt-3-13,
> > > 	_concurrent/fn-139240b8-81cc-4b22-8088-aa5aedd98afe--array//elt-3-14,
> > > 	_concurrent/fn-139240b8-81cc-4b22-8088-aa5aedd98afe--array//elt-3-15
> > > 	partition, PageRank-new.swift, line 38
> > > 	shuffle, PageRank-new.swift, line 88
> > > 
> > > 
> > > 
> > > However, if I put the stages in an iterate control structure, it works:
> > > 
> > > //below are main function
> > > file input[] <filesys_mapper; location="input", prefix="links-">;
> > > file matrix[][];
> > > file fn[];
> > > 
> > > /*iterate loop{
> > >    iterate i{
> > >       if (i==0){
> > >          file score <single_file_mapper;
> > >          file=@strcat("score.txt.",
> > >          @toString(loop))>;
> > >          fn = map(input, score);
> > >       }
> > >       if(i==1){
> > >          matrix = shuffle(fn);
> > >       }
> > >       if(i==2){
> > >          file final
> > >          <single_file_mapper;file=@strcat("score.txt.",
> > >          @toString(loop+1))>;
> > >          final = reduce(matrix);
> > >          file sorted
> > >          <single_file_mapper;file=@strcat("score.txt.",
> > >          @toString(loop+1), ".sorted")>;
> > >          sorted = sort(final);
> > >       }
> > >    }until(i==3);
> > > }until(loop==1);*/
> > > 
> > > 
> > > I also checked the SwiftMontage implementation, and it is also
> > > written this way, so I assume the first draft would have worked
> > > some time ago. Is this an already known problem?
> > > 
> > > Best
> > > Zhao
> > > 
> > > _______________________________________________
> > > Swift-devel mailing list
> > > Swift-devel at ci.uchicago.edu
> > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel
> > > 


