[Swift-devel] Iterative PageRank in Swift

ZHAO ZHANG zhaozhang at uchicago.edu
Sun Jun 2 20:12:24 CDT 2013


Thanks Mike. The problem has been solved. The cause was that the output files of the partition app were not properly mapped.
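
For anyone who hits the same "output files were not created by the application" error: below is a minimal sketch of what an ext mapper script such as bin/mapper.sh might look like. It is not the actual script from this thread. It assumes the ext mapper convention as I understand it, where each mapper parameter is passed as a "-name value" pair and the script prints one "[index] path" line per array element. The function name map_partitions and the "<source>.part-<k>" naming are hypothetical; the names printed must match whatever the partition app really writes.

```shell
#!/bin/sh
# Hypothetical ext mapper sketch (NOT the bin/mapper.sh used in this thread).
# Assumed invocation, matching the declaration in shuffle():
#   bin/mapper.sh -source temp/links-part-0001 -scale 16
# Assumed output: one "[index] path" line per array element on stdout.

map_partitions() {
    source_path=""
    scale=0
    # Read "-name value" pairs; names other than -source/-scale are ignored.
    while [ $# -gt 1 ]; do
        case "$1" in
            -source) source_path="$2" ;;
            -scale)  scale="$2" ;;
        esac
        shift 2
    done
    k=0
    while [ "$k" -lt "$scale" ]; do
        # ASSUMPTION: partition writes "<source>.part-<k>"; replace this with
        # the file names the partition app actually creates.
        echo "[$k] ${source_path}.part-${k}"
        k=$((k + 1))
    done
}

map_partitions "$@"
```

Running the script by hand with the same -source and -scale values and comparing its output with the files partition leaves in the job directory is a quick way to see whether the mapping and the app agree.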

best
zhao

On Jun 2, 2013, at 12:12 PM, ZHAO ZHANG wrote:

> Dear all,
> 
> I have been working with my cousin on an iterative PageRank implementation in Swift for his graduation project. We have run into a problem: we are trying to use "file fn[]" as intermediate data between two stages, but it does not work as expected.
> 
> The app and stage definitions are shown below:
> zhaozhang at bigben:/var/tmp/workplace$ cat PageRank-new.swift
> type file; 
> 
> app (file t) distribution (file f, file s) {   
>    distribution @filename(f) @filename(t) @filename(s);
> }
> 
> app (file t[]) partition (file f) {   
>    partition @filename(f) "16";
> }
> 
> app (file t) aggregation (file f[]){
>    aggregation @filename(t) @filenames(f);
> }
> 
> app (file t) cat (file f[]){
>    cat @filenames(f) stdout=@filename(t);
> }
> 
> app (file t) sort (file f){
>    sort "-nrk 2" @filename(f) stdout=@filename(t);
> }
> 
> (file fn[])map(file input[], file score){
>   foreach f,i in input {
>      file c<regexp_mapper;
>         source=@f,
>         match="input/(.*)",
>         transform="temp/\\1">;
>      c = distribution(f, score);  
>      fn[i] = c;
>   }
> }
> 
> (file matrix[][])shuffle(file fn[]){
>   foreach c, j in fn{
> 	file output[] <ext; exec="bin/mapper.sh", source=@filename(c), scale=16>;
> 	output = partition(c);
> 	foreach f, k in output{
> 		matrix[k][j] = output[k];
> 	}
>   }
> }
> 
> (file final)reduce(file matrix[][]){
>   file result[];	  
>   foreach fl, k in matrix{
>      file output <single_file_mapper; file=@strcat("result/result-", @toString(k))>;
>      output = aggregation(fl);
>      result[k] = output;
>   }   
> 
>   final = cat(result);
> }
> 
> 
> If I write the main function as below, it does not work; the intermediate files do not seem to be mapped to the expected file names.
> 
> //below are main function
> file input[] <filesys_mapper; location="input", prefix="links-">;
> file matrix[][];
> file fn[];
> 
> int loop=0;
> file score <single_file_mapper; file=@strcat("score.txt.", @toString(loop))>;
> file final <single_file_mapper;file=@strcat("score.txt.", @toString(loop+1))>;
> file sorted <single_file_mapper;file=@strcat("score.txt.", @toString(loop+1), ".sorted")>;
> 
> fn = map(input, score);
> matrix = shuffle(fn);
> final = reduce(matrix);
> sorted = sort(final);
> 
> The execution failed with the following message:
> Swift 0.94 swift-r6492 cog-r3658
> 
> RunID: 20130602-1348-yresjj56
> Progress:  time: Sun, 02 Jun 2013 13:48:49 -0500
> Progress:  time: Sun, 02 Jun 2013 13:48:51 -0500  Selecting site:3  Checking status:1
> Progress:  time: Sun, 02 Jun 2013 13:48:52 -0500  Selecting site:3  Checking status:1  Finished successfully:2
> Progress:  time: Sun, 02 Jun 2013 13:48:53 -0500  Selecting site:3  Checking status:1  Finished successfully:4
> Execution failed:
> 	Exception in partition:
>    Arguments: [temp/links-part-0001, 16]
>    Host: localhost
>    Directory: PageRank-new-20130602-1348-yresjj56/jobs/i/partition-ishh5dal
>    stderr.txt: 
>    stdout.txt: 
> Caused by:
> 	The following output files were not created by the application: _concurrent/fn-139240b8-81cc-4b22-8088-aa5aedd98afe--array//elt-3-0, _concurrent/fn-139240b8-81cc-4b22-8088-aa5aedd98afe--array//elt-3-1, _concurrent/fn-139240b8-81cc-4b22-8088-aa5aedd98afe--array//elt-3-2, _concurrent/fn-139240b8-81cc-4b22-8088-aa5aedd98afe--array//elt-3-3, _concurrent/fn-139240b8-81cc-4b22-8088-aa5aedd98afe--array//elt-3-4, _concurrent/fn-139240b8-81cc-4b22-8088-aa5aedd98afe--array//elt-3-5, _concurrent/fn-139240b8-81cc-4b22-8088-aa5aedd98afe--array//elt-3-6, _concurrent/fn-139240b8-81cc-4b22-8088-aa5aedd98afe--array//elt-3-7, _concurrent/fn-139240b8-81cc-4b22-8088-aa5aedd98afe--array//elt-3-8, _concurrent/fn-139240b8-81cc-4b22-8088-aa5aedd98afe--array//elt-3-9, _concurrent/fn-139240b8-81cc-4b22-8088-aa5aedd98afe--array//elt-3-10, _concurrent/fn-139240b8-81cc-4b22-8088-aa5aedd98afe--array//elt-3-11, _concurrent/fn-139240b8-81cc-4b22-8088-aa5aedd98afe--array//elt-3-12,
> _concurrent/fn-139240b8-81cc-4b22-8088-aa5aedd98afe--array//elt-3-13, _concurrent/fn-139240b8-81cc-4b22-8088-aa5aedd98afe--array//elt-3-14, _concurrent/fn-139240b8-81cc-4b22-8088-aa5aedd98afe--array//elt-3-15
> 	partition, PageRank-new.swift, line 38
> 	shuffle, PageRank-new.swift, line 88
> 
> 
> 
> However, if I put the stages inside an iterate control structure, it works.
> 
> //below are main function
> file input[] <filesys_mapper; location="input", prefix="links-">;
> file matrix[][];
> file fn[];
> 
> /*iterate loop{
>   iterate i{
>      if (i==0){
>         file score <single_file_mapper; file=@strcat("score.txt.", @toString(loop))>;
>         fn = map(input, score);
>      }
>      if(i==1){
>         matrix = shuffle(fn);
>      }
>      if(i==2){
>         file final <single_file_mapper;file=@strcat("score.txt.", @toString(loop+1))>;
>         final = reduce(matrix);
>         file sorted <single_file_mapper;file=@strcat("score.txt.", @toString(loop+1), ".sorted")>;
>         sorted = sort(final);
>      }
>   }until(i==3);
> }until(loop==1);*/
> 
> 
> I also checked the SwiftMontage implementation, which is written the same way, so I assume the first version would have worked at some point. Is this a known problem?
> 
> Best
> Zhao
> 