[Swift-devel] Iterative PageRank in Swift
ZHAO ZHANG
zhaozhang at uchicago.edu
Sun Jun 2 14:12:30 CDT 2013
Dear all,
I have been working with my cousin on an iterative PageRank implementation in Swift for his graduation project. We have run into a problem: we are trying to use "file fn[]" as intermediate data between two stages, but it does not work as expected.
The app and stage definitions are shown below:
zhaozhang at bigben:/var/tmp/workplace$ cat PageRank-new.swift
type file;
app (file t) distribution (file f, file s) {
    distribution @filename(f) @filename(t) @filename(s);
}
app (file t[]) partition (file f) {
    partition @filename(f) "16";
}
app (file t) aggregation (file f[]) {
    aggregation @filename(t) @filenames(f);
}
app (file t) cat (file f[]) {
    cat @filenames(f) stdout=@filename(t);
}
app (file t) sort (file f) {
    sort "-nrk 2" @filename(f) stdout=@filename(t);
}
(file fn[]) map (file input[], file score) {
    foreach f, i in input {
        file c <regexp_mapper;
            source=@f,
            match="input/(.*)",
            transform="temp/\\1">;
        c = distribution(f, score);
        fn[i] = c;
    }
}
(file matrix[][]) shuffle (file fn[]) {
    foreach c, j in fn {
        file output[] <ext; exec="bin/mapper.sh", source=@filename(c), scale=16>;
        output = partition(c);
        foreach f, k in output {
            matrix[k][j] = output[k];
        }
    }
}
(file final) reduce (file matrix[][]) {
    file result[];
    foreach fl, k in matrix {
        file output <single_file_mapper; file=@strcat("result/result-", @toString(k))>;
        output = aggregation(fl);
        result[k] = output;
    }
    final = cat(result);
}
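To make the intended data flow explicit, here is a rough Python sketch of what the three stages are meant to do. It is only an in-memory illustration under assumptions: the function names, the even split of a score over outgoing links, the modulo partitioning, and the plain summation are all hypothetical stand-ins, since the real distribution, partition and aggregation apps are not shown in this message.

```python
# Hypothetical in-memory stand-ins for the three Swift stages.
# The real distribution/partition/aggregation apps are not shown in this post,
# so the rules below (even split, modulo partitioning, summation) are assumptions.

NUM_PARTS = 16  # matches the "16" passed to the partition app


def map_stage(inputs, score):
    """One intermediate result per input chunk, like fn[i] = distribution(f, score).

    Each chunk is a list of (source_page, [target_pages]) pairs.
    """
    fn = []
    for chunk in inputs:
        contribs = []
        for src, targets in chunk:
            for t in targets:
                # Assumed rule: a page's score is split evenly over its links.
                contribs.append((t, score[src] / len(targets)))
        fn.append(contribs)
    return fn


def shuffle_stage(fn):
    """matrix[k][j] holds partition k of intermediate result j,
    mirroring matrix[k][j] = output[k] in shuffle()."""
    matrix = [[[] for _ in fn] for _ in range(NUM_PARTS)]
    for j, contribs in enumerate(fn):
        for page, value in contribs:
            # Assumed partitioning rule: integer page id modulo NUM_PARTS.
            matrix[page % NUM_PARTS][j].append((page, value))
    return matrix


def reduce_stage(matrix):
    """Aggregate each row k across all j, then merge the rows,
    like aggregation(fl) per row followed by cat(result)."""
    final = {}
    for row in matrix:
        for part in row:
            for page, value in part:
                final[page] = final.get(page, 0.0) + value
    return final
```

The point of the sketch is the transpose in the shuffle stage: every one of the 16 partition outputs of element j has to end up in column j of the matrix, so that the reduce stage can read one complete row per partition.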
If I write the main function as below, it does not work: it seems the intermediate files are not mapped to the expected file names.
//below are main function
file input[] <filesys_mapper; location="input", prefix="links-">;
file matrix[][];
file fn[];
int loop=0;
file score <single_file_mapper; file=@strcat("score.txt.", @toString(loop))>;
file final <single_file_mapper;file=@strcat("score.txt.", @toString(loop+1))>;
file sorted <single_file_mapper;file=@strcat("score.txt.", @toString(loop+1), ".sorted")>;
fn = map(input, score);
matrix = shuffle(fn);
final = reduce(matrix);
sorted = sort(final);
The execution failed with the following message:
Swift 0.94 swift-r6492 cog-r3658
RunID: 20130602-1348-yresjj56
Progress: time: Sun, 02 Jun 2013 13:48:49 -0500
Progress: time: Sun, 02 Jun 2013 13:48:51 -0500 Selecting site:3 Checking status:1
Progress: time: Sun, 02 Jun 2013 13:48:52 -0500 Selecting site:3 Checking status:1 Finished successfully:2
Progress: time: Sun, 02 Jun 2013 13:48:53 -0500 Selecting site:3 Checking status:1 Finished successfully:4
Execution failed:
Exception in partition:
Arguments: [temp/links-part-0001, 16]
Host: localhost
Directory: PageRank-new-20130602-1348-yresjj56/jobs/i/partition-ishh5dal
stderr.txt:
stdout.txt:
Caused by:
The following output files were not created by the application: _concurrent/fn-139240b8-81cc-4b22-8088-aa5aedd98afe--array//elt-3-0, _concurrent/fn-139240b8-81cc-4b22-8088-aa5aedd98afe--array//elt-3-1, _concurrent/fn-139240b8-81cc-4b22-8088-aa5aedd98afe--array//elt-3-2, _concurrent/fn-139240b8-81cc-4b22-8088-aa5aedd98afe--array//elt-3-3, _concurrent/fn-139240b8-81cc-4b22-8088-aa5aedd98afe--array//elt-3-4, _concurrent/fn-139240b8-81cc-4b22-8088-aa5aedd98afe--array//elt-3-5, _concurrent/fn-139240b8-81cc-4b22-8088-aa5aedd98afe--array//elt-3-6, _concurrent/fn-139240b8-81cc-4b22-8088-aa5aedd98afe--array//elt-3-7, _concurrent/fn-139240b8-81cc-4b22-8088-aa5aedd98afe--array//elt-3-8, _concurrent/fn-139240b8-81cc-4b22-8088-aa5aedd98afe--array//elt-3-9, _concurrent/fn-139240b8-81cc-4b22-8088-aa5aedd98afe--array//elt-3-10, _concurrent/fn-139240b8-81cc-4b22-8088-aa5aedd98afe--array//elt-3-11, _concurrent/fn-139240b8-81cc-4b22-8088-aa5aedd98afe--array//elt-3-12, _concurrent/fn-139240b8-81cc-4b22-8088-aa5aedd98afe--array//elt-3-13, _concurrent/fn-139240b8-81cc-4b22-8088-aa5aedd98afe--array//elt-3-14, _concurrent/fn-139240b8-81cc-4b22-8088-aa5aedd98afe--array//elt-3-15
partition, PageRank-new.swift, line 38
shuffle, PageRank-new.swift, line 88
However, if I put the stages inside an iterate control structure, it works:
//below are main function
file input[] <filesys_mapper; location="input", prefix="links-">;
file matrix[][];
file fn[];
/*iterate loop{
    iterate i{
        if (i==0){
            file score <single_file_mapper; file=@strcat("score.txt.", @toString(loop))>;
            fn = map(input, score);
        }
        if(i==1){
            matrix = shuffle(fn);
        }
        if(i==2){
            file final <single_file_mapper;file=@strcat("score.txt.", @toString(loop+1))>;
            final = reduce(matrix);
            file sorted <single_file_mapper;file=@strcat("score.txt.", @toString(loop+1), ".sorted")>;
            sorted = sort(final);
        }
    }until(i==3);
}until(loop==1);*/
I also checked the SwiftMontage implementation, which is written the same way as my first version, so I assume the first version would have worked at some point in the past. Is this a known problem?
Best
Zhao