[Swift-devel] problems with external dependencies

Michael Wilde wilde at mcs.anl.gov
Sun Mar 22 20:07:26 CDT 2009


This approach looks promising: I generate an external token for each app
I call, filling an array of external "wait" vars. Then I use iterate
and trace to walk through the array, waiting to make sure that each
element has "fired".

If I can replace the trace() with something fast and silent, and this 
scales to a few thousand wait vars, I think it might work.
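
As a rough model of the pattern described above, here is a sketch in Python rather than SwiftScript, so it is only an analogy and says nothing about how Swift implements externals. Each call hands back a token, the tokens are collected in an array, and a silent loop blocks on each element in turn. The names echo and wait_all are hypothetical stand-ins, and the range is assumed to be inclusive (11 elements for [0:10]).

```python
from concurrent.futures import ThreadPoolExecutor

def echo(i):
    """Hypothetical stand-in for the echo app: returns (wait token, output)."""
    return (None, str(i))  # the token carries no data, much like an external

def wait_all(tokens):
    """Walk the token array and block on each element, without printing."""
    for fut in tokens:
        fut.result()  # blocks until this particular call has finished
    return len(tokens)

with ThreadPoolExecutor(max_workers=4) as pool:
    # Assumption: [0:10] is inclusive, hence 11 calls.
    futures = [pool.submit(echo, i * i) for i in range(11)]
    fired = wait_all(futures)  # acts as a barrier for the next step
    outputs = [f.result()[1] for f in futures]
```

In this model the downstream step would only start once wait_all returns, which is the role the single external returned by wait() plays in the script below.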

What I have now is this script:

type file;

app (external w, file o) echo (int i) { echo i stdout=@o; }

(external waits[], file r[]) generate() {
   int j[] = [0:10];
   foreach i in j {
     (waits[i],r[i]) = echo(i*i);
   }
}

(external w) wait(external waits[]) {
   iterate i {
     trace("in wait: i", i, "wait", waits[i]);
   } until(i==9); // FIXME
}

app (file o) ls (string dir, external w) { ls "-l" dir stdout=@o; }

file datadir[]<simple_mapper;prefix="datadir/">;
external waits[];

(waits,datadir) = generate();

external wf = wait(waits);

trace( "generate and wait done", wf);

file out <"ls.out">;
out = ls("/home/wilde/oops/swift/datadir/", wf);

--

which gives:

Swift svn swift-r2724 (swift modified locally) cog-r2333

RunID: 20090322-2005-j3c5zk04
Progress:
Progress:  Selecting site:8 Stage in:1 Finished successfully:2
SwiftScript trace: in wait: i, 0, wait, null
Progress:  Selecting site:3 Stage in:1 Active:1 Finished successfully:6
SwiftScript trace: in wait: i, 1, wait, null
SwiftScript trace: in wait: i, 2, wait, null
SwiftScript trace: in wait: i, 3, wait, null
SwiftScript trace: in wait: i, 4, wait, null
SwiftScript trace: in wait: i, 5, wait, null
SwiftScript trace: in wait: i, 6, wait, null
SwiftScript trace: in wait: i, 7, wait, null
SwiftScript trace: in wait: i, 8, wait, null
SwiftScript trace: in wait: i, 9, wait, null
getValue called in an external dataset
getValue called in an external dataset
SwiftScript trace: generate and wait done, null <<<<<<<<
Progress:  Stage out:1 Finished successfully:11
Final status:  Finished successfully:12

--
and ls succeeds, showing all expected files in ls.out.


On 3/22/09 7:38 PM, Michael Wilde wrote:
> I got the example from my previous email working using this technique 
> (passing the external var to trace). But a script that simulates more 
> closely what I really need to do is still eluding me.
> 
> In the real code, I need to wait till a set of nested procedures that 
> involve nested foreach and iterate statements complete. So I'm trying to
> create a simple simulation of the needed synchronization with the 
> following script:
> 
> -- 
> type file;
> 
> app (file o) echo (int i) { echo i stdout=@o; }
> 
> (file r[]) generate() {
>   int j[] = [0:10];
>   foreach i in j {
>     r[i] = echo(i*i);
>   }
> }
> 
> (external w) wait(file dir[]) {
>   trace("in wait: dir",dir);
> }
> 
> app (file o) ls (string dir, external w) { ls "-l" dir stdout=@o; }
> 
> file datadir[]<simple_mapper;prefix="datadir/">;
> datadir = generate();
> 
> external w1 = wait(datadir);
> 
> trace( "generate done", w1);
> 
> file out <"ls.out">;
> out = ls("/home/wilde/oops/swift/datadir/", w1);
> -- 
> 
> In this script the proc "generate()" simulates the production of the 
> data directory. I want the proc "ls" which simulates the processing of 
> the data directory, to wait until the directory is produced. As the 
> directory has too many files to pass to "ls" as an array, I pass a 
> string with the dir's path to ls, and want external vars to cause it to 
> wait till the directory is complete.
> 
> But in the case above, returning the dataset (file array) "datadir" from 
> generate() does not wait for the array to be "closed". Nor does passing 
> it to wait(), nor does passing it by name to trace().  The script gives:
> 
> -- 
> Swift svn swift-r2724 (swift modified locally) cog-r2333
> 
> RunID: 20090322-1922-o4ibjxac
> Progress:
> 
> SwiftScript trace: in wait: dir, org.griphyn.vdl.karajan.FuturePairIterator@1e671e67
> 
> SwiftScript trace: generate done, null
> 
> Progress:  Selecting site:8 Active:1 Stage out:1 Failed:1 Finished 
> successfully:1
> Progress:  Selecting site:4 Active:1 Stage out:1 Failed:1 Finished 
> successfully:5
> Progress:  Active:1 Stage out:1 Failed:1 Finished successfully:9
> Final status:  Failed:1 Finished successfully:11
> 
> The following errors have occurred:
> 1. Application "ls" failed (Exit code 2)
>         Arguments: "-l, /home/wilde/oops/swift/datadir/"
>         Host: localhost
>         Directory: ex5-20090322-1922-o4ibjxac/jobs/l/ls-ljbsob8j
>         STDERR: /bin/ls: /home/wilde/oops/swift/datadir/: No such file 
> or directory
>         STDOUT:
> -- 
> 
> 
> It seems there are only two kinds of constructs that can give me the 
> behavior I need, neither of which I can find a way to cause:
> - something that waits for the whole array to get its values
> - something that waits for an entire array of externals to all be set
> 
> This note in the users guide suggests a possible way to do what I need:
> 
> "Statements which deal with the array as a whole will often wait for the 
> array to be closed before executing (thus, a closed array is the 
> equivalent of a non-array type being assigned). However, a foreach 
> statement will apply its body to elements of an array as they become 
> known. It will not wait until the array is closed."
> 
> What statement can I use to "wait for the array to be closed before 
> executing"?
> 
> 
> On 3/22/09 4:47 PM, Ben Clifford wrote:
>> As far as I can tell from a brief poke around, this is what is 
>> happening for you:
>>
>> Compound procedures do not themselves wait for their input parameters 
>> to all be ready to use. Instead, they start trying to run all 
>> component pieces.
>>
>> If some data necessary for some component piece is not ready yet, that 
>> component piece will wait, so the compound procedure doesn't need to 
>> (and indeed shouldn't, because that reduces potential parallelism in 
>> some cases)
>>
>> You say this:
>>
>> analyseDatabase(external i) {
>>   trace("i am analyseDatabase");
>> }
>>
>> The trace call does not have any need to wait for i to be ready. So it 
>> doesn't wait for i to be ready.
>>
>> If you say this:
>>
>> analyseDatabase(external i) {
>>   trace("i am analyseDatabase", i);
>> }
>>
>> then the trace call must wait for i to be ready (and fortuitously in 
>> the present implementation doesn't explode even though i cannot be 
>> meaningfully traced).
>>
>> With that change, you'll see the behaviour you want.
>>
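
The point made in the quoted explanation, that a step waits only if it actually refers to the pending value, can be modeled in Python as a loose analogy (this is not SwiftScript and not a description of Swift's internals). The names produce, analyse_without_use and analyse_with_use are hypothetical; the Event simply keeps the producer pending so that the ordering is visible.

```python
from concurrent.futures import ThreadPoolExecutor
import threading

release = threading.Event()
events = []

def produce():
    """Hypothetical producer: stays pending until released."""
    release.wait()
    events.append("produced")
    return "token"

def analyse_without_use(i):
    # Never touches i, so nothing forces it to wait for the producer.
    events.append("ran without waiting")

def analyse_with_use(i):
    i.result()  # referring to the value forces the wait
    events.append("ran after waiting")

with ThreadPoolExecutor(max_workers=2) as pool:
    pending = pool.submit(produce)
    analyse_without_use(pending)  # returns while produce is still pending
    release.set()                 # now let the producer finish
    analyse_with_use(pending)     # blocks until produce has returned
```

The first variant corresponds to the trace call that ignores i, the second to the one that is passed i and therefore has to wait for it.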


