[Swift-devel] problems with external dependencies

Michael Wilde wilde at mcs.anl.gov
Sun Mar 22 20:16:00 CDT 2009


This works, to replace trace():

(external w) wait(external waits[]) {
   external val[];
   iterate i {
      val[i] = waits[i];
   } until(i==9);
}

Apparently I can assign an external value to an element of an array of externals.
The assignment presumably waits until the source external value is set.
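
For reference, here is a sketch of the whole script with this wait() swapped in for the trace() version. It is assembled from the script quoted below and has not been re-run in exactly this form. The final trace() of wf is dropped, on the assumption that passing wf to ls is enough to hold ls back. The hard-coded loop bound is kept as-is: the quoted run traced indices 0 through 9, while [0:10] appears to have produced 11 echo jobs, so waits[10] is probably not waited on here.

```swiftscript
type file;

// Each echo call yields an external token alongside its output file.
app (external w, file o) echo (int i) { echo i stdout=@o; }

(external waits[], file r[]) generate() {
  int j[] = [0:10];
  foreach i in j {
    (waits[i],r[i]) = echo(i*i);
  }
}

// Trace-free wait: each assignment presumably blocks until waits[i] is set.
(external w) wait(external waits[]) {
  external val[];
  iterate i {
    val[i] = waits[i];
  } until(i==9); // FIXME: bound is hard-coded, as in the original
}

// ls takes the directory as a string; the external arg only orders it.
app (file o) ls (string dir, external w) { ls "-l" dir stdout=@o; }

file datadir[]<simple_mapper;prefix="datadir/">;
external waits[];

(waits,datadir) = generate();

external wf = wait(waits);

file out <"ls.out">;
out = ls("/home/wilde/oops/swift/datadir/", wf);
```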

On 3/22/09 8:07 PM, Michael Wilde wrote:
> This approach looks promising: I generate an external token for each app 
> I call, filling an array of external "wait" vars. Then I use iterate 
> and trace to walk through the array, waiting to make sure that each 
> element has "fired".
> 
> If I can replace the trace() with something fast and silent, and this 
> scales to a few thousand wait vars, I think it might work.
> 
> What I have now is this script:
> 
> type file;
> 
> app (external w, file o) echo (int i) { echo i stdout=@o; }
> 
> (external waits[], file r[]) generate() {
>   int j[] = [0:10];
>   foreach i in j {
>     (waits[i],r[i]) = echo(i*i);
>   }
> }
> 
> (external w) wait(external waits[]) {
>   iterate i {
>     trace("in wait: i", i, "wait", waits[i]);
>   } until(i==9); // FIXME
> }
> 
> app (file o) ls (string dir, external w) { ls "-l" dir stdout=@o; }
> 
> file datadir[]<simple_mapper;prefix="datadir/">;
> external waits[];
> 
> (waits,datadir) = generate();
> 
> external wf = wait(waits);
> 
> trace( "generate and wait done", wf);
> 
> file out <"ls.out">;
> out = ls("/home/wilde/oops/swift/datadir/", wf);
> 
> -- 
> 
> which gives:
> 
> Swift svn swift-r2724 (swift modified locally) cog-r2333
> 
> RunID: 20090322-2005-j3c5zk04
> Progress:
> Progress:  Selecting site:8 Stage in:1 Finished successfully:2
> SwiftScript trace: in wait: i, 0, wait, null
> Progress:  Selecting site:3 Stage in:1 Active:1 Finished successfully:6
> SwiftScript trace: in wait: i, 1, wait, null
> SwiftScript trace: in wait: i, 2, wait, null
> SwiftScript trace: in wait: i, 3, wait, null
> SwiftScript trace: in wait: i, 4, wait, null
> SwiftScript trace: in wait: i, 5, wait, null
> SwiftScript trace: in wait: i, 6, wait, null
> SwiftScript trace: in wait: i, 7, wait, null
> SwiftScript trace: in wait: i, 8, wait, null
> SwiftScript trace: in wait: i, 9, wait, null
> getValue called in an external dataset
> getValue called in an external dataset
> SwiftScript trace: generate and wait done, null <<<<<<<<
> Progress:  Stage out:1 Finished successfully:11
> Final status:  Finished successfully:12
> 
> -- 
> and ls succeeds, showing all expected files in ls.out.
> 
> 
> On 3/22/09 7:38 PM, Michael Wilde wrote:
>> I got the example from my previous email working using this technique 
>> (passing the external var to trace). But a script that simulates more 
>> closely what I really need to do is still eluding me.
>>
>> In the real code, I need to wait until a set of nested procedures that 
>> involve nested foreach and iterate statements complete. So I'm trying to 
>> create a simple simulation of the needed synchronization with the 
>> following script:
>>
>> -- 
>> type file;
>>
>> app (file o) echo (int i) { echo i stdout=@o; }
>>
>> (file r[]) generate() {
>>   int j[] = [0:10];
>>   foreach i in j {
>>     r[i] = echo(i*i);
>>   }
>> }
>>
>> (external w) wait(file dir[]) {
>>   trace("in wait: dir",dir);
>> }
>>
>> app (file o) ls (string dir, external w) { ls "-l" dir stdout=@o; }
>>
>> file datadir[]<simple_mapper;prefix="datadir/">;
>> datadir = generate();
>>
>> external w1 = wait(datadir);
>>
>> trace( "generate done", w1);
>>
>> file out <"ls.out">;
>> out = ls("/home/wilde/oops/swift/datadir/", w1);
>> -- 
>>
>> In this script the proc "generate()" simulates the production of the 
>> data directory. I want the proc "ls" which simulates the processing of 
>> the data directory, to wait until the directory is produced. As the 
>> directory has too many files to pass to "ls" as an array, I pass a 
>> string with the dir's path to ls, and want external vars to cause it 
>> to wait till the directory is complete.
>>
>> But in the case above, returning the dataset (file array) "datadir" 
>> from generate() does not wait for the array to be "closed". Nor does 
>> passing it to wait(), nor does passing it by name to trace().  The 
>> script gives:
>>
>> -- 
>> Swift svn swift-r2724 (swift modified locally) cog-r2333
>>
>> RunID: 20090322-1922-o4ibjxac
>> Progress:
>>
>> SwiftScript trace: in wait: dir, org.griphyn.vdl.karajan.FuturePairIterator@1e671e67
>>
>> SwiftScript trace: generate done, null
>>
>> Progress:  Selecting site:8 Active:1 Stage out:1 Failed:1 Finished successfully:1
>> Progress:  Selecting site:4 Active:1 Stage out:1 Failed:1 Finished successfully:5
>> Progress:  Active:1 Stage out:1 Failed:1 Finished successfully:9
>> Final status:  Failed:1 Finished successfully:11
>>
>> The following errors have occurred:
>> 1. Application "ls" failed (Exit code 2)
>>         Arguments: "-l, /home/wilde/oops/swift/datadir/"
>>         Host: localhost
>>         Directory: ex5-20090322-1922-o4ibjxac/jobs/l/ls-ljbsob8j
>>         STDERR: /bin/ls: /home/wilde/oops/swift/datadir/: No such file or directory
>>         STDOUT:
>> -- 
>>
>>
>> It seems there are only two kinds of constructs that can give me this 
>> behavior, neither of which I can find a way to invoke:
>> - something that waits for the whole array to get its values
>> - something that waits for an entire array of externals to all be set
>>
>> This note in the users guide suggests a possible way to do what I need:
>>
>> "Statements which deal with the array as a whole will often wait for 
>> the array to be closed before executing (thus, a closed array is the 
>> equivalent of a non-array type being assigned). However, a foreach 
>> statement will apply its body to elements of an array as they become 
>> known. It will not wait until the array is closed."
>>
>> What statement can I use to "wait for the array to be closed before 
>> executing"?
>>
>>
>> On 3/22/09 4:47 PM, Ben Clifford wrote:
>>> As far as I can tell from a brief poke around, this is what is 
>>> happening for you:
>>>
>>> Compound procedures do not themselves wait for their input parameters 
>>> to all be ready to use. Instead, they start trying to run all 
>>> component pieces.
>>>
>>> If some data necessary for some component piece is not ready yet, 
>>> that component piece will wait, so the compound procedure doesn't 
>>> need to (and indeed shouldn't, because that reduces potential 
>>> parallelism in some cases).
>>>
>>> You say this:
>>>
>>> analyseDatabase(external i) {
>>>   trace("i am analyseDatabase");
>>> }
>>>
>>> The trace call does not have any need to wait for i to be ready. So 
>>> it doesn't wait for i to be ready.
>>>
>>> If you say this:
>>>
>>> analyseDatabase(external i) {
>>>   trace("i am analyseDatabase", i);
>>> }
>>>
>>> then the trace call must wait for i to be ready (and fortuitously in 
>>> the present implementation doesn't explode even though i cannot be 
>>> meaningfully traced).
>>>
>>> With that change, you'll see the behaviour you want.
>>>
>> _______________________________________________
>> Swift-devel mailing list
>> Swift-devel at ci.uchicago.edu
>> http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel