[Swift-user] swiftscript tricks on bypassing stage-ins (was Re: Helping Yi with Swift)

Allan Espinosa aespinosa at cs.uchicago.edu
Wed Aug 5 17:36:11 CDT 2009


Instead of:

type file;


file inputA <"061-cattwo.1.in">;
file inputB <"061-cattwo.2.in">;

file output <"061-cattwo.out">;

(file t) cat(file m, file n) {
    app {
        cat @filename(m) @filename(n) stdout=@filename(t);
    }
}

output = cat(inputA, inputB);

where the files inputA and inputB get staged in, you can change the app
function to accept string parameters instead, and no data transfer will
occur at all (except for "file t"):

type file;


file inputA <"/home/USER/workflows/nostaging/061-cattwo.1.in">;
file inputB <"/home/USER/workflows/nostaging/061-cattwo.2.in">;

file output <"061-cattwo.out">;

(file t) cat(string m, string n) {
    app {
        cat m n stdout=@filename(t);
    }
}

output = cat(@strcat("/", @filename(inputA)), @strcat("/", @filename(inputB)));


This quick hack requires absolute path names, though.
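
A variant of the same trick, as a rough and untested sketch: if the absolute
paths are known up front, they could be passed as plain string variables, so
no mapped input files (and no @strcat/@filename juggling) are needed. The
names pathA and pathB are made up for illustration, and this assumes the
paths are readable at the same location on the site where cat runs, since
nothing gets staged there.

```swiftscript
type file;

file output <"061-cattwo.out">;

// the app function takes plain strings, so there is nothing to stage in
(file t) cat(string m, string n) {
    app {
        cat m n stdout=@filename(t);
    }
}

// hypothetical variable names; the paths are the ones used above
string pathA = "/home/USER/workflows/nostaging/061-cattwo.1.in";
string pathB = "/home/USER/workflows/nostaging/061-cattwo.2.in";

output = cat(pathA, pathB);
```

Only "file t" is still a mapped file here, so it is the only thing Swift
would transfer.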

2009/8/5 Michael Wilde <wilde at mcs.anl.gov>:
> Yi,
>
> You should always send your swift questions to swift-user.
>
> Allan would be your prime helper for your initial questions; I am sure
> Mihael will try to help when he can, but he's got other prime
> responsibilities at the moment.
>
> Allan is away at workshops this week and next but will be reading mail.
>
> I don't expect to be in mail much on vacation, but if I see a question I'll
> do my best.
>
> You're somewhat on your own for the next week, OK?
>
> - Mike
>
>
>
>
> On 8/5/09 3:34 PM, Yi Zhu wrote:
>>
>> Hi, Michael
>>
>> About section 2 in my last message, could you suggest someone who can
>> provide me with support for Swift during your vacation? I may need to make
>> some modifications to the Swift source code, so I may need an overview of
>> the structure of the Swift system as well as technical support.
>>
>>
>> Many Thanks.
>>
>> -Yi
>>
>> Yi Zhu wrote:
>>>
>>> Hi Ian,
>>>
>>> I think I've found the performance issue in my last experiment. It is
>>> caused by the long-tail effect: since the total execution time is measured
>>> from when the first job is submitted until the last job finishes, some
>>> nodes sit idle as the run approaches its end, because there are not enough
>>> jobs left in the queue to run. When the ratio (number of nodes / total
>>> number of jobs) is high, this problem has a larger effect. In our first
>>> experiment there were 100 jobs to run, so the long-tail problem had a
>>> large effect on the "100 nodes" test, and the performance was not as good
>>> as we expected.
>>>
>>> I've put the details on the wiki site:
>>> http://dsl-wiki.cs.uchicago.edu/index.php/Performance_Comparison:Remote_Usage%2C_NFS%2C_S3-fuse%2C_EBS
>>> (see the bottom)
>>>
>>> 2.
>>>
>>> In traditional Swift usage, data needs to be transferred to the remote
>>> site before a run (stage-in) and transferred back after it finishes
>>> (stage-out). The remote side does not directly access the user's computer,
>>> because the user may not have a reliable network, or there may be delay
>>> and jitter during network transmission. So using the same traditional
>>> approach when the data is stored on S3 may not be the optimal solution.
>>> Since there is a reliable connection between S3 and EC2, we could let the
>>> worker nodes directly access the data in the S3 bucket rather than staging
>>> in before execution and staging out after execution is done.
>>>
>>> Since this involves modifying the Swift source code, Mike, can we arrange
>>> a discussion about it tomorrow?
>>>


-- 
Allan M. Espinosa <http://allan.88-mph.net/blog>
PhD student, Computer Science
University of Chicago <http://people.cs.uchicago.edu/~aespinosa>


