[Swift-user] Performance of Swift
J A
jamalphd at gmail.com
Sat Jun 7 05:18:29 CDT 2008
Thank you all for your replies ...
>
> I have a workflow that I developed in C. I am thinking of using Swift to
> execute it, and my understanding is that I first need to convert the code
> into a Swift script.
>
> More info about my workflow:
>
> The workflow consists of several major tasks:
>
> Task 1: create 1000 unique strings, each 1000 bytes long.
> Task 2: merge the strings: every 2 strings (A, B) exchange a segment at a
> certain point and produce 1 string (C) of the same length (1000 bytes).
> C then replaces B.
> Task 3: duplicate the list of strings, so that we now have 2000 strings.
> Task 4: randomly choose 1000 strings from the current 2000 strings.
> Task 5: repeat Tasks 2, 3, and 4 N times (N is given), where the list of
> strings used in each iteration is the output of Task 4 from the previous
> one.
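For readers following the thread, here is a minimal Python sketch of Tasks 1 to 5 as described above. It is not the original C code. Several details are assumptions because the description leaves them open: strings are paired consecutively (0 with 1, 2 with 3, ...), the exchange point is chosen at random, C is the head of A joined to the tail of B, and Task 4 samples without replacement. All function names are hypothetical.

```python
import random

STRING_LEN = 1000  # bytes per string (from the task description)
POP_SIZE = 1000    # number of strings kept between iterations


def create_strings(rng, count=POP_SIZE, length=STRING_LEN):
    """Task 1: create `count` unique random byte strings of `length` bytes."""
    seen = set()
    strings = []
    while len(strings) < count:
        s = bytes(rng.getrandbits(8) for _ in range(length))
        if s not in seen:  # enforce uniqueness
            seen.add(s)
            strings.append(s)
    return strings


def merge_pairs(strings, rng):
    """Task 2: for each pair (A, B), build C and store it in B's place."""
    out = list(strings)
    for i in range(0, len(out) - 1, 2):  # assumed: consecutive pairs
        a, b = out[i], out[i + 1]
        point = rng.randrange(1, len(a))     # assumed: random exchange point
        out[i + 1] = a[:point] + b[point:]   # C has the same length as A and B
    return out


def run(n_iterations, count=POP_SIZE, length=STRING_LEN, seed=0):
    """Task 5: repeat Tasks 2, 3 and 4 `n_iterations` times."""
    rng = random.Random(seed)
    strings = create_strings(rng, count, length)
    for _ in range(n_iterations):
        merged = merge_pairs(strings, rng)     # Task 2
        doubled = merged + merged              # Task 3: 2 * count strings
        strings = rng.sample(doubled, count)   # Task 4: assumed without replacement
    return strings
```

Under these assumptions each iteration consumes the output of the previous one, so the outer loop is inherently sequential. Only the pair merges inside Task 2 are independent of one another.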
>
>
> Do you think it is necessary to convert the whole program into a Swift
> script, or just certain sections? Can I just put wrappers around certain
> tasks and use a Swift script to call them?
>
> Will the performance be the same?
>
>
> Any suggestions would be greatly appreciated.
>
> Thanks,
> Jamal
>
> On 6/3/08, Ioan Raicu <iraicu at cs.uchicago.edu> wrote:
>>
>> Hi,
>> There are several papers out there from our group that show different
>> aspects of the performance of Swift. Here are a few:
>>
>> - http://people.cs.uchicago.edu/~iraicu/publications/2008_NOVA08_book-chapter_Swift.pdf
>>    - Figure 9: shows the memory footprint per job (aka task, or node
>>      in the DAG)
>>       - shows a memory footprint of 3.2KB per node
>>    - Figure 18: shows a large-scale application
>>       - 20K tasks on 200 CPUs with an average task length of 200
>>         seconds is a comfortable range for Swift and Falkon
>>       - we have more recent results, not published yet, with 16K
>>         tasks on 2048 CPUs and an average task length of 87 seconds,
>>         which worked well
>> - http://people.cs.uchicago.edu/~iraicu/publications/2007_SWF07_Swift.pdf
>>    - Figure 6: shows the speedup achieved with different task lengths
>>       - the conclusion is that, using multi-level scheduling with the
>>         Falkon provider, even tasks only seconds long can achieve good
>>         speedup
>>    - Figure 7: shows the throughput in tasks/sec achieved by Swift
>>       - shows Swift achieving throughputs of 50+ tasks/sec using Falkon
>>       - the paragraph right after this figure mentions that Swift
>>         running directly with GRAM2 and PBS can achieve 2 jobs/sec; the
>>         implication of this is that jobs typically take 15~60 seconds
>>         to start up, which reflects the cost of scheduling, scheduling
>>         cycles, and the local resource manager's (LRM) time to set up
>>         the remote nodes; there are also limits on how many jobs can be
>>         submitted at a time, as each queued job might consume some
>>         resources on the LRM, or there might be policies in place that
>>         limit the number of jobs that can be queued; this means that
>>         aggressive throttling must take place, which in practice
>>         reduces the sustained rate at which Swift can submit/execute
>>         jobs to a single site to even lower than 2 jobs/sec
>>
>> So, to answer your question, the performance of Swift (and of any other
>> workflow system) will rely heavily on how efficiently you can dispatch
>> jobs/tasks to remote resources, how long the jobs/tasks are, how data
>> intensive the application is, and how much data movement must happen
>> before and after each job runs. If you have a fast enough file system, and
>> your application execution times are small, you can expect anywhere from 1
>> to 50 jobs/sec from Swift, depending on what technologies you use to
>> interface between Swift and the remote resources (e.g. GRAM, PBS, Condor,
>> Falkon, etc).
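As a rough illustration of how dispatch rate and task length interact, the back-of-the-envelope sketch below relates the figures quoted in this message. It assumes a simple steady-state model in which keeping P CPUs busy with tasks of average length T seconds needs about P / T dispatches per second. It ignores data movement, queueing and startup delays, and the function names are hypothetical.

```python
def required_dispatch_rate(cpus, avg_task_seconds):
    """Tasks/sec needed to keep `cpus` CPUs busy (steady-state estimate)."""
    return cpus / avg_task_seconds


def max_busy_cpus(dispatch_rate, avg_task_seconds):
    """CPUs that a given dispatch rate (tasks/sec) can keep busy."""
    return dispatch_rate * avg_task_seconds


# Figures taken from the message above:
print(required_dispatch_rate(200, 200))   # 200 CPUs, 200 s tasks
print(required_dispatch_rate(2048, 87))   # 2048 CPUs, 87 s tasks
print(max_busy_cpus(2, 87))               # 2 jobs/sec, 87 s tasks
```

Under this model the 200-CPU run needs about 1 task/sec and the 2048-CPU run about 23.5 tasks/sec, while a provider limited to 2 jobs/sec could keep at most about 174 CPUs busy with 87-second tasks.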
>>
>> Cheers,
>> Ioan
>>
>>
>> J A wrote:
>>
>> Hi All:
>>
>>
>> Based on my reading, the performance of executing a Swift workflow
>> depends on the parallelism that the workflow has.
>>
>> Suppose I have a workflow that contains several processors (procedures),
>> where each processor depends on the previous one (the output of
>> processor "A" is the input of processor "B", and so on).
>>
>> How does the performance of Swift in this case compare to other systems
>> that execute workflows, given that there isn't any parallelism in the
>> workflow?
>>
>>
>>
>> --
>> Thanks,
>>
>> Jamal
>>
>> ------------------------------
>>
>> _______________________________________________
>> Swift-user mailing list
>> Swift-user at ci.uchicago.edu
>> http://mail.ci.uchicago.edu/mailman/listinfo/swift-user
>>
>>
>> --
>> ===================================================
>> Ioan Raicu
>> Ph.D. Candidate
>> ===================================================
>> Distributed Systems Laboratory
>> Computer Science Department
>> University of Chicago
>> 1100 E. 58th Street, Ryerson Hall
>> Chicago, IL 60637
>> ===================================================
>> Email: iraicu at cs.uchicago.edu
>> Web: http://www.cs.uchicago.edu/~iraicu
>>      http://dev.globus.org/wiki/Incubator/Falkon
>>      http://dsl-wiki.cs.uchicago.edu/index.php/Main_Page
>> ===================================================
>>
>>
>>
>