[Swift-user] Performance of Swift

Sat Jun 7 05:18:29 CDT 2008

Thank you all for your replies ...

>
> I have a workflow that I developed in C.  I am thinking of using
> Swift to execute the workflow, so my thinking is that I first need to
> convert the code into a Swift script.
>
> More info about my workflow:
>
> The workflow consists of several major tasks:
>
> Task 1:  create 1000 unique strings, each 1000 bytes long.
> Task 2:  merge the strings: every 2 strings (A, B) exchange a segment at
> a certain point and produce 1 string (C) of the same length (1000 bytes).
> C then replaces B.
> Task 3:  duplicate the list of strings, so that we now have 2000 strings.
> Task 4:  randomly choose 1000 strings from the current 2000 strings.
> Task 5:  repeat Tasks 2, 3, and 4 N times (N is given); the output of
> Task 4 becomes the list of strings used in the next iteration.
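To make the data dependencies in Tasks 1 to 5 concrete, here is a minimal Python sketch of the loop. It is not the original C program. The function names, the lowercase alphabet, the seed, and the reading of "exchange a segment at a certain point" as a single random cut point are all assumptions made for illustration.

```python
import random
import string

STRING_LEN = 1000   # bytes per string (from the task description)
POPULATION = 1000   # strings kept after each iteration


def create_strings(rng, count=POPULATION, length=STRING_LEN):
    """Task 1: build `count` distinct random strings of `length` characters."""
    seen = set()
    while len(seen) < count:
        seen.add("".join(rng.choices(string.ascii_lowercase, k=length)))
    return sorted(seen)  # sorted only so the order is reproducible


def merge_pairs(rng, strings):
    """Task 2: for each pair (A, B), build C from A's head and B's tail,
    then let C replace B. The single random cut point is an assumption."""
    out = list(strings)
    for i in range(0, len(out) - 1, 2):
        a, b = out[i], out[i + 1]
        cut = rng.randrange(1, len(a))
        out[i + 1] = a[:cut] + b[cut:]  # C has the same length as A and B
    return out


def run(n_iterations, seed=0):
    """Tasks 1 to 5: each iteration consumes the output of the previous one."""
    rng = random.Random(seed)
    strings = create_strings(rng)
    for _ in range(n_iterations):                   # Task 5
        merged = merge_pairs(rng, strings)          # Task 2
        doubled = merged + merged                   # Task 3
        strings = rng.sample(doubled, POPULATION)   # Task 4
    return strings
```

In this sketch the iterations form a strict chain, while the pair merges inside Task 2 are independent of each other. Each merge is a very small string operation, so wrapping coarser units (for example a whole iteration, or a batch of pairs) is likely a more natural job granularity than one job per pair.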
>
>
> Do you think changing the whole program into a Swift script is necessary, or
> just certain sections?  Can I just put wrappers around certain tasks and use
> a Swift script to call these tasks?
>
> Will the performance be the same?
>
>
> Any suggestions will be really appreciated.
>
> Thanks,
> Jamal
>
>
> On 6/3/08, Ioan Raicu <iraicu at cs.uchicago.edu> wrote:
>>
>> Hi,
>> There are several papers from our group that show different
>> aspects of Swift's performance.  Here are a few:
>>
>>    - http://people.cs.uchicago.edu/~iraicu/publications/2008_NOVA08_book-chapter_Swift.pdf
>>       - Figure 9: Shows the memory footprint per job (aka task, or node
>>       in the DAG)
>>          - shows a memory footprint of 3.2KB per node
>>       - Figure 18: Shows a large-scale application
>>          - 20K tasks on 200 CPUs with an average task length of 200 seconds
>>          is a comfortable range for Swift and Falkon
>>          - we have more recent results, not published yet, with 16K
>>          tasks on 2048 CPUs and an average task length of 87 seconds, which worked
>>          well
>>    - http://people.cs.uchicago.edu/~iraicu/publications/2007_SWF07_Swift.pdf
>>       - Figure 6: Shows the speedup achieved with different task lengths
>>          - the conclusion is that with multi-level scheduling through the
>>          Falkon provider, even tasks only seconds long can achieve good
>>          speedup
>>       - Figure 7: Shows the throughput in tasks/sec achieved by Swift
>>          - shows Swift achieving 50+ tasks/sec throughput using Falkon
>>          - the paragraph right after this figure mentions that Swift
>>          running directly with GRAM2 and PBS can achieve 2 jobs/sec
>>          - the implication is that jobs typically take 15~60 seconds to
>>          start up, which reflects the cost of scheduling, scheduling cycles,
>>          and the local resource manager's (LRM) time to set up the remote nodes
>>          - there are also limits on how many jobs can be submitted at a
>>          time: each queued job might consume some resources on the LRM, or
>>          there might be policies in place that limit the number of jobs
>>          that can be queued
>>          - this means that aggressive throttling must take place, which in
>>          practice reduces the sustained rate at which Swift can
>>          submit/execute jobs to a single site to even lower than 2 jobs/sec
>>
>> So, to answer your question, the performance of Swift (and any other
>> workflow system) will rely heavily on how efficiently you can dispatch
>> jobs/tasks to remote resources, how long the jobs/tasks are, how data-intensive
>> the application is, and how much data movement must happen before and after
>> the job runs.  If you have a fast enough file system, and your application
>> execution times are small, you can expect anywhere from 1 to 50 jobs/sec
>> from Swift, depending on what technologies you use to interface between
>> Swift and the remote resources (e.g. GRAM, PBS, Condor, Falkon).
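As a rough illustration of how dispatch rate and task length interact, the figures quoted above can be plugged into a simple steady-state model: to keep P CPUs busy with tasks of length t seconds, about P / t tasks must be dispatched per second. The model and the function names below are assumptions made for illustration. It ignores data movement, queue limits, and throttling, so treat the results as back-of-the-envelope estimates only.

```python
def required_dispatch_rate(cpus, task_seconds):
    """Tasks/sec needed to keep `cpus` processors busy in steady state."""
    return cpus / task_seconds


def max_busy_cpus(dispatch_rate, task_seconds):
    """CPUs that a given dispatch rate (tasks/sec) can keep busy."""
    return dispatch_rate * task_seconds


def serial_chain_seconds(stages, task_seconds, overhead_seconds):
    """Wall time for a chain with no parallelism: every stage pays its own
    run time plus the per-job startup overhead."""
    return stages * (task_seconds + overhead_seconds)


# Figures quoted in the message above
print(required_dispatch_rate(200, 200))            # 200 CPUs, 200 s tasks -> 1.0
print(round(required_dispatch_rate(2048, 87), 1))  # 2048 CPUs, 87 s tasks -> 23.5
print(max_busy_cpus(2, 87))                        # at 2 jobs/sec, 87 s tasks -> 174

# Hypothetical serial workflow: 10 stages of 1 s each, 15 s startup per job
print(serial_chain_seconds(10, 1, 15))             # -> 160
```

Under these assumptions, a workflow with no parallelism is bounded by per-job overhead rather than by dispatch throughput, so what matters is the ratio of task length to startup cost.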
>>
>> Cheers,
>> Ioan
>>
>>
>> J A wrote:
>>
>>  Hi All:
>>
>>
>> Based on my reading, the performance of executing a Swift workflow
>> depends on the parallelism that the workflow has.
>>
>>
>>
>> Suppose I have a workflow that contains several processors, where each processor
>> (procedure) depends on the previous one (the output of processor "A" is the
>> input for processor "B", and so on).
>>
>>
>>
>> How does the performance of Swift in this case compare to other systems
>> that execute workflows, given that there isn't any parallelism in the workflow?
>>
>>
>>
>> --
>> Thanks,
>>
>> Jamal
>>
>> ------------------------------
>>
>> _______________________________________________
>> Swift-user mailing list
>> Swift-user at ci.uchicago.edu
>> http://mail.ci.uchicago.edu/mailman/listinfo/swift-user
>>
>>
>> --
>> ===================================================
>> Ioan Raicu
>> Ph.D. Candidate
>> ===================================================
>> Distributed Systems Laboratory
>> Computer Science Department
>> University of Chicago
>> 1100 E. 58th Street, Ryerson Hall
>> Chicago, IL 60637
>> ===================================================
>> Email: iraicu at cs.uchicago.edu
>> Web:   http://www.cs.uchicago.edu/~iraicu
>>        http://dev.globus.org/wiki/Incubator/Falkon
>>        http://dsl-wiki.cs.uchicago.edu/index.php/Main_Page
>> ===================================================
>>
>>
>>
>