[Swift-user] Performance of Swift

J A jamalphd at gmail.com
Tue Jun 3 15:41:44 CDT 2008


Thank you all for your replies ...

I have a workflow that I developed in C.  I am thinking of using
Swift to execute the workflow, so my assumption is that I first need to
rewrite the code as a Swift script.

More info about my workflow:

The workflow consists of several major tasks:

Task 1:  create 1000 unique strings, each 1000 bytes long.
Task 2:  merge the strings: every 2 strings (A, B) exchange a segment
at a certain point and produce 1 string (C) of the same length
(1000 bytes).  Then C replaces B.
Task 3:  duplicate the list of strings, so that we now have 2000 strings.
Task 4:  randomly choose 1000 strings from the current 2000 strings.
Task 5:  repeat Tasks 2, 3, and 4 N times (N is given); the list of
strings used in the next iteration is the output of Task 4.
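
For readers following the thread, the five tasks above can be sketched roughly as below. This is an illustrative Python sketch, not the poster's C code. The function names, the pairing of consecutive strings, the random choice of the exchange point, and sampling without replacement are all assumptions, since the post does not specify them.

```python
import random

POP = 1000     # number of strings (from the post)
LENGTH = 1000  # bytes per string (from the post)


def create_strings(rng, count=POP, length=LENGTH):
    """Task 1: build `count` distinct random byte strings of `length` bytes."""
    strings, seen = [], set()
    while len(strings) < count:
        s = bytes(rng.getrandbits(8) for _ in range(length))
        if s not in seen:  # enforce uniqueness
            seen.add(s)
            strings.append(s)
    return strings


def merge_pairs(rng, strings):
    """Task 2: for each pair (A, B), build C from A's head and B's tail.

    Assumption: pairs are consecutive entries and the exchange point is
    chosen at random; C then replaces B, as described in the post.
    """
    out = list(strings)
    for i in range(0, len(out) - 1, 2):
        a, b = out[i], out[i + 1]
        point = rng.randrange(1, len(a))  # hypothetical crossover point
        out[i + 1] = a[:point] + b[point:]
    return out


def run(n_iterations, seed=0, count=POP, length=LENGTH):
    """Tasks 1-5: returns the list of strings after N iterations."""
    rng = random.Random(seed)
    strings = create_strings(rng, count, length)   # Task 1
    for _ in range(n_iterations):                  # Task 5
        merged = merge_pairs(rng, strings)         # Task 2
        doubled = merged + merged                  # Task 3
        strings = rng.sample(doubled, count)       # Task 4 (no replacement)
    return strings


if __name__ == "__main__":
    final = run(n_iterations=5)
    print(len(final), len(final[0]))
```

Under these assumptions, the pair merges inside Task 2 do not depend on one another within an iteration, while Tasks 3, 4, and the next iteration each need the previous step's full output.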


Do you think it is necessary to convert the whole program into a Swift
script, or just certain sections?  Can I just put wrappers around certain
tasks and call those tasks from a Swift script?

Will the performance be the same?


Any suggestions will be really appreciated.

Thanks,
Jamal





On 6/3/08, Ioan Raicu <iraicu at cs.uchicago.edu> wrote:
>
> Hi,
> There are several papers from our group that show different
> aspects of Swift's performance.  Here are a few:
>
>    -
>    http://people.cs.uchicago.edu/~iraicu/publications/2008_NOVA08_book-chapter_Swift.pdf
>       - Figure 9: Shows the memory footprint per job (aka tasks, or nodes
>       in the DAG graph)
>          - shows memory footprint of 3.2KB per node
>       - Figure 18: Shows a large scale application
>          - 20K tasks on 200 CPUs with average task lengths of 200 seconds
>          is a comfortable range for Swift and Falkon
>          - we have more recent results, not published yet, with 16K
>          tasks on 2048 CPUs and an average task length of 87 seconds, which
>          worked well
>    -
>    http://people.cs.uchicago.edu/~iraicu/publications/2007_SWF07_Swift.pdf
>       - Figure 6: Shows the speedup achieved with different task lengths
>          - the conclusion is that, using multi-level scheduling with the
>          Falkon provider, even tasks only seconds long can achieve good speedup
>       - Figure 7: Shows the throughput in tasks/sec achieved by Swift
>          - shows Swift achieving 50+ tasks/sec throughput using Falkon
>          - the paragraph right after this figure mentions that Swift
>          running directly with GRAM2 and PBS can achieve 2 jobs/sec; the
>          implication is that jobs typically take 15~60 seconds to start up,
>          which reflects the cost of scheduling, scheduling cycles, and the
>          local resource manager's (LRM) time to set up the remote nodes
>          - there are also limits on how many jobs can be submitted at a
>          time, as each queued job might consume some resources on the LRM, or
>          there might be policies in place that limit the number of jobs that
>          can be queued; this means that aggressive throttling must take place,
>          which in practice reduces the sustained rate at which Swift can
>          submit/execute jobs to a single site to even lower than 2 jobs/sec
>
> So, to answer your question, the performance of Swift (and any other
> workflow system) will rely heavily on how efficiently you can dispatch
> jobs/tasks to remote resources, how long jobs/tasks are, how data intensive
> the application is, and how much data movement must happen before the job
> runs and after.  If you have a fast enough file system, and your application
> execution times are small, you can expect anywhere from 1 to 50 jobs/sec
> from Swift, depending on what technologies you use to interface between
> Swift and the remote resources (e.g. GRAM, PBS, Condor, Falkon, etc).
>
> Cheers,
> Ioan
>
>
> J A wrote:
>
>  Hi All:
>
>
> Based on my reading, the performance of executing a Swift workflow
> depends on the parallelism that the workflow has.
>
>
>
> Suppose I have a workflow that contains several processors where each
> processor (procedure) depends on the previous one (the output of
> processor "A" is the input for processor "B", and so on).
>
>
>
> How does the performance of Swift in this case compare to other systems
> that execute workflows, when there isn't any parallelism in the workflow?
>
>
>
> --
> Thanks,
>
> Jamal
>
> ------------------------------
>
> _______________________________________________
> Swift-user mailing list
> Swift-user at ci.uchicago.edu
> http://mail.ci.uchicago.edu/mailman/listinfo/swift-user
>
>
> --
> ===================================================
> Ioan Raicu
> Ph.D. Candidate
> ===================================================
> Distributed Systems Laboratory
> Computer Science Department
> University of Chicago
> 1100 E. 58th Street, Ryerson Hall
> Chicago, IL 60637
> ===================================================
> Email: iraicu at cs.uchicago.edu
> Web:   http://www.cs.uchicago.edu/~iraicu
>        http://dev.globus.org/wiki/Incubator/Falkon
>        http://dsl-wiki.cs.uchicago.edu/index.php/Main_Page
> ===================================================
> ===================================================
>
>
>