[Swift-devel] Data-aware scheduling in Swift ?

Emalayan Vairavanathan svemalayan at yahoo.com
Fri Mar 30 19:10:22 CDT 2012


Hi Matei,

I am currently working on evaluating the gains from locality on BG/P. I should be able to get some numbers today or tomorrow. This will help us make decisions.

Thank you
Emalayan.




________________________________
 From: Matei Ripeanu <matei.ripeanu at gmail.com>
To: 'Emalayan Vairavanathan' <svemalayan at yahoo.com>; mosastore at googlegroups.com; 'matei' <matei at ece.ubc.ca> 
Cc: swift-devel at ci.uchicago.edu 
Sent: Friday, 30 March 2012 4:04 AM
Subject: RE: [Swift-devel] Data-aware scheduling in Swift ?
 

Emalayan, Mike, Justin, all,
 
There are a number of points worth discussing before we fully embark on this: 
 
First:  We need to better understand what gains we can expect from locality on BG/P.  We know we have sizeable gains on our cluster with data stored on disk (and where we have much lower cross-section bandwidth).  I expect most of these gains to be preserved when we use RAM disks on our cluster, and to hold as long as we do not have to transfer huge volumes of data.  Unfortunately we can test this only with 20 nodes, and I have no good intuition about what will happen on BG/P at large scale.
 
Second: We should discuss how key having this feature in Swift on BG/P is for all the other points we want to prove in the paper.   I think only one of the patterns we want to optimize with cross-layer communication (e.g., the one for broadcast) can be demonstrated without it, while the other two (pipeline and gather) cannot.     On the other hand, is there a way to run our benchmark scripts on BG/P (I guess not) to demonstrate the potential gains if Swift implemented this? Or can we run (some of) the applications without Swift on our cluster?
 
Third:  I am afraid getting this functionality into Swift/Coasters is quite some work.  On the other hand, Mike suggests a relatively clear implementation path. (It will probably work for pipelines, but I’m not sure it will work for ‘gather’.)
 
What I suggest:  Let’s discuss three things among ourselves before embarking on changes to Swift/Coasters:  (1) how to increase the certainty that we’ll see performance gains if we implement this;  (2) whether there are ways to demonstrate (some of) what we want outside Swift;  (3) the schedule and priorities, which we should re-evaluate given that we have roughly four weeks to the deadline.
 
Let me know what you think,
 
-Matei   
 
From: Emalayan Vairavanathan [mailto:svemalayan at yahoo.com] 
Sent: March-29-12 7:06 PM
To: mosastore at googlegroups.com; matei
Cc: swift-devel at ci.uchicago.edu
Subject: Re: [Swift-devel] Data-aware scheduling in Swift ?
 
Thank you Jon, Mike and Justin.
 
Having this functionality would be really useful for us to demonstrate how useful extended attributes are in MosaStore in the long term. Further, this functionality is critical for our SC paper: we need it to support both the pipeline and reduce patterns.
 
Mike: I will be happy to help with this. In terms of effort and priorities, how much time would we need to spend to get this done? Is it feasible to target this for our SC paper?
 
Justin: We do have numbers for the difference between a local MosaStore access and a remote access on our cluster. This is what we have published in CCGrid 2012 (I have attached the paper). But we do not have numbers on BG/P. I can try it on BG/P and get back to you.
 
Matei: Do you have any suggestions?
 
Thank you
Emalayan
 

________________________________

From: Michael Wilde <wilde at mcs.anl.gov>
To: Emalayan Vairavanathan <svemalayan at yahoo.com> 
Cc: MosaStore <mosastore at googlegroups.com>; swift-devel at ci.uchicago.edu 
Sent: Thursday, 29 March 2012 9:15 AM
Subject: Re: [Swift-devel] Data-aware scheduling in Swift ?

Swift will place an app() call on any free node. (As Jon just replied, while I was writing this...)

If we want to do an experiment with some kind of data affinity, we can try the following hack:

- Stage-A returns the node that it ran on
- Swift script passes that as an arg, "preferredNode(nodeName)", to Stage-B
- scheduler tries to place Stage-B on the coaster named nodeName.

It's that last part that's the trickiest, as this will require a mod to the scheduler. And it gets trickier if the scheduler needs to try to defer Stage-B until nodeName can take a new job.  It *might* be easier, in a first pass, to only place Stage-B on nodeName if nodeName has a free job slot, else to place it anywhere.
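
The first-pass policy described above (prefer nodeName only if it has a free job slot, otherwise place the job anywhere) might be sketched roughly as follows. This is a minimal illustration in Python, not the coaster scheduler code: Worker, place_job, and release are hypothetical names invented for the sketch, and it assumes each node simply tracks a count of free job slots.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Worker:
    """Hypothetical stand-in for a coaster worker node."""
    name: str
    free_slots: int


def place_job(workers: List[Worker],
              preferred_node: Optional[str] = None) -> Optional[Worker]:
    """Pick a worker, honoring preferred_node only if it has a free slot.

    Returns None when no worker has a free slot; a real scheduler
    would queue the job in that case.
    """
    # First choice: the node that produced the input data, if it is free.
    if preferred_node is not None:
        for w in workers:
            if w.name == preferred_node and w.free_slots > 0:
                w.free_slots -= 1
                return w
    # Fallback: no deferral, just take any node with a free slot.
    for w in workers:
        if w.free_slots > 0:
            w.free_slots -= 1
            return w
    return None


def release(worker: Worker) -> None:
    """Give a job slot back when a job finishes."""
    worker.free_slots += 1


workers = [Worker("node-a", 1), Worker("node-b", 1)]
stage_a = place_job(workers)            # Stage-A runs on any free node
release(stage_a)                        # Stage-A finishes and reports its node
stage_b = place_job(workers, preferred_node=stage_a.name)
print(stage_a.name, stage_b.name)
```

Deferring Stage-B until the preferred node frees up would need a per-node wait queue instead of the immediate fallback, which is the trickier variant mentioned above.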

But all of this will require going into the coaster scheduler code.

I suggest we do this as a joint effort; I can try, with help from Mihael and Justin, to locate the code that we'd need to modify, if you are willing to do some experiments and hacking.

- Mike


----- Original Message -----
> From: "Emalayan Vairavanathan" <svemalayan at yahoo.com>
> To: swift-devel at ci.uchicago.edu
> Cc: "MosaStore" <mosastore at googlegroups.com>
> Sent: Thursday, March 29, 2012 10:59:41 AM
> Subject: [Swift-devel] Data-aware scheduling in Swift ?
> Hi All,
> 
> 
> I have a question about how Swift schedules computations.
> 
> 
> Suppose there are two computation stages, namely Stage-A and Stage-B, in
> an application. Stage-A produces the data and Stage-B consumes the
> data. Could you please tell me how Swift schedules these
> computations? Does it schedule Stage-A and Stage-B on the same node
> or on multiple nodes?
> Is it possible to configure Swift to schedule these computations
> on the same node (or is this the default behavior of Swift)?
> 
> 
> 
> 
> Thank you
> Emalayan
> 
> 
> 
> _______________________________________________
> Swift-devel mailing list
> Swift-devel at ci.uchicago.edu
> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel

-- 
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory
