[Swift-devel] swift-on-ec2

Ioan Raicu iraicu at cs.uchicago.edu
Thu May 17 11:10:16 CDT 2007



Kate Keahey wrote:
>
>
> Ian Foster wrote:
>> Kate:
>>
>> I want to emphasize that I was *not* dismissing the issues below as 
>> distractions.
>>
>> What I meant was: given that you are working on developing a "virtual 
>> cluster", which I am pretty sure will be able to execute Swift apps, 
>> let's focus on getting that done, rather than worrying about "special 
>> casing" it for Falkon, adding dynamic node acquisition, or the other 
>> things that people started discussing as potential extensions.
>
> We have only now really begun to discuss how to use VMs with
> Swift/Falkon -- the original set of issues you posted was just what
> was needed; it clearly inspired a very good discussion, and made me
> realize that I should have been talking to a wider set of people about
> this. Please don't go back on us now... It also looks to me like there
> may be solutions that make more sense from the perspective of the
> architecture and will also be easier to implement with the current
> state of virtualization tools. For example, if we can set up Falkon to
> provision single nodes operating in pull mode (pulling work from a
> "master"), various contextualization issues become much easier.
>
>>
>> I understand from our IM conversation today that the "virtual 
>> cluster" is ready for us in a "static environment" such as some 
>> machines in our lab. In a "dynamic environment" such as EC2, it is 
>> not quite ready for use yet. Thus, you won't be able to get Swift 
>> running on EC2 tomorrow.
>
> This is not quite accurate; static refers to statically assigned IPs
> -- we have control over our IPs and can assign them to the cluster
> nodes in the same way each time we deploy it. Amazon will choose new
> IPs for the nodes each time the cluster is deployed, so each time the
> configuration of the cluster will have to be adjusted to reflect the
> different IP assignment to the nodes (but if we were to change the IPs
> on the cluster nodes in a local environment, it would be just as dynamic).
>
> But if you deploy just one node (e.g., a node operating in pull mode,
> as in the example above), the need for this configuration adjustment
> may go away (depending on what the node does), so everything may
> become much simpler.
Currently, upon bootstrapping, a Falkon executor (the worker code) makes 
one WS call to the Falkon dispatcher (running in a GT4 container) to 
register its name and the port on which its notification engine is 
listening.  Once registered, the executor goes into listen mode and only 
acts (sends WS calls out) when it receives a notification.  So the VMs 
that run the Falkon executors can get DHCP addresses, and the 
registration message will carry all the information the dispatcher needs 
to contact the respective executor.  The one configuration parameter we 
must have is the location of the Falkon dispatcher.  If it runs in a 
static location (a well-known machine and port), then this can be 
hard-coded into the bootstrapping scripts and no further configuration 
is needed!  If the dispatcher does not have a static resource to run on 
(e.g., it runs in another VM), then its location needs to be passed to 
the executor bootstrapping scripts.
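
To make this flow concrete, here is a minimal sketch (in Python) of what
the executor-side bootstrapping amounts to.  This is *not* the actual
Falkon executor code: the default dispatcher host/port, the plain-socket
"REGISTER" message, and the function name are invented for illustration,
and the real executor makes a WS call to the dispatcher in the GT4
container rather than opening a raw socket.

    import socket
    import sys

    # Dispatcher location: hard-coded default for a static deployment,
    # or passed in by the bootstrapping script for a dynamic one.
    # (Host name and port below are placeholders.)
    DISPATCHER_HOST = sys.argv[1] if len(sys.argv) > 1 else "falkon.example.org"
    DISPATCHER_PORT = int(sys.argv[2]) if len(sys.argv) > 2 else 50001

    def register_and_listen():
        # Open the port the notification engine will listen on.  Letting
        # the OS pick a free port is fine even on a DHCP address, since
        # the chosen port is reported back in the registration message.
        listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        listener.bind(("", 0))
        listener.listen(5)
        my_name = socket.getfqdn()
        my_port = listener.getsockname()[1]

        # The single outbound registration call: "here is my name and
        # the port my notification engine is listening on."  (A stand-in
        # for the real WS call.)
        reg = socket.create_connection((DISPATCHER_HOST, DISPATCHER_PORT))
        reg.sendall(("REGISTER %s %d\n" % (my_name, my_port)).encode())
        reg.close()

        # From here on the executor is passive: it sits in listen mode
        # and only acts (makes outbound calls for work) when notified.
        while True:
            conn, _addr = listener.accept()
            notification = conn.recv(4096)
            conn.close()
            # ... fetch and run the task described by the notification ...

    if __name__ == "__main__":
        register_and_listen()

The point is simply that the only piece of state the bootstrapping
script must be handed is the dispatcher's location; everything else (the
worker's DHCP-assigned address and its notification port) travels in the
registration message itself.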

Ioan
>
> We can spend some time looking at deploying a VM on EC2 if it is of 
> interest (as well as deploying a VM via the workspace service if that 
> is of interest), we can run things on the deployed VM, etc. But I 
> *strongly* argue that we spend at least some time defining what we 
> want from this project, what is realistic to have in the short-term, 
> what will be hard/impossible/inconvenient and try to build it 
> systematically. Then we can figure out who does what and by when this 
> is going to be done.
>
>
>>
>> Ian.
>>
>>
>> Kate Keahey wrote:
>>> Ian,
>>>
>>> Below, you seem to be referring to the necessary /etc/hosts
>>> configuration, as well as workers registering with the Torque
>>> headnode, as "distractions" -- I agree they can be very distracting,
>>> but in my experience, without these distractions a cluster (virtual
>>> or physical) won't work in the way such clusters are typically
>>> expected to work.
>>>
>>> What I said in my mail is that we can set up a base cluster locally 
>>> so that somebody like Ioan can finish the configuration (i.e., 
>>> install Falkon on it). We will configure this cluster once and leave 
>>> it deployed  as long as needed.
>>>
>>> Once we have the front-end to EC2 working (which we don't have yet 
>>> although we are close) we will deploy this cluster on EC2 and 
>>> provide methods that will automate this last little bit of 
>>> configuration that *always* has to be done on deployment.
>>>
>>> I also think it is quite important that we spend the time tomorrow 
>>> discussing what exactly we are trying to do -- right now, it looks 
>>> to me like it might make more sense to not use clusters (it will 
>>> help with the "distractions" if we don't).
>>>
>>> I realize that you are eager for us to get things to run -- I am 
>>> eager too, but I honestly think we will get there faster if we plan 
>>> better.
>>>
>>> Ian Foster wrote:
>>>> Kate:
>>>>
>>>> I personally will be delighted if you could run the virtual cluster
>>>> on EC2 tomorrow. I know that there are lots of ways you could
>>>> refine its config, local experiments that could be performed, etc.,
>>>> but perhaps we could try bypassing those things, which seem somewhat
>>>> like distractions to me?
>>>>
>>>> Ian
>>>>
>>>>
>>>> Sent via BlackBerry from T-Mobile
>>>>
>>>> -----Original Message-----
>>>> From: Kate Keahey <keahey at mcs.anl.gov>
>>>> Date: Wed, 16 May 2007 09:24:02
>>>> To: itf at mcs.anl.gov
>>>> Cc: swift-devel-bounces at ci.uchicago.edu, Ioan Raicu 
>>>> <iraicu at cs.uchicago.edu>,  swift-devel at ci.uchicago.edu, Borja 
>>>> Sotomayor <borja at borjanet.com>
>>>> Subject: Re: [Swift-devel] swift-on-ec2
>>>>
>>>>
>>>>
>>>> Ian Foster wrote:
>>>>> Kate:
>>>>>
>>>>> If we configure the virtual cluster with a full LRM, as you
>>>>> propose (and it seems you have already done--great work!), then we
>>>>> can use this to start Falkon executors--as we do today on regular
>>>>> clusters. So it seems to me that we have all we need. How about
>>>>> you and Ioan spend your time on Thursday running something on EC2,
>>>>> to make sure it works?
>>>>
>>>> As I suggest below, I think it would be easiest if we could deploy
>>>> and debug a small static cluster locally first, and we can probably
>>>> give it a shot tomorrow. We still don't have access to the Xen
>>>> nodes on TeraPort (although hopefully that might change by
>>>> tomorrow), but I asked Rick to rebuild a couple of nodes at ANL and
>>>> he did; for a test, I think that should give us enough resources to
>>>> play with.
>>>>
>>>> At the same time -- if there are multiple ways of doing this, and 
>>>> perhaps better ways than simply using a virtual cluster, we should 
>>>> discuss them now. It is not completely clear to me what the 
>>>> relationship between Falkon and Swift is, and what the specific 
>>>> objectives are (other than that dynamically provisioning resources 
>>>> is required). It looks at this point like the objectives probably 
>>>> overlap with what Ioan, Borja and I wanted to talk about (which I 
>>>> thought was a separate project, but am thrilled to find out is 
>>>> related) so how about we come up with a design tomorrow and post 
>>>> the notes on this list (is this a good venue for that?) and then 
>>>> others can shoot them down.
>>>>
>>>>> Regarding choice of LRM: have you looked at SGE? That is what 
>>>>> quite a few others seem to be using.
>>>>
>>>> Yes, we have. We also collaborate with others who do, as well as 
>>>> with Sun... As you may remember, Borja did the scheduling work for 
>>>> his thesis in the context of SGE. Last time we talked though, 
>>>> Torque was the scheduler of choice for the virtual cluster LRM so 
>>>> we used that.
>>>>
>>>> The usage of SGE you are referring to above -- is this in the 
>>>> context of virtualization projects, or as LRM for various 
>>>> Falkon-related applications?
>>>>
>>>>> Ian
>>>>>
>>>>>
>>>>>
>>>>> Sent via BlackBerry from T-Mobile
>>>>>
>>>>> -----Original Message-----
>>>>> From: Kate Keahey <keahey at mcs.anl.gov>
>>>>> Date: Tue, 15 May 2007 23:28:07
>>>>> To: iraicu at cs.uchicago.edu
>>>>> Cc: swift-devel at ci.uchicago.edu
>>>>> Subject: Re: [Swift-devel] swift-on-ec2
>>>>>
>>>>> First -- this is a very useful discussion; would it be possible to
>>>>> see all of it? We need to understand the requirements and
>>>>> trade-offs in some detail to figure out the best way to make this
>>>>> work. I see a few different interaction threads somewhat mixed up
>>>>> here, though, so just to make sure we are all on the same
>>>>> wavelength, here is some context.
>>>>>
>>>>> Ian and I have been talking on and off about providing a workspace
>>>>> service implementation with an EC2 backend. The benefit of that
>>>>> would be that users could deploy the same VMs using the same
>>>>> interface to either TeraPort or EC2 or yet another resource
>>>>> provider. The workspace service would also provide some features
>>>>> on top of EC2 (translating between PKI credentials and Amazon's
>>>>> paying accounts, contextualization as needed to make deployment
>>>>> dynamic). One application of interest for this was Swift. The last
>>>>> time we chatted about this, though, was in the context of using
>>>>> EC2 to provide a production platform for STAR runs (since
>>>>> virtualizing enough TeraPort to provide a production platform is
>>>>> taking a long time). This is what Tim and I are trying to make
>>>>> happen now.
>>>>>
>>>>> Since there was also interest in running Swift in VMs, Mike, Tibi
>>>>> and I met around February/March and agreed that a reasonable way
>>>>> to proceed would be for us to stand up a base virtual cluster
>>>>> somewhere locally (e.g., a static deployment on TeraPort) so that
>>>>> they can finish the configuration according to their needs, look
>>>>> at performance, figure out the best way to interact with it, and
>>>>> make sure that there are no VM-induced gotchas. All of this will
>>>>> be much easier to assess locally and on a static deployment. Then
>>>>> we'd make sure the cluster is dynamically deployable using the
>>>>> workspace service (on EC2 or whatever other provider). During the
>>>>> meeting (and over the following emails) we agreed that the required
>>>>> "base cluster" would be configured with GRAM/Torque on the
>>>>> headnode plus a number of worker nodes, plus root privileges. We
>>>>> configured this cluster and it is ready to deploy. Are you saying
>>>>> now that in fact something different is needed?
>>>>>
>>>>> As Ian says, Borja and I were planning to meet with Ioan on 
>>>>> Thursday to discuss interaction between Falkon and the workspace 
>>>>> service (not necessarily/exclusively in the EC2 context). I don't 
>>>>> completely understand the relationship between Swift and Falkon -- 
>>>>> are there specific applications or scenarios that you are trying 
>>>>> to target in this exercise?
>>>>>
>>>>> Ioan Raicu wrote:
>>>>>> Hi,
>>>>>> See below:
>>>>>>
>>>>>> Tim Freeman wrote:
>>>>>>> On Tue, 15 May 2007 16:20:03 +0000 (GMT)
>>>>>>> Ben Clifford <benc at hawaga.org.uk> wrote:
>>>>>>>
>>>>>>>  
>>>>>>>> Ian asked about this elsewhere, but it's perhaps interesting for 
>>>>>>>> swift-devel people to look at the questions too.
>>>>>>>>
>>>>>>>> On Tue, 15 May 2007, Ian Foster wrote:
>>>>>>>>
>>>>>>>>  
>>>>>>>>> Dear All:
>>>>>>>>>
>>>>>>>>
>>>>>>>>> I asked Kate if she and Tim could look into creating VM images 
>>>>>>>>> that would allow us to run Swift applications on Amazon EC2. I 
>>>>>>>>> think Kate is meeting with Ioan about this on Thursday (?).
>>>>>>>>>
>>>>>>>>
>>>>>>>>> One issue that I thought would be good to discuss is what we'd
>>>>>>>>> want in that VM image. Perhaps this is obvious to the rest of
>>>>>>>>> you, but it isn't to me. A few thoughts:
>>>>>>>>>   * I'm assuming that we want to run "workers" on EC2 nodes,
>>>>>>>>>     and have the "task dispatch" logic run on some external
>>>>>>>>>     frontend system outside EC2.
>>>>>>>>>   * I would think that we want to use Falkon to do the task
>>>>>>>>>     dispatch. If so, we need a Falkon executor on each VM,
>>>>>>>>>     configured to check in with the Falkon dispatcher.
>>>>>>>>>     (Alternatively, we could use, say, SGE: in that case, we
>>>>>>>>>     would want an SGE agent.)
>>>>>>>>>   * We need a way of getting data to and from the worker nodes.
>>>>>>>>>     Do we want to run a file system across the EC2 nodes and
>>>>>>>>>     the external frontend node? That seems rather inefficient.
>>>>>>>>>     Other options?
>>>>>>>>>   * Should we preload the application code on each EC2 node?
>>>>>>>>>
>>>>>>>> Here are a couple of approaches:
>>>>>>>>
>>>>>>>>  1) Swift regards all the EC2 nodes that we are paying for as a
>>>>>>>> single site.
>>>>>>>>
>>>>>>>> Something like Falkon handles all the task dispatch and worker
>>>>>>>> node management. I don't know what that looks like at the 
>>>>>>>> moment in Falkon, but the interface for Swift to send jobs into 
>>>>>>>> Falkon sounds pretty straightforward and shouldn't need changing.
>>>>>>>>     
>>>>>>> So if I understand, here there would be no gateway+LRM but each 
>>>>>>> EC2 node +
>>>>>>> Falkon would need a port open to receive tasks?  Or does each 
>>>>>>> node pull down
>>>>>>> instructions OK from behind a firewall?
>>>>>>>   
>>>>>> Falkon supports both polling and notifications.  To use 
>>>>>> notifications, there needs to be an open port on the worker :(
>>>>>>> Is there a latency problem with running each node as an
>>>>>>> independent task
>>>>>>> receiver with the dispatcher off-site from EC2?  I would think
>>>>>>> it would be
>>>>>>> better to put the queues that are filled with tasks on EC2 so it
>>>>>>> can more quickly get
>>>>>>> the next task going when a node is done with a previous task (I
>>>>>>> may be missing some
>>>>>>> nuances here with respect to Falkon, don't know much about this
>>>>>>> yet!).
>>>>>> We have run the Falkon dispatcher at UChicago and workers at ANL 
>>>>>> without any issues, so it can easily tolerate a few ms of 
>>>>>> latency.  We haven't tried it across 10s of ms of latency links, 
>>>>>> but my instinct says that if you have enough workers, you might 
>>>>>> be able to hide the latency.  We'd have to experiment with it to 
>>>>>> see what happens.  We could potentially do some experiments 
>>>>>> between SDSC and ANL over a 50+ ms link, and see what difference 
>>>>>> in throughputs we get.
>>>>>>
>>>>>> Ioan
>>>>>>> If a gateway node is desired, this option sounds a lot like the 
>>>>>>> GRAM+LRM
>>>>>>> situation we use on VMs with the workspace service and will soon 
>>>>>>> use on EC2 via
>>>>>>> the workspace EC2 gateway we're implementing.  Start up one 
>>>>>>> gateway node and
>>>>>>> then add compute nodes which dynamically join the pool; they are
>>>>>>> pointed to the
>>>>>>> GRAM node.
>>>>>>>
>>>>>>>  
>>>>>>>> All the nodes in a site are required by our site model to have
>>>>>>>> a shared filesystem - we've talked about removing that
>>>>>>>> requirement, but I think it is still the case and, if so, it
>>>>>>>> isn't going to change soon.
>>>>>>> Setting up a shared filesystem in this environment is akin to 
>>>>>>> setting up the
>>>>>>> compute nodes to join an LRM pool.  The VMs can communicate over 
>>>>>>> the private
>>>>>>> network at EC2; you can instruct EC2 to let all the nodes be 
>>>>>>> open to each other
>>>>>>> (while simultaneously keeping a separate policy of blocking 
>>>>>>> ports from being
>>>>>>> open from the internet and other people's EC2 nodes).  The 
>>>>>>> non-file-serving
>>>>>>> nodes would simply need to know the private address of the 
>>>>>>> filesystem server
>>>>>>> (unless you are using a fancier network file system than 
>>>>>>> NFS-style ones).
>>>>>>> For background: every VM on EC2 currently gets a public address 
>>>>>>> -- NAT'd to a
>>>>>>> private address which is actually what the VM's one NIC is 
>>>>>>> configured with.
>>>>>>> There is a facility to open/forward specific network ports on 
>>>>>>> the public
>>>>>>> address to each VM either via a group policy or on a VM by VM 
>>>>>>> basis.
>>>>>>>
>>>>>>> [...]
>>>>>>>> Amazon also has a storage cloud, alongside its compute cloud. I 
>>>>>>>> know very little about that and have never thought about how it 
>>>>>>>> would fit into the above (if at all). Maybe someone else knows 
>>>>>>>> more.
>>>>>>>>     
>>>>>>> A VM template on EC2 is called an AMI, which stands for Amazon 
>>>>>>> Machine Image.
>>>>>>> This is just a packaging thing but what it mostly means is that 
>>>>>>> the VM is
>>>>>>> stored on S3 and also registered into the EC2 system.
>>>>>>>
>>>>>>> When starting an instance of an AMI, the file is copied from S3 
>>>>>>> to the
>>>>>>> hypervisor node (what we call propagation in the workspace 
>>>>>>> service).  After it
>>>>>>> is used, this file is deleted (an option in the workspace 
>>>>>>> service but there is
>>>>>>> also an option to save it back with any changes). So the VMs are 
>>>>>>> stored in S3 but anything that happens on them after being
>>>>>>> started is lost unless you manually do something about it.
>>>>>>>
>>>>>>> As for free scratch space, you get a good amount per node, 
>>>>>>> 140G.  But the node
>>>>>>> could go down at any moment just like a physical resource.
>>>>>>>
>>>>>>> To harness S3 for safely persisting any data (or if you need 
>>>>>>> more space) you
>>>>>>> would need to actually run S3 clients on the VMs when they are 
>>>>>>> run on EC2.  You
>>>>>>> could alternatively mirror data between nodes assuming that all 
>>>>>>> would not go
>>>>>>> down at once.
>>>>>>> The good thing is that you do not pay transfer costs between S3
>>>>>>> and EC2; if you
>>>>>>> chose to use S3 for big storage, you would only pay the "housing
>>>>>>> fees", so to
>>>>>>> speak.
>>>>>>> Tim
>>>>>>> _______________________________________________
>>>>>>> Swift-devel mailing list
>>>>>>> Swift-devel at ci.uchicago.edu
>>>>>>> http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel
>>>>>>>
>>>>>>>   
>>>>
>>>
>>
>

-- 
============================================
Ioan Raicu
Ph.D. Student
============================================
Distributed Systems Laboratory
Computer Science Department
University of Chicago
1100 E. 58th Street, Ryerson Hall
Chicago, IL 60637
============================================
Email: iraicu at cs.uchicago.edu
Web:   http://www.cs.uchicago.edu/~iraicu
       http://dsl.cs.uchicago.edu/
============================================
============================================



