[Swift-devel] swift-on-ec2

Tue May 15 23:28:07 CDT 2007

First -- this is a very useful discussion; would it be possible to see 
all of it? We need to understand the requirements and trade-offs in some 
detail to figure out the best way to make this work. I see a few 
different interaction threads somewhat mixed up here, though, so just to 
make sure we are all on the same wavelength, here is some context.

Ian and I have been talking on and off about providing a workspace 
service implementation with an EC2 backend. The benefit of that would be 
that users could deploy the same VMs using the same interface to either 
TeraPort or EC2 or yet another resource provider. The workspace service 
would also provide some features on top of EC2 (translating between PKI 
credentials and Amazon's paying accounts, contextualization as needed to 
make deployment dynamic). One application of interest for this was 
Swift. The last time we chatted about this, though, was in the context of 
using EC2 to provide a production platform for STAR runs (since 
virtualizing enough TeraPort to provide a production platform is taking 
a long time). This is what Tim and I are trying to make happen now.

Since there was also interest in running Swift in VMs, Mike, Tibi and I 
met around February/March and agreed that a reasonable way to proceed 
would be for us to stand up a base virtual cluster somewhere locally 
(e.g., a static deployment on TeraPort) so that they can finish the 
configuration according to their needs, look at performance, figure out 
the best way to interact with it, and make sure that there are no 
VM-induced gotchas. All of this will be much easier to assess locally 
and on a static deployment. Then we'd make sure the cluster is 
dynamically deployable using the workspace service (on EC2 or whatever 
other provider). During the meeting (and over following emails) we 
agreed that the required "base cluster" would be configured with 
GRAM/Torque on the headnode plus a number of worker nodes, plus root 
privileges. We configured this cluster and it is ready to deploy. Are 
you saying now that in fact something different is needed?

As Ian says, Borja and I were planning to meet with Ioan on Thursday to 
discuss interaction between Falkon and the workspace service (not 
necessarily/exclusively in the EC2 context). I don't completely 
understand the relationship between Swift and Falkon -- are there 
specific applications or scenarios that you are trying to target in this 
exercise?

Ioan Raicu wrote:
> Hi,
> See below:
> 
> Tim Freeman wrote:
>> On Tue, 15 May 2007 16:20:03 +0000 (GMT)
>> Ben Clifford <benc at hawaga.org.uk> wrote:
>>
>>  
>>> Ian asked about this elsewhere, but it's perhaps interesting for 
>>> swift-devel people to look at the questions too.
>>>
>>> On Tue, 15 May 2007, Ian Foster wrote:
>>>
>>>    
>>>> Dear All:
>>>>
>>>> I asked Kate if she and Tim could look into creating VM images that 
>>>> would allow us to run Swift applications on Amazon EC2. I think Kate 
>>>> is meeting with Ioan about this on Thursday (?).
>>>>
>>>> One issue that I thought would be good to discuss is what we'd want 
>>>> in that VM image. Perhaps this is obvious to the rest of you, but it 
>>>> isn't to me. A few thoughts:
>>>>       * I'm assuming that we want to run "workers" on EC2 nodes, and
>>>>         have the "task dispatch" logic run on some external frontend
>>>>         system outside EC2.
>>>>       * I would think that we want to use Falkon to do the task
>>>>         dispatch. If so, we need a Falkon executor on each VM,
>>>>         configured to check in with the Falkon dispatcher.
>>>>         (Alternatively, we could use, say, SGE: in that case, we would
>>>>         want an SGE agent.)
>>>>       * We need a way of getting data to and from the worker nodes. Do
>>>>         we want to run a file system across the EC2 nodes and the
>>>>         external frontend node? That seems rather inefficient. Other
>>>>         options?
>>>>       * Should we preload the application code on each EC2 node?
>>>>       
>>> Here's a couple of approaches:
>>>
>>>  1) Swift regards all the EC2 nodes that we are paying for as a
>>> single site.
>>>
>>> Something like falkon handles all the task dispatch and worker node 
>>> management. I don't know what that looks like at the moment in 
>>> Falkon, but the interface for Swift to send jobs into Falkon sounds 
>>> pretty straightforward and shouldn't need changing.
>>>     
>>
>> So if I understand, here there would be no gateway+LRM but each EC2 node +
>> Falkon would need a port open to receive tasks?  Or does each node pull down
>> instructions OK from behind a firewall?
>>   
> Falkon supports both polling and notifications.  To use notifications, 
> there needs to be an open port on the worker :(
>> Is there a latency problem with running each node as an independent task
>> receiver with the dispatcher off-site from EC2?  I would think it would be
>> better to put the queues to fill with tasks on EC2 so it can more quickly get
>> the task going when a node is done with a previous task (I may be missing some
>> nuances here with respect to Falkon, don't know much about this yet!).
> We have run the Falkon dispatcher at UChicago and workers at ANL without 
> any issues, so it can easily tolerate a few ms of latency.  We haven't 
> tried it across links with tens of ms of latency, but my instinct says that if 
> you have enough workers, you might be able to hide the latency.  We'd 
> have to experiment with it to see what happens.  We could potentially do 
> some experiments between SDSC and ANL over a 50+ ms link, and see what 
> difference in throughput we get.
> 
> Ioan
>> If a gateway node is desired, this option sounds a lot like the GRAM+LRM
>> situation we use on VMs with the workspace service and will soon use on EC2 via
>> the workspace EC2 gateway we're implementing.  Start up one gateway node and
>> then add compute nodes which dynamically join the pool; they are pointed to the
>> GRAM node.
>>
>>  
>>> All the nodes in a site are required by our site model to have a 
>>> shared filesystem - we've talked about removing it, but I think that 
>>> is still the case and if so, isn't going to change soon.     
>>
>> Setting up a shared filesystem in this environment is akin to setting up the
>> compute nodes to join an LRM pool.  The VMs can communicate over the private
>> network at EC2; you can instruct EC2 to let all the nodes be open to each other
>> (while simultaneously keeping a separate policy of blocking ports from being
>> open from the internet and other people's EC2 nodes).  The non-file-serving
>> nodes would simply need to know the private address of the filesystem server
>> (unless you are using a fancier network file system than NFS-style ones).
>>
>> For background: every VM on EC2 currently gets a public address -- NAT'd to a
>> private address which is actually what the VM's one NIC is configured with.
>> There is a facility to open/forward specific network ports on the public
>> address to each VM either via a group policy or on a VM by VM basis.
>>
>> [...]  
>>> Amazon also has a storage cloud, alongside its compute cloud. I know 
>>> very little about that and have never thought about how it would fit 
>>> into the above (if at all). Maybe someone else knows more.
>>>     
>>
>> A VM template on EC2 is called an AMI, which stands for Amazon Machine Image.
>> This is just a packaging thing but what it mostly means is that the VM is
>> stored on S3 and also registered into the EC2 system.
>>
>> When starting an instance of an AMI, the file is copied from S3 to the
>> hypervisor node (what we call propagation in the workspace service).  After it
>> is used, this file is deleted (an option in the workspace service but there is
>> also an option to save it back with any changes).
>>
>> So the VMs are stored in S3 but anything that happens on them after being
>> started is lost unless you manually do something about it.
>>
>> As for free scratch space, you get a good amount per node, 140G.  But the node
>> could go down at any moment just like a physical resource.
>>
>> To harness S3 for safely persisting any data (or if you need more space) you
>> would need to actually run S3 clients on the VMs when they are run on EC2.  You
>> could alternatively mirror data between nodes assuming that all would not go
>> down at once.
>>
>> The good thing is that you do not pay transfer costs between S3 and EC2 if you
>> choose to use S3 for big storage; you would only pay the "housing fees" so to
>> speak.
>>
>> Tim
>> _______________________________________________
>> Swift-devel mailing list
>> Swift-devel at ci.uchicago.edu
>> http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel
>>
>>   
> 

-- 

Kate Keahey,
Mathematics & CS Division, Argonne National Laboratory
Computation Institute, University of Chicago