[Swift-devel] Swift-issues (PBS+NFS Cluster)

yizhu yizhu at cs.uchicago.edu
Mon May 11 18:08:36 CDT 2009


Hi,

I've now got the Swift system running on Amazon EC2, with Swift installed 
on the head node of the cluster. I will soon let users submit jobs from 
their own machines, once I solve the Globus authentication issues.


I think my next step is to write a sample Swift script to check whether 
we can have Swift grab input files from S3, execute the application, and 
then write the output files back to S3.
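As a rough sketch of that stage-in/execute/stage-out pattern, a per-task wrapper script like the following could be invoked on a worker node. This is only an illustration, not tested against S3: it assumes the s3cmd client is installed and configured with the account's keys, and the bucket name, file names, and the `tr` command standing in for the real application are all placeholders.

```shell
#!/bin/sh
# Hypothetical wrapper for one Swift task on a worker node.
# Assumes s3cmd is installed and configured; bucket/key names are
# placeholders, and tr stands in for the real application.
set -e

BUCKET=s3://my-swift-bucket   # placeholder bucket name

# Stage in: copy the input file from S3 to local scratch
s3cmd get "$BUCKET/input.txt" input.txt

# Execute the application step (stand-in command)
tr 'a-z' 'A-Z' < input.txt > output.txt

# Stage out: copy the result back to S3
s3cmd put output.txt "$BUCKET/output.txt"
```

With something like this in place, Swift would only need to see the local files on the node, and the wrapper would handle the S3 transfers at either end of the task.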


Since each Amazon virtual node has only limited storage space (1.7 GB 
for a Small instance, 7.5 GB for a Large instance), we may need to use 
EBS (Elastic Block Store) to store the temp files created by Swift. An 
EBS volume behaves like a hard disk and can be mounted by any virtual 
node. But here a problem arises: since a volume can only be attached to 
one instance at a time [1], files stored on an EBS volume mounted by one 
node cannot be shared with any other node; we lose the file-sharing 
ability, which I think is a fundamental requirement for Swift.
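For reference, the EBS workflow described above looks roughly like the following with the EC2 API tools. These commands are a sketch only; the volume size, availability zone, and volume/instance/device IDs are placeholders.

```shell
# Hypothetical EBS setup (EC2 API tools, plus root on the instance);
# IDs, size, and zone below are placeholders.

# Create a 10 GB EBS volume in the instance's availability zone
ec2-create-volume -s 10 -z us-east-1a

# Attach it to ONE instance -- attaching the same volume to a second
# instance fails, which is the single-attach limit discussed above
ec2-attach-volume vol-12345678 -i i-87654321 -d /dev/sdf

# On the instance: make a filesystem and mount it as Swift's work area
mkfs.ext3 /dev/sdf
mkdir -p /mnt/swiftwork
mount /dev/sdf /mnt/swiftwork
```

So EBS solves the per-node capacity problem, but by itself it cannot replace the shared filesystem that NFS currently provides to the cluster.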


-Yi





Michael Wilde wrote:
> 
> 
> On 5/7/09 12:54 PM, Tim Freeman wrote:
>> On Thu, 07 May 2009 11:39:40 -0500
>> Yi Zhu <yizhu at cs.uchicago.edu> wrote:
>>
>>> Michael Wilde wrote:
>>>> Very good!
>>>>
>>>> Now, what kind of tests can you do next?
>>> Next, I will try to get Swift running on Amazon EC2.
>>>
>>>> Can you exercise the cluster with an interesting workflow?
>>> Yes. Is there any complex sample or tool I can use (rather than 
>>> first.swift) to test Swift performance? Are there any benchmarks 
>>> available that I can compare against?
>>>
>>>> How large of a cluster can you assemble in a Nimbus workspace ?
>>> Since the vm-image I use to test Swift is based on an NFS shared file 
>>> system, the performance may not be satisfactory if we have a large 
>>> cluster. After I get Swift running on Amazon EC2, I will try to make 
>>> a dedicated vm-image using GPFS or any other shared file system you 
>>> recommend.
>>>
>>>
>>>> Can you aggregate VM's from a few different physical clusters into 
>>>> one Nimbus workspace?
>>> I don't think so. Tim may comment on it.
>>
>> There is some work going on right now making auto-configuration easier 
>> to do
>> over multiple clusters (that is possible now, it's just very 'manual' and
>> non-ideal unlike with one physical cluster).  You wouldn't really want 
>> to do NFS
>> across a WAN, though.
> 
> Indeed. Now that I think this through more clearly, one workspace == one 
> cluster == one Swift "site", so we could aggregate the resources of 
> multiple workspaces through Swift to execute a multi-site workflow.
> 
> - Mike
> 
>>>
>>>> What's the largest cluster you can assemble with Nimbus?
>>> I am not quite sure; I will do some tests on it soon. Since it is an 
>>> EC2-like cloud, it should be easy to configure a cluster with 
>>> hundreds of nodes. Tim may comment on it.
>>
>> I've heard of EC2 deployments in the 1000s at once; it's up to your 
>> EC2 account limitations (they seem pretty efficient at raising your 
>> quota ever higher).  The Nimbus installation at Teraport maxes out at 
>> 16; there are other 'science clouds' but I don't know their node 
>> counts.  EC2 is where you will really be able to test scaling 
>> something.
>>
>> Tim
> 
