[Swift-devel] Swift-issues (PBS+NFS Cluster)

Michael Wilde wilde at mcs.anl.gov
Tue May 12 11:32:37 CDT 2009


Maybe relevant - there's a FUSE filesystem to mount an S3 bucket as a 
filesystem:

http://code.google.com/p/s3fs/wiki/FuseOverAmazon
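
If the bucket is mounted via s3fs on the submit host, Swift could in
principle map files in the bucket directly and stage them like any other
local files. A minimal sketch, assuming the bucket is mounted at /mnt/s3
(the mount point, paths, and app here are all hypothetical):

  type file;

  // hypothetical app: read a file from the mounted bucket and write the
  // result back into it
  app (file o) copy (file i) {
      cat @i stdout=@o;
  }

  file data <"/mnt/s3/input/data.txt">;
  file result <"/mnt/s3/output/result.txt">;
  result = copy(data);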

- Mike


On 5/12/09 11:19 AM, Ioan Raicu wrote:
> Hi,
> 
> yizhu wrote:
>> Hi,
>>
>> I've now got Swift running on Amazon EC2, with Swift installed on the
>> head node of the cluster. I will soon let the user submit jobs from their
>> own machine, once I solve the Globus authentication issues.
>>
>>
>> I think my next step is to write a sample Swift script to check whether
>> we can have Swift grab input files from S3, run the job, and then write
>> the output files back to S3.
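>>
>> A minimal sketch of what such a test script might look like (the app and
>> file names are hypothetical, and how the files are actually staged to and
>> from S3 is exactly the question to answer):
>>
>> type file;
>>
>> // hypothetical app: copy the input file to the output file
>> app (file o) process (file i) {
>>     cat @i stdout=@o;
>> }
>>
>> file data <"input.txt">;
>> file result <"output.txt">;
>> result = process(data);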
>>
>>
>> Since each Amazon virtual node has only limited storage space (1.7 GB
>> for a Small instance, 7.5 GB for a Large instance), 
> Are you sure? That sounds like the amount of RAM each instance gets. The
> last time I used EC2 (more than a year ago), each instance had a disk of
> 100GB+, which could be treated as a scratch disk that would be lost when
> the instance was powered down.
>> we may need to use EBS (Elastic Block Store) to store temp files created
>> by Swift. An EBS volume behaves like a hard disk and can be mounted by
>> any virtual node. But a problem arises here: since a volume can only be
>> attached to one instance at a time [1], the files stored on an EBS volume
>> mounted by one node cannot be shared with any other nodes; 
> The EBS sounds like the same thing as the local disk, except that it is 
> persisted in S3, and can be recovered when the instance is started again 
> later.
> 
> You can use these local disks, or EBS disks, to create a shared/parallel 
> file system, or manage them yourself.
>> we lose the file-sharing ability, which I think is a fundamental
>> requirement for Swift.
> The last time I worked with the Workspace Service (now part of Nimbus), 
> we were able to use NFS to create a shared file system across our 
> virtual cluster. This allowed us to run Swift without modifications on 
> our virtual cluster. Some of the more recent work on Swift may have
> relaxed the requirement for a shared file system, so you might be able to
> run Swift without one if you configure it just right. Mike, is this true,
> or are we not quite there yet?
> 
> Ioan
>>
>>
>> -Yi
>>
>>
>>
>>
>>
>> Michael Wilde wrote:
>>>
>>>
>>> On 5/7/09 12:54 PM, Tim Freeman wrote:
>>>> On Thu, 07 May 2009 11:39:40 -0500
>>>> Yi Zhu <yizhu at cs.uchicago.edu> wrote:
>>>>
>>>>> Michael Wilde wrote:
>>>>>> Very good!
>>>>>>
>>>>>> Now, what kind of tests can you do next?
>>>>> Next, I will try to get Swift running on Amazon EC2.
>>>>>
>>>>>> Can you exercise the cluster with an interesting workflow?
>>>>> Yes. Are there any more complex samples or tools I can use (rather
>>>>> than first.swift) to test Swift performance? Is there any benchmark
>>>>> available that I can compare against?
>>>>>
>>>>>> How large of a cluster can you assemble in a Nimbus workspace ?
>>>>> Since the vm-image I use to test Swift is based on an NFS shared file
>>>>> system, the performance may not be satisfactory if we have a
>>>>> large-scale cluster. After I get Swift running on Amazon EC2, I will
>>>>> try to make a dedicated vm-image using GPFS or any other shared file
>>>>> system you recommend.
>>>>>
>>>>>
>>>>>> Can you aggregate VM's from a few different physical clusters into 
>>>>>> one Nimbus workspace?
>>>>> I don't think so. Tim may comment on it.
>>>>
>>>> There is some work going on right now making auto-configuration easier
>>>> to do over multiple clusters (that is possible now, it's just very
>>>> 'manual' and non-ideal, unlike with one physical cluster). You wouldn't
>>>> really want to do NFS across a WAN, though.
>>>
>>> Indeed. Now that I think this through more clearly, one workspace == 
>>> one cluster == one Swift "site", so we could aggregate the resources 
>>> of multiple workspaces through Swift to execute a multi-site workflow.
>>>
>>> - Mike
>>>
>>>>>
>>>>>> What's the largest cluster you can assemble with Nimbus?
>>>>> I am not quite sure; I will do some tests on it soon. Since it is an
>>>>> EC2-like cloud, it should easily be configured as a cluster with
>>>>> hundreds of nodes. Tim may comment on it.
>>>>
>>>> I've heard of EC2 deployments in the 1000s at once; it's up to your
>>>> EC2 account limits (they seem pretty efficient at raising your quota
>>>> ever higher). The Nimbus installation at Teraport maxes out at 16
>>>> nodes; there are other 'science clouds', but I don't know their node
>>>> counts. EC2 is the place where you will be able to really test scaling
>>>> something.
>>>>
>>>> Tim
>>>
>>
>>
> 


