[Swift-devel] Swift-issues (PBS+NFS Cluster)

Tim Freeman tfreeman at mcs.anl.gov
Tue May 12 11:53:29 CDT 2009


On Tue, 12 May 2009 11:42:00 -0500
yizhu <yizhu at cs.uchicago.edu> wrote:

> That's great. It makes file management between EC2 and S3 much easier; I 
> will definitely check it out.
> 

Note that every I/O op to S3 will be significantly slower than EBS (and
probably also slower than the local scratch disk).  Add in FUSE (userspace) and
that only compounds the problem.  So if you use this option I would suggest
taking measurements to make sure it is acceptable for the task at hand.
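
If it helps, a rough way to compare them is below; the mount points are
just placeholders for wherever you put the s3fs bucket, an EBS volume,
and the local scratch disk:

  # write and read back ~100MB on each mount and compare timings
  # (as root, flush the page cache between runs with
  #  "sync; echo 3 > /proc/sys/vm/drop_caches" so reads aren't cached)
  for d in /mnt/s3bucket /mnt/ebs /mnt/scratch; do
      echo "== $d =="
      time dd if=/dev/zero of=$d/testfile bs=1M count=100
      time dd if=$d/testfile of=/dev/null bs=1M
      rm -f $d/testfile
  done

Lots of small files will hurt the S3/FUSE case far more than one big
file does, so also test with something shaped like your real workload.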

Tim

> Michael Wilde wrote:
> > Maybe relevant - there's a FUSE filesystem to mount an S3 bucket as a 
> > filesystem:
> > 
> > http://code.google.com/p/s3fs/wiki/FuseOverAmazon
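> > 
> > Judging from that page, usage is roughly as follows; treat the bucket
> > name, mount point, and credential setup as placeholders, since the
> > exact options may differ by s3fs version:
> > 
> >   # credentials: s3fs reads them from a password file or environment
> >   # variables; see the wiki for the exact setup
> >   mkdir -p /mnt/s3bucket
> >   s3fs your-bucket-name /mnt/s3bucket
> >   # ...read and write files under /mnt/s3bucket...
> >   umount /mnt/s3bucket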
> > 
> > - Mike
> > 
> > 
> > On 5/12/09 11:19 AM, Ioan Raicu wrote:
> >> Hi,
> >>
> >> yizhu wrote:
> >>> Hi,
> >>>
> >>> I've now got the Swift system running on Amazon EC2, with Swift 
> >>> installed on the head node of the cluster. I will soon let the user 
> >>> submit jobs from their own machine, after I solve the Globus 
> >>> authentication issues.
> >>>
> >>>
> >>> I think my next step is to write a sample Swift script to check 
> >>> whether we can have Swift grab input files from S3, run the job, and 
> >>> then write the output files back to S3.
> >>>
> >>>
> >>> Since each Amazon virtual node only has limited storage space (Small 
> >>> instance: 1.7 GB, Large instance: 7.5 GB), 
> >> Are you sure? That sounds like the amount of RAM each instance 
> >> gets. The last time I used EC2 (more than a year ago), each instance 
> >> had a disk of 100 GB+, which could be treated as a scratch disk 
> >> that would be lost when the instance was powered down.
> >>> we may need to use EBS (Elastic Block Store) to store temp files 
> >>> created by Swift. An EBS volume behaves like a hard disk and can be 
> >>> mounted by any virtual node. But a problem arises here: since a 
> >>> volume can only be attached to one instance at a time [1], files 
> >>> stored on an EBS volume mounted by one node can't be shared with 
> >>> any other nodes; 
> >> The EBS sounds like the same thing as the local disk, except that it 
> >> is persisted in S3, and can be recovered when the instance is started 
> >> again later.
> >>
> >> You can use these local disks, or EBS disks, to create a 
> >> shared/parallel file system, or manage them yourself.
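> >> 
> >> For reference, attaching an EBS volume to a single node looks roughly 
> >> like this with the EC2 API tools (the size, zone, volume/instance IDs, 
> >> and device name below are just placeholders):
> >> 
> >>   ec2-create-volume -s 10 -z us-east-1a
> >>   ec2-attach-volume vol-xxxxxxxx -i i-xxxxxxxx -d /dev/sdf
> >>   mkfs.ext3 /dev/sdf                  # first use only
> >>   mkdir -p /mnt/ebs && mount /dev/sdf /mnt/ebs
> >> 
> >> Since only one instance can have the volume attached at a time, sharing 
> >> it across nodes means re-exporting that mount point over NFS or similar.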
> >>> we lose the file-sharing ability, which I think is a fundamental 
> >>> requirement for Swift.
> >> The last time I worked with the Workspace Service (now part of 
> >> Nimbus), we were able to use NFS to create a shared file system across 
> >> our virtual cluster. This allowed us to run Swift without 
> >> modifications on our virtual cluster. Some of the more recent work on 
> >> Swift might have eased up some of the requirements for a shared file 
> >> system, so you might be able to run Swift without a shared file 
> >> system, if you configure Swift just right. Mike, is this true, or are 
> >> we not quite there yet?
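> >> 
> >> For the NFS setup itself, it was basically the standard recipe, roughly 
> >> the following (the export path, subnet, and hostname are placeholders, 
> >> and the init script name varies by distro):
> >> 
> >>   # on the head node
> >>   echo "/home/swiftwork 10.0.0.0/24(rw,sync,no_root_squash)" >> /etc/exports
> >>   exportfs -ra
> >>   /etc/init.d/nfs restart
> >> 
> >>   # on each worker node
> >>   mkdir -p /home/swiftwork
> >>   mount -t nfs head-node:/home/swiftwork /home/swiftwork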
> >>
> >> Ioan
> >>>
> >>>
> >>> -Yi
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> Michael Wilde wrote:
> >>>>
> >>>>
> >>>> On 5/7/09 12:54 PM, Tim Freeman wrote:
> >>>>> On Thu, 07 May 2009 11:39:40 -0500
> >>>>> Yi Zhu <yizhu at cs.uchicago.edu> wrote:
> >>>>>
> >>>>>> Michael Wilde wrote:
> >>>>>>> Very good!
> >>>>>>>
> >>>>>>> Now, what kind of tests can you do next?
> >>>>>> Next, I will try to get Swift running on Amazon EC2.
> >>>>>>
> >>>>>>> Can you exercise the cluster with an interesting workflow?
> >>>>>> Yes. Are there any more complex samples/tools I can use (rather 
> >>>>>> than first.swift) to test Swift performance? Is there any benchmark 
> >>>>>> available I can compare against?
> >>>>>>
> >>>>>>> How large of a cluster can you assemble in a Nimbus workspace ?
> >>>>>> Since the VM image I use to test Swift is based on an NFS shared 
> >>>>>> file system, the performance may not be satisfactory if we have a 
> >>>>>> large cluster. After I get Swift running on Amazon EC2, I will try 
> >>>>>> to make a dedicated VM image using GPFS or another shared file 
> >>>>>> system you recommend.
> >>>>>>
> >>>>>>
> >>>>>>> Can you aggregate VM's from a few different physical clusters 
> >>>>>>> into one Nimbus workspace?
> >>>>>> I don't think so. Tim may comment on it.
> >>>>>
> >>>>> There is some work going on right now to make auto-configuration 
> >>>>> easier to do over multiple clusters (that is possible now, it's just 
> >>>>> very 'manual' and non-ideal, unlike with one physical cluster). You 
> >>>>> wouldn't really want to do NFS across a WAN, though.
> >>>>
> >>>> Indeed. Now that I think this through more clearly, one workspace == 
> >>>> one cluster == one Swift "site", so we could aggregate the resources 
> >>>> of multiple workspaces through Swift to execute a multi-site workflow.
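> >>>> 
> >>>> Concretely (file names here are only examples): define one sites.xml 
> >>>> pool entry per virtual cluster, list the application in tc.data for 
> >>>> each site, and run something like
> >>>> 
> >>>>   swift -sites.file clusters-sites.xml -tc.file tc.data workflow.swift
> >>>> 
> >>>> and Swift should then schedule jobs across all of the clusters.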
> >>>>
> >>>> - Mike
> >>>>
> >>>>>>
> >>>>>>> What's the largest cluster you can assemble with Nimbus?
> >>>>>> I am not quite sure; I will do some tests on it soon. Since it is 
> >>>>>> an EC2-like cloud, it should easily be configured as a cluster with 
> >>>>>> hundreds of nodes. Tim may comment on it.
> >>>>>
> >>>>> I've heard of EC2 deployments in the 1000s at once; it's up to your 
> >>>>> EC2 account limitations (they seem pretty efficient about raising 
> >>>>> your quota ever higher). The Nimbus installation at Teraport maxes 
> >>>>> out at 16; there are other 'science clouds' but I don't know their 
> >>>>> node counts. EC2 is the place where you will be able to really test 
> >>>>> scaling something.
> >>>>>
> >>>>> Tim
> >>>>
> >>>
> >>>
> >>
> > 
> 


