[AG-DEV] Parallelize some of the functions of the Start-Up for the Venue Client

Andrew A Rowley Andrew.Rowley at manchester.ac.uk
Wed Oct 4 04:59:48 CDT 2006


Hi,

I just realised that if the bridges are connected peer-to-peer and use UMTP where multicast is not available, you could actually make the bridge client another bridge in the peer-to-peer network.  This client-side bridge would use a UMTP link to another bridge whenever multicast is unavailable, and so would bridge the client's multicast streams into the meeting.
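
For concreteness, here is a rough Python sketch of the idea; everything in it (PeerBridge, probe_multicast, the probe group and port) is made up for illustration rather than taken from the AG3 code:

import socket
import struct

MCAST_GROUP = '233.252.0.1'   # documentation multicast range, not an assigned channel
MCAST_PORT = 50000

def probe_multicast(timeout=2.0):
    # Join the probe group, send a probe packet and wait for it to arrive.
    # NB: multicast loopback means this really only verifies the local
    # stack; a proper check needs a remote peer to echo the probe back.
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.bind(('', MCAST_PORT))
    mreq = struct.pack('4sl', socket.inet_aton(MCAST_GROUP), socket.INADDR_ANY)
    s.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
    s.settimeout(timeout)
    try:
        try:
            s.sendto('probe', (MCAST_GROUP, MCAST_PORT))
            data, addr = s.recvfrom(16)
            return data == 'probe'
        except socket.timeout:
            return False
    finally:
        s.close()

class PeerBridge:
    # The client-side bridge: use multicast when it works, otherwise
    # relay each packet over unicast to a peer bridge (UMTP-style).
    def __init__(self, peer_addr):
        self.peer_addr = peer_addr        # (host, port) of a peer bridge
        self.use_tunnel = not probe_multicast()
        self.sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    def forward(self, packet):
        if self.use_tunnel:
            self.sock.sendto(packet, self.peer_addr)
        else:
            self.sock.sendto(packet, (MCAST_GROUP, MCAST_PORT))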

Andrew :)

============================================
Access Grid Support Centre,
RSS Group,
Manchester Computing,
Kilburn Building,
University of Manchester,
Oxford Road,
Manchester, 
M13 9PL, 
UK
Tel: +44(0)161-275 0685
Email: Andrew.Rowley at manchester.ac.uk 

> -----Original Message-----
> From: Andrew Rowley [mailto:Andrew.Rowley at manchester.ac.uk]
> Sent: 03 October 2006 22:07
> To: Thomas D. Uram
> Cc: Brian Corrie; Jason Bell; ag-dev at mcs.anl.gov
> Subject: Re: [AG-DEV] Parallelize some of the functions of the Start-Up
> for the Venue Client
> 
> Hi,
> 
> There is one assumption being made with this implementation and with using
> the unicast bridge closest to you: that all of the bridges can speak to
> each other perfectly in multicast.  If this is not the case, and two people
> connect to two bridges that can't speak to each other, they will not be
> able to communicate.  If this issue can be resolved, there should not be a
> problem with having a single registry (or multiple peer-to-peer connected
> registries).
> 
> One way of doing this would be to have each bridge connect to the other
> bridges in a peer-to-peer fashion, with the registries included in the
> bridges themselves.  The bridges could then monitor the multicast traffic
> between themselves on a specially assigned multicast channel (using the
> peer-to-peer network to check that the traffic is getting through).  If
> two people connect to two bridges that can't speak in multicast, the
> bridges could tunnel to each other in unicast (e.g. using UMTP), creating
> a fully connected network (I think this is like VRVS reflectors).  Note
> that this could also be useful for temporarily poor multicast: packet
> transfer can be monitored over the peer-to-peer network, and if the loss
> becomes large, the bridges can switch to UMTP transport until the loss
> drops again.  Another cool thing is that if all clients connected to their
> local bridge and unicast were being used between the bridges, the link
> between any two bridges should carry no more traffic than if multicast
> were being used (I think).
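>
> A rough sketch of the switch-over I have in mind, with made-up threshold
> values; lossRate would come from the monitoring over the peer-to-peer
> network:
>
> def chooseTransport(lossRate, current, bad=0.10, good=0.02):
>     # Two thresholds give hysteresis, so bridges don't flap between
>     # multicast and the UMTP tunnel on every measurement.
>     if current == 'multicast' and lossRate > bad:
>         return 'umtp'        # multicast is lossy: tunnel in unicast
>     if current == 'umtp' and lossRate < good:
>         return 'multicast'   # multicast has recovered: switch back
>     return current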
> 
> Without this method, it is actually better for each venue to have only one
> bridge, so that all users join the same bridge and connectivity is
> ensured, although this will increase the load on that bridge.  This is the
> reason I was suggesting that the venue server should give out the registry
> URL.
> 
> Only once we can guarantee connectivity between unicast bridges can we
> talk about having users use their closest bridge from a large list.
> Otherwise you are going to get a lot of new users just selecting "Use
> Unicast" and finding that it doesn't work.
> 
> Andrew :)
> 
> Quoting "Thomas D. Uram" <turam at mcs.anl.gov>:
> 
> > This concerns me for a couple of reasons:
> >
> > - Which bridges does a new user on a new venue server get?  Perhaps none.
> > Or maybe new venue servers default to using the Argonne bridge (but I
> > really dislike hard-coding service URLs into the installers).
> >
> > - If the WestGrid bridges are down, or multicast to them is problematic,
> > is the user just stuck?  They should certainly have the option of using
> > some other bridges, too.
> >
> > - This arrangement assumes that the users are "near" to the venue
> > server, and would therefore like to use bridges near the venue server.
> > Every user of the Argonne venue server would use the Argonne bridges,
> > even though they might prefer to use their local bridges (especially if
> > they are very distant from Argonne).
> >
> > These concerns could be alleviated by:
> >
> > - allowing venue servers to specify a set of bridge networks to query
> > - having the p2p bridge network do some localization relative to the
> > venue clients, so they get bridges reasonably close to them in network
> > terms (see the sketch below)
> > - allowing clients to customize their bridge selection (with the
> > downside that this requires manual intervention by new users).
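> >
> > For the localization point, even a crude client-side sort by measured
> > round-trip time would help; a minimal sketch, where rtt() stands in for
> > whatever ping/RPC timing the clients end up doing:
> >
> > def sortByNearest(bridges, rtt):
> >     # Measure each bridge once; rtt(b) returns seconds, or None if
> >     # the bridge did not answer in time (those are dropped).
> >     times = {}
> >     for b in bridges:
> >         times[b] = rtt(b)
> >     reachable = [b for b in bridges if times[b] is not None]
> >     reachable.sort(key=lambda b: times[b])
> >     return reachable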
> >
> > This conversation is very good and helpful; let's keep it up.
> >
> > Tom
> >
> >
> >
> > On 9/8/06 5:54 PM, Brian Corrie wrote:
> >> Hi all,
> >>
> >> I like this idea as well - associating a registry with a venue
> >> server seems to make the most sense to me. Then when one goes to the
> >> Argonne venue server, they use the Argonne-associated bridges, but
> >> when they use the WestGrid server they get our registry. This makes
> >> life much easier for the end users, as they don't have to change
> >> their settings; the venue server knows where the registry is...
> >>
> >> Gets my vote!
> >>
> >> Brian
> >>
> >>
> >> Andrew A Rowley wrote:
> >>> Hi,
> >>>
> >>> One thought I had would be to have the server give the client the
> >>> registry URL.  This would allow people who usually use one venue
> >>> server to use a different venue server, including the associated
> >>> bridges, without having to change their settings. Currently, if I
> >>> decide to put our bridge on our own separate registry, and someone
> >>> who usually uses the vv3 server wants to use our server, they
> >>> would have to manually go in and specify the new bridge registry so
> >>> that they could use our bridge (either through a new UI, or by
> >>> editing the file).  This adds an extra complication to the process.
> >>>
> >>> Andrew :)
> >>>
> >>> ============================================
> >>> Access Grid Support Centre,
> >>> RSS Group,
> >>> Manchester Computing,
> >>> Kilburn Building,
> >>> University of Manchester,
> >>> Oxford Road,
> >>> Manchester, M13 9PL, UK
> >>> Tel: +44(0)161-275 0685
> >>> Email: Andrew.Rowley at manchester.ac.uk
> >>>> -----Original Message-----
> >>>> From: owner-ag-dev at mcs.anl.gov [mailto:owner-ag-dev at mcs.anl.gov]
> >>>> On Behalf Of Thomas D. Uram
> >>>> Sent: 07 September 2006 23:35
> >>>> To: Jason Bell
> >>>> Cc: ag-dev at mcs.anl.gov
> >>>> Subject: Re: [AG-DEV] Parallelize some of the functions of the
> >>>> Start-Up for the Venue Client
> >>>>
> >>>>
> >>>> I'll clarify and suggest some things.  Feel free to comment or
> >>>> suggest alternatives.
> >>>>
> >>>> The base problem is well understood:  The single universal bridge
> >>>> network can be joined by anyone, including people whose bridge
> >>>> machines are behind a firewall.  Before 3.0.2, VenueClients pinged
> >>>> bridge machines and had a set timeout of one second, so even if this
> >>>> problem had occurred before, it was not noticeable (no one reported
> >>>> it, anyway).  As of 3.0.2, 'ping' has been swapped out for an RPC
> >>>> call to the bridge, and the timeout was inadvertently not carried
> >>>> over.  I've put together a modified RegistryClient.py that includes
> >>>> a one-second timeout; interested people can look here:
> >>>>
> >>>> http://www.mcs.anl.gov/~turam/ag3/registry/RegistryClient.py
> >>>>
> >>>> This fix should potentially be rolled out right away to overcome the
> >>>> problem with bridges behind firewalls.  I'd be interested to know if
> >>>> people verify this fix, so we can move ahead with it.
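> >>>>
> >>>> For anyone curious, the essence of the change is just a guarded RPC
> >>>> call; this is a minimal sketch rather than the actual
> >>>> RegistryClient.py code, and Ping() here is illustrative:
> >>>>
> >>>> import socket
> >>>> import xmlrpclib
> >>>>
> >>>> def ping_bridge(url, timeout=1.0):
> >>>>     # Cap the socket timeout so an unreachable (e.g. firewalled)
> >>>>     # bridge fails fast instead of hanging the VenueClient start-up.
> >>>>     old = socket.getdefaulttimeout()
> >>>>     socket.setdefaulttimeout(timeout)
> >>>>     try:
> >>>>         try:
> >>>>             return xmlrpclib.ServerProxy(url).Ping()
> >>>>         except Exception:
> >>>>             return -1
> >>>>     finally:
> >>>>         socket.setdefaulttimeout(old)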
> >>>>
> >>>> We shouldn't have to worry about the list of bridges growing to an
> >>>> unscalable number:  VenueClients request a max of 10 bridges
> >>>> from the registry.  This imposes a number of limitations, several of
> >>>> which we'll address in the coming development.
> >>>>
> >>>> Having clients time out is essential.  It has always been our plan
> >>>> to have the bridge network supported in more of a p2p style than the
> >>>> current registry.  This would remove the single point of failure that
> >>>> the registry now represents.  With this p2p model, there would be no
> >>>> central registry to determine whether a bridge is suitable (i.e., not
> >>>> behind a firewall), so clients must be able to quickly measure
> >>>> connectivity to a bridge and time out.  There's a fair amount of
> >>>> interesting work related to p2p here; if someone is knowledgeable and
> >>>> interested, let us know.
> >>>>
> >>>> It has also been our plan to let people run alternate bridge
> >>>> networks, and to configure venue clients to use an alternate bridge
> >>>> network.  Eventually it might make sense for them to also be able to
> >>>> use multiple bridge networks (e.g., the overarching AG bridge
> >>>> network, plus a private bridge network established within their
> >>>> institution).
> >>>>
> >>>> I'll try to get our plans written up and sent to the list in the
> >>>> near future, so they're open for comment before and during
> >>>> implementation.
> >>>>
> >>>> Tom
> >>>>
> >>>>
> >>>> On 9/5/06 9:07 AM, Jason Bell wrote:
> >>>>> G'day all
> >>>>>
> >>>>> I think I should mention that while testing my own AG 3 Bridge, I
> >>>>> was one of those "baddies" as well.
> >>>>>
> >>>>> What this highlighted is how easy it would be for someone to
> >>>>> "simply" create a "baddy" bridge without realising it.
> >>>>>
> >>>>> The purpose of my testing was to add documentation to my install
> >>>>> guide on how to configure a unicast bridge and Venue Server for
> >>>>> AG 3.  I am very reluctant to release any documentation that shows
> >>>>> how to configure a Bridge, as it may inadvertently create more
> >>>>> "baddies", thus making the AG3 VenueClient almost unusable due to
> >>>>> the long start-up time.
> >>>>>
> >>>>> Anyway, here are some constructive suggestions, in my opinion, on
> >>>>> ways we could possibly improve this:
> >>>>>
> >>>>> *    Having Load_Bridge() run/execute as a separate process/thread
> >>>>> which would operate independently of the start-up of the VenueClient
> >>>>> itself (see the sketch after this list).  The benefits of this could
> >>>>> be:
> >>>>>     -    Running as a separate process shouldn't affect the
> >>>>> performance of starting the Venue Client, etc.
> >>>>>     -    Also, you could re-run this function periodically, which
> >>>>> would continually update the list of bridges.  I have found recently
> >>>>> that the list of bridges is only as accurate as when the VenueClient
> >>>>> was first started.
> >>>>>
> >>>>> *    Another idea, based upon Rhys's suggestion: if something
> >>>>> doesn't respond after a short delay (rather than a time out), then
> >>>>> don't list the bridge.  This would then list only bridges that are
> >>>>> acceptable to use.  The downside of this is that the list of bridges
> >>>>> "WILL" grow and still cause some delay, though not as long.
> >>>>>
> >>>>> *    Based upon Andrew's suggestion, having the ability for a Bridge
> >>>>> to register with different registries would allow (I think) the
> >>>>> bridges to be assigned to various regions.  That way, a registry
> >>>>> could list "good" unicast bridges for each region, cutting down the
> >>>>> number of bridges tested and loaded.
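> >>>>>
> >>>>> A sketch of the background-thread idea from the first point above;
> >>>>> loadBridges() is a made-up stand-in for whatever Load_Bridge()
> >>>>> actually does:
> >>>>>
> >>>>> import threading
> >>>>>
> >>>>> def startBridgeRefresh(client, interval=60.0):
> >>>>>     # Load the bridge list off the main thread, then re-run
> >>>>>     # periodically so the list stays current after start-up.
> >>>>>     def refresh():
> >>>>>         client.loadBridges()              # hypothetical hook
> >>>>>         t = threading.Timer(interval, refresh)
> >>>>>         t.setDaemon(True)                 # don't block client exit
> >>>>>         t.start()
> >>>>>     first = threading.Timer(0, refresh)
> >>>>>     first.setDaemon(True)
> >>>>>     first.start()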
> >>>>>
> >>>>> I honestly think that the solution to this problem is most likely a
> >>>>> combination of some of the suggestions above, and possibly some
> >>>>> other ideas.
> >>>>>
> >>>>> Anyway, I think this is a very important issue, and hopefully we can
> >>>>> come up with some real "fixes" to the problem.
> >>>>>
> >>>>> Cheers,
> >>>>> Jason.
> >>>>>
> >>>>>
> >>>>> -----Original Message-----
> >>>>> From: Christoph Willing [mailto:willing at vislab.uq.edu.au]
> >>>>> Sent: Tuesday, 5 September 2006 4:54 PM
> >>>>> To: Rhys Hawkins
> >>>>> Cc: Jason Bell; ag-dev at mcs.anl.gov
> >>>>> Subject: Re: [AG-DEV] Parallelize some of the functions of the
> >>>>> Start-Up for the Venue Client
> >>>>>
> >>>>>
> >>>>> On 04/09/2006, at 3:45 PM, Rhys Hawkins wrote:
> >>>>>
> >>>>> [snip]
> >>>>>
> >>>>>> I've commented out line 125 in AccessGrid/Registry/
> >>>>>> RegistryClient.py, ie:
> >>>>>>     #self.bridges = self._sortBridges(maxToReturn)
> >>>>>> I don't know whether this will help your colleague or not. It
> >>>>>> certainly makes things quicker for me. If you're just testing a
> >>>>>> local bridge then you could just stick your local bridge description
> >>>>>> at the beginning of the list to fake the sort. Although this is just
> >>>>>> hacking for testing purposes, it doesn't solve the actual problem.
> >>>>>>
> >>>>>
> >>>>> I was doing something similar (VenueClient.py at line 199,
> >>>>> commenting out self.__LoadBridges()), which is fine if you don't
> >>>>> need bridges.  Today I needed to connect to a site whose only
> >>>>> visibility was through a bridge they were running, so I had to
> >>>>> reinstate the bridges but block the baddies.  I now have a list of
> >>>>> offending bridges inserted at the beginning of the
> >>>>> PingBridgeService() definition (line 63) in RegistryClient.py.
> >>>>> These sites are skipped as follows:
> >>>>>
> >>>>> def PingBridgeService(self, bridgeProxy):
> >>>>>      banned = ['some.site', 'another.site']
> >>>>>      for b in banned:
> >>>>>          if bridgeProxy._ServerProxy__host.startswith(b):
> >>>>>              return -1
> >>>>>      self.log.info('PingBridgeService: trying %s' %
> >>>>>                    bridgeProxy._ServerProxy__host)
> >>>>>      etc. as before
> >>>>>
> >>>>> The extra logging line shows progress through the bridges a bit
> >>>>> better and identifies new baddies.
> >>>>>
> >>>>>
> >>>>> Of course it's cumbersome to edit RegistryClient.py every time a new
> >>>>> baddy is detected (there have been a few recently), but I generally
> >>>>> have a fast start-up now, as well as access to "good" bridges.
> >>>>> Maybe a separate configuration file containing the baddies would be
> >>>>> better; the VenueClient could consult it at start-up, before
> >>>>> processing the bridge list.
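> >>>>>
> >>>>> Something like this, say, reading one hostname per line from a
> >>>>> hypothetical ~/.AccessGrid3/banned_bridges file (the name is
> >>>>> invented here):
> >>>>>
> >>>>> import os
> >>>>>
> >>>>> def loadBannedBridges():
> >>>>>     # A missing file just means no banned bridges.
> >>>>>     path = os.path.expanduser('~/.AccessGrid3/banned_bridges')
> >>>>>     if not os.path.exists(path):
> >>>>>         return []
> >>>>>     f = open(path)
> >>>>>     try:
> >>>>>         return [line.strip() for line in f if line.strip()]
> >>>>>     finally:
> >>>>>         f.close()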
> >>>>>
> >>>>>
> >>>>> chris
> >>>>>
> >>>>>
> >>>>> Christoph Willing                       +61 7 3365 8350
> >>>>> QCiF/QPSF Access Grid Manager
> >>>>> University of Queensland
> >>>>>



