[AG-TECH] Bridge Traffic/Router Issues

Mike Weaver weaver at ascr.doe.gov
Sat Jul 18 06:23:45 CDT 2009


OK, here's a weird one.  AG 3.2b1 on Fedora 11.  Started the venue client w/
default settings (in particular, unicast mode).  Went to the ANL venue
server lobby.  Had some issues w/ RAT & D-Bus.  Did some investigation and
then got side-tracked with other tasks.  Sometime later I got bumped from
the venue server, but the client was still running.  Sometime later still
(sorry about the uncertainty of the timing) we started noticing timeout
issues on our LAN.  Web browsing would hang at various stages (DNS,
connecting..., transferring...) for 20-30 seconds, every 10-15 minutes.
Sometimes long enough to time out the connection.  Clicking refresh would
bring the page up fine.  One other possibly relevant detail: I was running
the venue client as root.

Users also started complaining about getting disconnected when remoted in
(RDP & Citrix).  Long story short, we tracked it down to CPU utilization
spikes on our router (Cisco 7206, FastEthernet PIC).  The CPU utilization
(sometimes as high as 100%) was almost entirely from interrupts, no process
or memory issues.  While investigating NetFlow data, I noticed a large number
of flows to the Auckland University bridge server in New Zealand.  I found &
exited the running venue client and the network problems stopped.  No
guarantee that the venue client was to blame, but the problem persisted
steadily for almost a week, and hasn't recurred in over 2 days, coincident
with exiting the venue client.  Pretty strong circumstantial evidence IMHO.
Looking at the venue client logs, the start of the network issues
corresponded almost exactly with entering the lobby.  There are a number of
errors related to contacting bridges, but most are due to DNS issues as we
block a lot of central & southeast Asian networks (.ru, .kr, .cn, .tw, .hk,
etc.).  If someone is interested (hint hint Tom), I can provide the logs,
but they're pretty big due to the length of time the client was running, so I
didn't want to post them to the list.
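
For anyone wanting to repeat the NetFlow check described above (looking for
destinations with an unusually large number of flows), here is a minimal
Python sketch.  It assumes the flow records have already been exported to
CSV text; the column names (start, src, dst, bytes), the sample rows, and
the function name top_destinations are all hypothetical, and the addresses
are documentation-range placeholders rather than the real hosts involved.

```python
import csv
import io
from collections import Counter

# Hypothetical export: one flow per row. The column names are assumptions,
# and the addresses are documentation-range placeholders, not real hosts.
SAMPLE_FLOWS = """start,src,dst,bytes
2009-07-15 09:00:01,192.0.2.10,198.51.100.7,120
2009-07-15 09:00:02,192.0.2.10,198.51.100.7,96
2009-07-15 09:00:02,192.0.2.22,203.0.113.5,4000
2009-07-15 09:00:03,192.0.2.10,198.51.100.7,104
"""

def top_destinations(csv_text, n=5):
    """Return (dst, flow_count, total_bytes) tuples, most flows first."""
    flows = Counter()   # number of flow records per destination
    octets = Counter()  # total bytes per destination
    for row in csv.DictReader(io.StringIO(csv_text)):
        flows[row["dst"]] += 1
        octets[row["dst"]] += int(row["bytes"])
    return [(dst, count, octets[dst]) for dst, count in flows.most_common(n)]

for dst, count, total in top_destinations(SAMPLE_FLOWS):
    print(f"{dst:15}  flows={count:<4} bytes={total}")
```

Ranking by flow count rather than byte count is deliberate: what stood out
in our data was the number of flows, not necessarily the volume.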

Any idea what kind of traffic might be flowing between the venue client and
a bridge in these circumstances?  Anything periodic on the order of every
10-15 minutes?  We're trying to characterize the data and determine why the
router was having issues with it.
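
One rough way to test the "every 10-15 minutes" impression is to take the
times of the observed CPU spikes and look at the gaps between them.  The
sketch below does that in Python; the timestamps are made-up placeholders
(not our measurements), and the timestamp format and the function name
intervals_minutes are assumptions for illustration only.

```python
from datetime import datetime
from statistics import median

# Hypothetical spike times, e.g. read off a router CPU graph.
SPIKES = [
    "2009-07-15 09:00:00",
    "2009-07-15 09:12:00",
    "2009-07-15 09:25:30",
    "2009-07-15 09:38:00",
]

def intervals_minutes(timestamps):
    """Return the gaps, in minutes, between consecutive sorted timestamps."""
    times = sorted(datetime.strptime(t, "%Y-%m-%d %H:%M:%S") for t in timestamps)
    return [(b - a).total_seconds() / 60 for a, b in zip(times, times[1:])]

gaps = intervals_minutes(SPIKES)
print("gaps (min):", gaps)
print("median gap (min):", median(gaps))
```

If the gaps cluster tightly around one value, that would point to a timer
in the client or bridge; widely scattered gaps would suggest something
less regular.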

I'd appreciate any thoughts or ideas that anyone might have on this.

Mike

--
Mike Weaver
US Department of Energy
ASCR/SC-21.1
Germantown Building
Voice: 301-903-0072
Fax: 301-903-7774
Email: weaver at ascr.doe.gov  

 



