[AG-TECH] RE : Bridge Registry Timeout

D'ANFRAY Philippe philippe.d-anfray at cea.fr
Thu Jun 10 04:13:43 CDT 2010


Hello,

We posted some messages about our "vanishing bridge" (ARISTOTE) some time ago.
We have exactly the same issue with Ubuntu 10.4 and found nothing in the logs either.

Regards,

Ph d'Anfray

-------- Original Message --------
From: ag-tech-bounces at lists.mcs.anl.gov on behalf of John I. Quebedeaux, Jr
Date: Wed. 09/06/2010 17:48
To: Matthew Leszczenski; ag-tech at lists.mcs.anl.gov
Subject: Re: [AG-TECH] Bridge Registry Timeout
 
Ahhhhh... I can at least report I'm seeing the same with the LSU bridge - FC
12, 3.2beta. It started happening after we upgraded to FC12 and 3.2beta.

I haven't been able to identify what's different; I've been comparing logs
between the old and new...

-John



From: Matthew Leszczenski <mxl9499 at rit.edu>
Date: Wed, 09 Jun 2010 11:44:04 -0400
To: <ag-tech at lists.mcs.anl.gov>
Subject: [AG-TECH] Bridge Registry Timeout

Hello all,

My apologies if this has been covered by other people in the past; I have
spent considerable time searching the archives for instructions on how to
fix this issue (or even exactly where it stems from) without success.

Here at RIT I have been setting up a Unicast Bridge, but I have run into a
snag. The bridge is up and working: two of our own nodes stay connected
through it whenever they are up. The problem is that the bridge shows up in
the registry list as an option for the nodes for only about 5 minutes. After
that, it disappears from the list if a registry purge is used, and a node
that logs into AG after the registry timeout does not see it at all. If a
node already has the bridge in its list from those first 5 minutes and the
list is not purged, that node can still connect to and disconnect from the
bridge server without a problem, so the bridge itself is still up and working.

For details, I am running Fedora 12 and using the Bridge Python script that
is installed with AG3.2 (it has a created date of 2005/12/06, in case it has
been updated since). I run the script with the following command:

./Bridge -n "RIT Brooklyn" -l RIT

I have been watching the log file that I direct all the output to, and at
the beginning I found an interesting entry, but it only appears when there
are no clients connected:

reached inactivity timeout and have no clients; exiting
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/AccessGrid3/AccessGrid/AGXMLRPCServer.py", line 63, in run
    self.handle_request()
  File "/usr/lib/python2.6/SocketServer.py", line 262, in handle_request
    fd_sets = select.select([self], [], [], timeout)
error: (4, 'Interrupted system call')
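
Note on the traceback above: errno 4 is EINTR, which select.select() raises on Python 2.6 when a signal arrives while the call is blocked (Python 3.5 and later retry automatically). Whether this is related to the bridge dropping out of the registry is not established anywhere in this thread. Purely as an illustration of the usual workaround, and not the actual AccessGrid or SocketServer code, a retry wrapper might look like the sketch below. The helper name retry_on_eintr is hypothetical.

```python
import errno
import select


def retry_on_eintr(call, *args):
    """Run call(*args), retrying whenever it fails with EINTR.

    Hypothetical helper for illustration; not part of AccessGrid.
    Any error other than EINTR is re-raised unchanged.
    """
    while True:
        try:
            return call(*args)
        except select.error as exc:
            # select.error carries (errno, message) in exc.args.
            # On Python 3 it is simply an alias of OSError.
            if exc.args[0] != errno.EINTR:
                raise
            # Interrupted by a signal: loop and issue the call again.
            # The timeout is not shortened, so the total wait may
            # exceed the value originally requested.
```

A request loop would then call retry_on_eintr(select.select, [sock], [], [], timeout) instead of calling select.select directly, where sock and timeout stand for whatever the server already uses.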


However, when clients are connected, it just prints the connection
information every so often, as follows:

max_unicast_mem is 32
myhostname=brooklyn
myhostipaddress=129.21.x.x

using multicast 
ucport [data]=51390   ucport [rtcp]=51391
mcport [data]=56384  mcport [rtcp]=56385
making multicast port [0]
making multicast port [1]
No bridge.acl file found, no ACL set

If anyone has information that could help me track down where this problem
is, it would be a great help.

Thank you in advance,
   Matthew Leszczenski
-Collaborations Technology Specialist @ RIT Research Computing Department






