Stabilizing Text service
Robert Olson
olson at mcs.anl.gov
Mon Jul 14 15:46:17 CDT 2003
I think the current fd problems -- this morning's crash was an fd table
overflow:
(gdb) where
#0 0x402f02f2 in globus_l_io_table_add (handle=0xad4c4c0) at
globus_io_core.c:505
#1 0x402f0487 in globus_i_io_register_read_func (handle=0xad4c4c0,
callback_func=0x402fbe64 <globus_l_io_read_auth_token>,
callback_arg=0xacd0578, arg_destructor=0, register_select=1) at
globus_io_core.c:627
#2 0x402f9251 in globus_i_io_securesocket_register_accept (handle=0xad4c4c0,
callback_func=0x402efec0 <globus_i_io_accept_callback>,
callback_arg=0xa9fafe8) at globus_io_securesocket.c:391
#3 0x402ff217 in globus_io_tcp_register_accept (listener_handle=0x83bdc78,
attr=0x83ad668, new_handle=0xad4c4c0,
callback=0x402efc94 <globus_i_io_monitor_callback>,
callback_arg=0xbe3fecec) at globus_io_tcp.c:925
#4 0x402ff3fd in globus_io_tcp_accept (listener_handle=0x83bdc78,
attr=0x83ad668, handle=0xad4c4c0) at globus_io_tcp.c:1028
#5 0x402d43ea in tcp_accept (listenerHandle=0x83bdc78, attr=0x83ad668) at
src/io_wrap.c:2371
#6 0x402dc31c in _wrap_tcp_accept (self=0x0, args=0x8e049f4) at
src/io_wrap.c:4188
#7 0x080cb709 in PyCFunction_Call ()
[snip]
(gdb) fr 0
#0 0x402f02f2 in globus_l_io_table_add (handle=0xad4c4c0) at
globus_io_core.c:505
505 globus_l_io_fd_table[handle->fd]->handle = handle;
(gdb) list
500 * ("globus_l_io_table_add()\n"));
501 */
502
503 if (globus_l_io_fd_table[handle->fd])
504 {
505 globus_l_io_fd_table[handle->fd]->handle = handle;
506
507 goto fn_exit;
508 }
509 select_info = (globus_io_select_info_t *)
(gdb) p handle->fd
$3 = 257
(gdb) p globus_l_io_fd_tablesize
$4 = 256
-- with the TVS are due to the text service not yet having the sanity fixes
applied to it that the event service has. Note all the CLOSE_WAIT sockets
lingering from the text service:
[root at vv2 bin]# /usr/sbin/lsof -i tcp:9006 -a -p 29581 |sort +7
python2 29581 ag 11u IPv4 184475379 TCP *:9006 (LISTEN)
COMMAND PID USER FD TYPE DEVICE SIZE NODE NAME
python2 29581 ag 30u IPv4 184546394 TCP
vv2.mcs.anl.gov:9006->131.193.77.118:1459 (CLOSE_WAIT)
python2 29581 ag 35u IPv4 184546508 TCP
vv2.mcs.anl.gov:9006->131.193.77.118:1499 (CLOSE_WAIT)
python2 29581 ag 36u IPv4 184546521 TCP
vv2.mcs.anl.gov:9006->131.193.77.118:1538 (CLOSE_WAIT)
python2 29581 ag 50u IPv4 184583926 TCP
vv2.mcs.anl.gov:9006->131.193.77.118:1774 (ESTABLISHED)
python2 29581 ag 41u IPv4 184546644 TCP
vv2.mcs.anl.gov:9006->aero.east.isi.edu:32781 (CLOSE_WAIT)
python2 29581 ag 42u IPv4 184546731 TCP
vv2.mcs.anl.gov:9006->aero.east.isi.edu:32800 (CLOSE_WAIT)
python2 29581 ag 49u IPv4 184547014 TCP
vv2.mcs.anl.gov:9006->aero.east.isi.edu:32937 (CLOSE_WAIT)
python2 29581 ag 48u IPv4 184554088 TCP
vv2.mcs.anl.gov:9006->aero.east.isi.edu:32957 (CLOSE_WAIT)
python2 29581 ag 51u IPv4 184579268 TCP
vv2.mcs.anl.gov:9006->aero.east.isi.edu:32974 (ESTABLISHED)
python2 29581 ag 44u IPv4 184546811 TCP
vv2.mcs.anl.gov:9006->aglaptop.arsc.edu:1711 (CLOSE_WAIT)
python2 29581 ag 45u IPv4 184546832 TCP
vv2.mcs.anl.gov:9006->aglaptop.arsc.edu:1738 (ESTABLISHED)
python2 29581 ag 13u IPv4 184475660 TCP
vv2.mcs.anl.gov:9006->colomb.trace.wisc.edu:4317 (CLOSE_WAIT)
python2 29581 ag 14u IPv4 184475678 TCP
vv2.mcs.anl.gov:9006->colomb.trace.wisc.edu:4345 (CLOSE_WAIT)
python2 29581 ag 15u IPv4 184475704 TCP
vv2.mcs.anl.gov:9006->colomb.trace.wisc.edu:4370 (CLOSE_WAIT)
python2 29581 ag 16u IPv4 184485219 TCP
vv2.mcs.anl.gov:9006->colomb.trace.wisc.edu:4395 (CLOSE_WAIT)
python2 29581 ag 17u IPv4 184486808 TCP
vv2.mcs.anl.gov:9006->colomb.trace.wisc.edu:4420 (CLOSE_WAIT)
python2 29581 ag 22u IPv4 184546116 TCP
vv2.mcs.anl.gov:9006->colomb.trace.wisc.edu:4537 (CLOSE_WAIT)
python2 29581 ag 24u IPv4 184546160 TCP
vv2.mcs.anl.gov:9006->colomb.trace.wisc.edu:4570 (CLOSE_WAIT)
python2 29581 ag 25u IPv4 184546172 TCP
vv2.mcs.anl.gov:9006->colomb.trace.wisc.edu:4602 (CLOSE_WAIT)
python2 29581 ag 26u IPv4 184546190 TCP
vv2.mcs.anl.gov:9006->colomb.trace.wisc.edu:4634 (CLOSE_WAIT)
python2 29581 ag 19u IPv4 184546024 TCP
vv2.mcs.anl.gov:9006->deng.cs.uiuc.edu:2931 (ESTABLISHED)
python2 29581 ag 28u IPv4 184546383 TCP
vv2.mcs.anl.gov:9006->dhcp-7.ccr.buffalo.edu:1231 (CLOSE_WAIT)
python2 29581 ag 31u IPv4 184546413 TCP
vv2.mcs.anl.gov:9006->dhcp-7.ccr.buffalo.edu:1302 (CLOSE_WAIT)
python2 29581 ag 32u IPv4 184546446 TCP
vv2.mcs.anl.gov:9006->dhcp-7.ccr.buffalo.edu:1334 (CLOSE_WAIT)
python2 29581 ag 33u IPv4 184546460 TCP
vv2.mcs.anl.gov:9006->dhcp-7.ccr.buffalo.edu:1366 (CLOSE_WAIT)
python2 29581 ag 38u IPv4 184546561 TCP
vv2.mcs.anl.gov:9006->dhcp-7.ccr.buffalo.edu:1472 (CLOSE_WAIT)
python2 29581 ag 39u IPv4 184546574 TCP
vv2.mcs.anl.gov:9006->dhcp-7.ccr.buffalo.edu:1504 (CLOSE_WAIT)
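To put a number on the leak, a listing like the one above can be tallied by
TCP state. What follows is only a minimal sketch in present-day Python 3 (not
the python2 the service runs under). The name count_tcp_states is made up for
illustration, and the only assumption about the input is that each
connection's state appears in parentheses at the end of a line, as it does in
the lsof output above.

```python
import re
from collections import Counter

# Matches the TCP state that lsof prints in parentheses at the end of a
# line, e.g. "(CLOSE_WAIT)" or "(ESTABLISHED)".
STATE_RE = re.compile(r"\((\w+)\)\s*$")


def count_tcp_states(lsof_text):
    """Return a Counter mapping TCP state name -> number of sockets.

    Hypothetical helper: lines without a trailing "(STATE)" (the header,
    or the first half of a wrapped entry) are simply skipped.
    """
    states = Counter()
    for line in lsof_text.splitlines():
        match = STATE_RE.search(line)
        if match:
            states[match.group(1)] += 1
    return states
```

Feeding it the saved lsof output and printing the result of
count_tcp_states(text).most_common() gives one line per state. A CLOSE_WAIT
count that keeps growing while ESTABLISHED stays small is the signature of
sockets that the peer has closed but the server has not.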
A solution is to modify the text service to use the AsynchIO object that is
in the asynch branch. The transformation should be fairly straightforward;
if you look at EventServiceAsynch2.py in the asynch branch you'll find a
version of the event service to which that transformation has been applied.
It would probably be a good thing to work that code into the release code
for both the text and event services.
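
For reference, the general pattern that keeps sockets from piling up in
CLOSE_WAIT is to close the server's end as soon as a read reports that the
peer has gone away. The sketch below uses only the standard Python 3 socket
module; it is not the AsynchIO code from the asynch branch, and
service_connection and on_data are hypothetical names chosen for
illustration.

```python
import socket


def service_connection(conn, on_data):
    """Read once from conn; close it if the peer has gone away.

    Returns True while the connection is still usable, False once it
    has been closed. on_data is called with any bytes received.
    """
    try:
        data = conn.recv(4096)
    except OSError:
        data = b""  # treat a read error the same way as end-of-file
    if not data:
        # An empty read means the peer has closed its side. Until we
        # call close(), the kernel keeps the socket in CLOSE_WAIT and
        # the file descriptor stays allocated.
        conn.close()
        return False
    on_data(data)
    return True
```

A server loop would drop the connection from its bookkeeping whenever this
returns False, so that descriptors are released instead of accumulating
toward the table limit seen in the gdb session above.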
--bob