[Swift-devel] runaway workers on teraport coaster test of
Michael Wilde
wilde at mcs.anl.gov
Sun Feb 8 23:36:31 CST 2009
I'm testing coasters with
http://www.ci.uchicago.edu/~benc/tmp/coaster-head-elsewhere
This worked for me once Fri at noon but not since.
I put a .soft entry for java into the ~osg .soft file, to deal with
issues discussed off-list.
I made a few small changes in bootstrap.sh from that patched version -
some for logging, and one to make the X509_CERT_DIR variable conditional
on whether that directory exists.
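For reference, the conditional on X509_CERT_DIR could look roughly like the sketch below. This is not the actual patch: the function name set_cert_dir, the CERT_DIR_CANDIDATE variable, and the default path are all assumptions made for illustration, and the real bootstrap.sh may name and locate things differently.

```shell
# Hypothetical candidate location; the real path used by bootstrap.sh may differ.
CERT_DIR_CANDIDATE="${CERT_DIR_CANDIDATE:-$HOME/.globus/certificates}"

set_cert_dir() {
    # Export X509_CERT_DIR only when the directory actually exists,
    # so a missing directory does not leave a bogus value in the environment.
    if [ -d "$1" ]; then
        X509_CERT_DIR="$1"
        export X509_CERT_DIR
        echo "X509_CERT_DIR set to $1"
    else
        echo "X509_CERT_DIR left unset: $1 not found"
    fi
}

set_cert_dir "$CERT_DIR_CANDIDATE"
```

The echo lines stand in for the extra logging; they make it easy to tell from the bootstrap output which branch was taken on a given node.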
The coaster service now starts, but it went into a loop spawning
short-lived workers, and not getting anything done.
I saw dozens of workers start, with about 10-20 or so running at a time.
These logs and all files related to the run are in
~wilde/oops/oopstest/runaway-workers.
In coasters.log I see 50+ messages "WorkerManager No suitable worker
found. Attempting to start a new one."
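A quick way to get an exact count of those messages is a fixed-string grep, sketched below. The helper name count_spawn_attempts is made up for illustration, and the sketch assumes each message sits on a single line of the log.

```shell
count_spawn_attempts() {
    # -F matches the text literally; -c prints the number of matching lines.
    grep -c -F "No suitable worker found. Attempting to start a new one." "$1"
}

# Example (run from the directory holding the log):
#   count_spawn_attempts coasters.log
```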
Any thoughts on this? TeraPort has been too saturated this evening to
test any further, but it would be good to have some sense of what's
causing this.