[Swift-devel] Java hangs on new rcc hardware
Michael Wilde
wilde at mcs.anl.gov
Mon Jul 9 08:19:55 CDT 2012
Java is acting strangely for me on the new RCC "midway" cluster. The symptom is that the JVM seems to go into a tight CPU loop across several (3 or more) cores.
I see this first in the polling loop in the local scheduler provider, which calls Thread.sleep() and seems not to return. But each time I suspend and resume the JVM with ^Z, fg, ^Z, bg, it progresses further. Doing this twice enables the JVM to successfully complete the Swift script it's running (which tests a single PBS job).
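To check whether the problem is in Thread.sleep() itself rather than in Swift, a standalone test along the following lines might help. This is only a minimal sketch: the class name SleepCheck, the method timedSleep, and the 100 ms interval are made up for illustration and are not taken from the provider code. It times each sleep with System.nanoTime() so that a call that stalls or returns late shows up in the output.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical standalone check, not part of Swift or the scheduler provider.
public class SleepCheck {

    // Sleeps for requestedMs and returns the observed elapsed time in milliseconds.
    static long timedSleep(long requestedMs) {
        long start = System.nanoTime();
        try {
            Thread.sleep(requestedMs);
        } catch (InterruptedException e) {
            // Preserve the interrupt status for the caller.
            Thread.currentThread().interrupt();
        }
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) {
        // Mimic a simple polling loop: a few short sleeps in a row.
        for (int i = 0; i < 5; i++) {
            long elapsed = timedSleep(100);
            System.out.println("iteration " + i
                    + ": requested 100 ms, observed " + elapsed + " ms");
        }
    }
}
```

If this hangs or spins the same way outside of Swift, that would suggest the JVM or the node rather than the provider code.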
I see what appears to be similar behavior in the Swift build. The ant redist will hang somewhere around where Swift compiles the antlr output; then a similar suspend-resume sequence will cause it to continue.
I saw this first with the Java 1.7 that was installed on midway; then with the latest JDK 1.6, and also with what I think is a more recent/latest JDK 1.7.
I'm still debugging, but any help or suggestions would be most welcome.
Thanks,
- Mike
--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory