[Swift-devel] Re: transfer-only workload worked!! (was Re: resuming discussion on the hung processes...)
Allan Espinosa
aespinosa at cs.uchicago.edu
Tue Apr 26 15:45:10 CDT 2011
Hi Mihael,
This is on the latest stable branch. Here's the dump:
2011-04-25 11:45:35
Full thread dump Java HotSpot(TM) 64-Bit Server VM (17.0-b16 mixed mode):
"Attach Listener" daemon prio=10 tid=0x0000000044cd2800 nid=0x4c5f
waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
Locked ownable synchronizers:
- None
"Condor provider queue poller" daemon prio=10 tid=0x00002aabb86eb800
nid=0x3c7a sleeping[0x0000000043c1f000]
java.lang.Thread.State: TIMED_WAITING (sleeping)
at java.lang.Thread.sleep(Native Method)
at org.globus.cog.abstraction.impl.scheduler.common.AbstractQueuePoller.run(AbstractQueuePoller.java:76)
at java.lang.Thread.run(Thread.java:619)
Locked ownable synchronizers:
- None
"Scheduler" prio=10 tid=0x00002aabb8763800 nid=0x34c0 in Object.wait()
[0x0000000041678000]
java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at org.globus.cog.karajan.scheduler.LateBindingScheduler.sleep(LateBindingScheduler.java:305)
at org.globus.cog.karajan.scheduler.LateBindingScheduler.run(LateBindingScheduler.java:258)
- locked <0x00002aaab7ca50a0> (a org.griphyn.vdl.karajan.VDSAdaptiveScheduler)
Locked ownable synchronizers:
- None
"Progress ticker" daemon prio=10 tid=0x00002aabb86d5000 nid=0x2c3f
waiting on condition [0x0000000041577000]
java.lang.Thread.State: TIMED_WAITING (sleeping)
at java.lang.Thread.sleep(Native Method)
at org.griphyn.vdl.karajan.lib.RuntimeStats$ProgressTicker.run(RuntimeStats.java:137)
Locked ownable synchronizers:
- None
"Restart Log Sync" daemon prio=10 tid=0x0000000044f15800 nid=0x2c38 in
Object.wait() [0x000000004290c000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00002aaab7c0c808> (a
org.globus.cog.karajan.workflow.nodes.restartLog.SyncThread)
at java.lang.Object.wait(Object.java:485)
at org.globus.cog.karajan.workflow.nodes.restartLog.SyncThread.run(SyncThread.java:45)
- locked <0x00002aaab7c0c808> (a
org.globus.cog.karajan.workflow.nodes.restartLog.SyncThread)
Locked ownable synchronizers:
- None
"Overloaded Host Monitor" daemon prio=10 tid=0x00002aabb857b800
nid=0x2c33 sleeping[0x000000004280b000]
java.lang.Thread.State: TIMED_WAITING (sleeping)
at java.lang.Thread.sleep(Native Method)
at org.globus.cog.karajan.scheduler.OverloadedHostMonitor.run(OverloadedHostMonitor.java:47)
Locked ownable synchronizers:
- None
"Timer-0" daemon prio=10 tid=0x00000000451a0000 nid=0x2c32 in
Object.wait() [0x000000004270a000]
java.lang.Thread.State: TIMED_WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.util.TimerThread.mainLoop(Timer.java:509)
- locked <0x00002aaab7d01f10> (a java.util.TaskQueue)
at java.util.TimerThread.run(Timer.java:462)
Locked ownable synchronizers:
- None
"pool-1-thread-4" prio=10 tid=0x00000000452d9800 nid=0x2c17 in
Object.wait() [0x0000000042508000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00002aaab3b8df68> (a
edu.emory.mathcs.backport.java.util.concurrent.LinkedBlockingQueue$SerializableLock)
at java.lang.Object.wait(Object.java:485)
at edu.emory.mathcs.backport.java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:315)
- locked <0x00002aaab3b8df68> (a
edu.emory.mathcs.backport.java.util.concurrent.LinkedBlockingQueue$SerializableLock)
at edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:470)
at edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:667)
at java.lang.Thread.run(Thread.java:619)
Locked ownable synchronizers:
- None
"pool-1-thread-3" prio=10 tid=0x0000000044ffc800 nid=0x2c16 in
Object.wait() [0x0000000042407000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00002aaab3b8df68> (a
edu.emory.mathcs.backport.java.util.concurrent.LinkedBlockingQueue$SerializableLock)
at java.lang.Object.wait(Object.java:485)
at edu.emory.mathcs.backport.java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:315)
- locked <0x00002aaab3b8df68> (a
edu.emory.mathcs.backport.java.util.concurrent.LinkedBlockingQueue$SerializableLock)
at edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:470)
at edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:667)
at java.lang.Thread.run(Thread.java:619)
Locked ownable synchronizers:
- None
"pool-1-thread-2" prio=10 tid=0x00002aabc024a800 nid=0x2c15 in
Object.wait() [0x0000000042306000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00002aaab3b8df68> (a
edu.emory.mathcs.backport.java.util.concurrent.LinkedBlockingQueue$SerializableLock)
at java.lang.Object.wait(Object.java:485)
at edu.emory.mathcs.backport.java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:315)
- locked <0x00002aaab3b8df68> (a
edu.emory.mathcs.backport.java.util.concurrent.LinkedBlockingQueue$SerializableLock)
at edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:470)
at edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:667)
at java.lang.Thread.run(Thread.java:619)
Locked ownable synchronizers:
- None
"pool-1-thread-1" prio=10 tid=0x00002aabb85f4000 nid=0x2c14 in
Object.wait() [0x0000000042205000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00002aaab3b8df68> (a
edu.emory.mathcs.backport.java.util.concurrent.LinkedBlockingQueue$SerializableLock)
at java.lang.Object.wait(Object.java:485)
at edu.emory.mathcs.backport.java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:315)
- locked <0x00002aaab3b8df68> (a
edu.emory.mathcs.backport.java.util.concurrent.LinkedBlockingQueue$SerializableLock)
at edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:470)
at edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:667)
at java.lang.Thread.run(Thread.java:619)
Locked ownable synchronizers:
- None
"Low Memory Detector" daemon prio=10 tid=0x0000000044c72000 nid=0x2c12
runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
Locked ownable synchronizers:
- None
"CompilerThread1" daemon prio=10 tid=0x0000000044c70000 nid=0x2c11
waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
Locked ownable synchronizers:
- None
"CompilerThread0" daemon prio=10 tid=0x0000000044c6a800 nid=0x2c10
waiting on condition [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
Locked ownable synchronizers:
- None
"Signal Dispatcher" daemon prio=10 tid=0x0000000044c68800 nid=0x2c0f
runnable [0x0000000000000000]
java.lang.Thread.State: RUNNABLE
Locked ownable synchronizers:
- None
"Finalizer" daemon prio=10 tid=0x0000000044c44000 nid=0x2c0e in
Object.wait() [0x0000000041acb000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00002aaab4ec8920> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:118)
- locked <0x00002aaab4ec8920> (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:134)
at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159)
Locked ownable synchronizers:
- None
"Reference Handler" daemon prio=10 tid=0x0000000044c42000 nid=0x2c0d
in Object.wait() [0x000000004039b000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00002aaab4ec88a8> (a java.lang.ref.Reference$Lock)
at java.lang.Object.wait(Object.java:485)
at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116)
- locked <0x00002aaab4ec88a8> (a java.lang.ref.Reference$Lock)
Locked ownable synchronizers:
- None
"main" prio=10 tid=0x0000000044be0000 nid=0x2c07 in Object.wait()
[0x0000000040977000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00002aaab76845f0> (a
org.griphyn.vdl.karajan.VDL2ExecutionContext)
at java.lang.Object.wait(Object.java:485)
at org.globus.cog.karajan.workflow.ExecutionContext.waitFor(ExecutionContext.java:261)
- locked <0x00002aaab76845f0> (a org.griphyn.vdl.karajan.VDL2ExecutionContext)
at org.griphyn.vdl.karajan.Loader.main(Loader.java:197)
Locked ownable synchronizers:
- None
"VM Thread" prio=10 tid=0x0000000044c3d800 nid=0x2c0c runnable
"GC task thread#0 (ParallelGC)" prio=10 tid=0x0000000044bf3000
nid=0x2c08 runnable
"GC task thread#1 (ParallelGC)" prio=10 tid=0x0000000044bf5000
nid=0x2c09 runnable
"GC task thread#2 (ParallelGC)" prio=10 tid=0x0000000044bf7000
nid=0x2c0a runnable
"GC task thread#3 (ParallelGC)" prio=10 tid=0x0000000044bf8800
nid=0x2c0b runnable
"VM Periodic Task Thread" prio=10 tid=0x0000000044c7d000 nid=0x2c13
waiting on condition
JNI global references: 1093
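For skimming a dump of this size, a tally of threads by state can help. The following is only a minimal sketch, not part of Swift or CoG: the helper name `count_thread_states` and the shortened sample text are made up for illustration, and it assumes each `java.lang.Thread.State:` entry sits on a single line, as it does in the dump above.

```python
import re
from collections import Counter

# Matches the state keyword on lines such as
#   java.lang.Thread.State: TIMED_WAITING (sleeping)
STATE_RE = re.compile(r"java\.lang\.Thread\.State:\s+(\w+)")


def count_thread_states(dump_text):
    """Return a Counter mapping each JVM thread state to its thread count.

    Hypothetical helper for illustration; it only looks at the
    'java.lang.Thread.State:' lines and ignores everything else.
    """
    return Counter(STATE_RE.findall(dump_text))


# Shortened excerpt in the same shape as the dump above (illustrative only).
sample = """\
"Scheduler" prio=10 tid=0x00002aabb8763800 nid=0x34c0 in Object.wait()
java.lang.Thread.State: TIMED_WAITING (on object monitor)
"Restart Log Sync" daemon prio=10 tid=0x0000000044f15800 nid=0x2c38 in
java.lang.Thread.State: WAITING (on object monitor)
"main" prio=10 tid=0x0000000044be0000 nid=0x2c07 in Object.wait()
java.lang.Thread.State: WAITING (on object monitor)
"""

states = count_thread_states(sample)
for state, count in sorted(states.items()):
    print(state, count)
```

Running the same function over the full dump text would give the per-state totals. JVM-internal threads that carry no `Thread.State` line (the GC task threads, for example) are not counted.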
Here are the last few lines of the resume file:
...
...
3-199:peak.36!gsiftp://gridftp.ranger.tacc.teragrid.org//scratch/01035/tg802895/science/cybershake/Results/TEST/219/206/PeakVals_TEST_219_206_36.bsa
13-199:peak.33!gsiftp://gridftp.ranger.tacc.teragrid.org//scratch/01035/tg802895/science/cybershake/Results/TEST/219/206/PeakVals_TEST_219_206_33.bsa
13-199:peak.34!gsiftp://gridftp.ranger.tacc.teragrid.org//scratch/01035/tg802895/science/cybershake/Results/TEST/219/206/PeakVals_TEST_219_206_34.bsa
13-199:peak.39!gsiftp://gridftp.ranger.tacc.teragrid.org//scratch/01035/tg802895/science/cybershake/Results/TEST/219/206/PeakVals_TEST_219_206_39.bsa
13-199:peak.37
2011/4/26 Mihael Hategan <hategan at mcs.anl.gov>:
> On Tue, 2011-04-26 at 15:31 -0500, Allan Espinosa wrote:
>
>> > - does it run repeatedly without any user-visible errors?
>>
>> There's this problem where Swift is waiting to finish writing to the
>> resume file. But that's another issue that I would like to defer for
>> now.
>
> Can you send me a stack dump of that situation?