Fwd: [Swift-devel] Swift hang

Jonathan Monette jon.monette at gmail.com
Sat Jan 8 14:11:37 CST 2011


Yeah.  I am not sure what is going on.  I don't know which CI machine you
are logged into, but login.pads has been slow for me.
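
On the deadlock question in the quoted messages below: when jstack detects a
lock cycle it appends a "Found one Java-level deadlock" section to the dump,
and the same check can be made from inside a JVM through ThreadMXBean. A
minimal sketch, assuming it runs inside the process being inspected (the
class name DeadlockCheck is made up for illustration and is not part of Swift
or CoG):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper, not part of Swift or CoG: asks the current JVM
// whether any of its threads are stuck in a lock cycle.
public class DeadlockCheck {

    /** Returns info for threads the JVM reports as deadlocked; empty if none. */
    public static ThreadInfo[] findDeadlocked() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        // Covers both object monitors and ownable synchronizers.
        long[] ids = mx.findDeadlockedThreads(); // null when no cycle is found
        if (ids == null) {
            return new ThreadInfo[0];
        }
        // true, true: include locked monitors and locked synchronizers.
        return mx.getThreadInfo(ids, true, true);
    }

    public static void main(String[] args) {
        ThreadInfo[] stuck = findDeadlocked();
        if (stuck.length == 0) {
            System.out.println("no deadlock detected");
        }
        for (ThreadInfo info : stuck) {
            System.out.print(info);
        }
    }
}
```

A hang in which every thread is parked in wait() for a notification that
never arrives is not a lock cycle, so this check (like jstack) would report
nothing for it.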

On 1/8/11 2:09 PM, Mihael Hategan wrote:
> I'd check the logs, but:
> [hategan at login ~]$ cd
> ~jonmon/Workspace/Swift/Montage/m101_j_6x6/run.0001
> [blinking cursor for 30 minutes now]
>
>
> On Sat, 2011-01-08 at 14:02 -0600, Jonathan Monette wrote:
>> I did not see one in the thread dump.  The log showed that the files had
>> been staged in but the coaster queue was empty.  I assumed this meant that
>> Swift was hung since coasters had no jobs to run.  After seeing the
>> thread dump, though, I saw this did not seem to be the case.
>>
>> On 1/8/11 1:44 PM, Mihael Hategan wrote:
>>> I don't see a deadlock in the thread dump. Do you?
>>>
>>> On Wed, 2011-01-05 at 14:42 -0600, Allan Espinosa wrote:
>>>> I forgot to include the listhost in the earlier thread.
>>>>
>>>>
>>>> ---------- Forwarded message ----------
>>>> From: Jonathan Monette<jon.monette at gmail.com>
>>>> Date: 2011/1/5
>>>> Subject: Re: [Swift-devel] Swift hang
>>>> To: Allan Espinosa<aespinosa at cs.uchicago.edu>
>>>>
>>>>
>>>> Here is the jstack trace:
>>>>
>>>> $ jstack -l 10232
>>>> 2011-01-05 14:29:28
>>>> Full thread dump Java HotSpot(TM) 64-Bit Server VM (17.1-b03 mixed mode):
>>>>
>>>> "Attach Listener" daemon prio=10 tid=0x0000000048490800 nid=0x3d25
>>>> waiting on condition [0x0000000000000000]
>>>>      java.lang.Thread.State: RUNNABLE
>>>>
>>>>      Locked ownable synchronizers:
>>>>       - None
>>>>
>>>> "Sender" daemon prio=10 tid=0x000000004823f000 nid=0x2a0b in
>>>> Object.wait() [0x00000000446c5000]
>>>>      java.lang.Thread.State: WAITING (on object monitor)
>>>>       at java.lang.Object.wait(Native Method)
>>>>       at java.lang.Object.wait(Object.java:485)
>>>>       at org.globus.cog.karajan.workflow.service.channels.AbstractStreamKarajanChannel$Sender.run(AbstractStreamKarajanChannel.java:241)
>>>>       - locked<0x00002aaab5490a50>   (a
>>>> org.globus.cog.karajan.workflow.service.channels.AbstractStreamKarajanChannel$Sender)
>>>>
>>>>      Locked ownable synchronizers:
>>>>       - None
>>>>
>>>> "PullThread" daemon prio=10 tid=0x0000000048240800 nid=0x2a0a in
>>>> Object.wait() [0x00000000445c4000]
>>>>      java.lang.Thread.State: TIMED_WAITING (on object monitor)
>>>>       at java.lang.Object.wait(Native Method)
>>>>       at org.globus.cog.abstraction.coaster.service.job.manager.PullThread.mwait(PullThread.java:86)
>>>>       at org.globus.cog.abstraction.coaster.service.job.manager.PullThread.run(PullThread.java:57)
>>>>       - locked<0x00002aaab5490d28>   (a
>>>> org.globus.cog.abstraction.coaster.service.job.manager.PullThread)
>>>>
>>>>      Locked ownable synchronizers:
>>>>       - None
>>>>
>>>> "Channel multiplexer 1" daemon prio=10 tid=0x00000000479a2800
>>>> nid=0x2a08 sleeping[0x00000000443c2000]
>>>>      java.lang.Thread.State: TIMED_WAITING (sleeping)
>>>>       at java.lang.Thread.sleep(Native Method)
>>>>       at org.globus.cog.karajan.workflow.service.channels.AbstractStreamKarajanChannel$Multiplexer.run(AbstractStreamKarajanChannel.java:418)
>>>>
>>>>      Locked ownable synchronizers:
>>>>       - None
>>>>
>>>> "Channel multiplexer 0" daemon prio=10 tid=0x0000000047a62800
>>>> nid=0x2a07 sleeping[0x00000000444c3000]
>>>>      java.lang.Thread.State: TIMED_WAITING (sleeping)
>>>>       at java.lang.Thread.sleep(Native Method)
>>>>       at org.globus.cog.karajan.workflow.service.channels.AbstractStreamKarajanChannel$Multiplexer.run(AbstractStreamKarajanChannel.java:418)
>>>>
>>>>      Locked ownable synchronizers:
>>>>       - None
>>>>
>>>> "Timer-3" daemon prio=10 tid=0x0000000047a62000 nid=0x2a06 in
>>>> Object.wait() [0x0000000043ebd000]
>>>>      java.lang.Thread.State: TIMED_WAITING (on object monitor)
>>>>       at java.lang.Object.wait(Native Method)
>>>>       at java.util.TimerThread.mainLoop(Timer.java:509)
>>>>       - locked<0x00002aaab54afbf0>   (a java.util.TaskQueue)
>>>>       at java.util.TimerThread.run(Timer.java:462)
>>>>
>>>>      Locked ownable synchronizers:
>>>>       - None
>>>>
>>>> "PBS provider queue poller" daemon prio=10 tid=0x0000000048405000
>>>> nid=0x29bd sleeping[0x0000000043dbc000]
>>>>      java.lang.Thread.State: TIMED_WAITING (sleeping)
>>>>       at java.lang.Thread.sleep(Native Method)
>>>>       at org.globus.cog.abstraction.impl.scheduler.common.AbstractQueuePoller.run(AbstractQueuePoller.java:76)
>>>>       at java.lang.Thread.run(Thread.java:662)
>>>>
>>>>      Locked ownable synchronizers:
>>>>       - None
>>>>
>>>> "Block Submitter" daemon prio=10 tid=0x00002aacc4016800 nid=0x2978 in
>>>> Object.wait() [0x00000000441c0000]
>>>>      java.lang.Thread.State: WAITING (on object monitor)
>>>>       at java.lang.Object.wait(Native Method)
>>>>       at java.lang.Object.wait(Object.java:485)
>>>>       at org.globus.cog.abstraction.coaster.service.job.manager.BlockTaskSubmitter.run(BlockTaskSubmitter.java:54)
>>>>       - locked<0x00002aaab54d4510>   (a java.util.LinkedList)
>>>>
>>>>      Locked ownable synchronizers:
>>>>       - None
>>>>
>>>> "Timer-2" daemon prio=10 tid=0x00000000483e7800 nid=0x2952 in
>>>> Object.wait() [0x0000000043cbb000]
>>>>      java.lang.Thread.State: WAITING (on object monitor)
>>>>       at java.lang.Object.wait(Native Method)
>>>>       at java.lang.Object.wait(Object.java:485)
>>>>       at java.util.TimerThread.mainLoop(Timer.java:483)
>>>>       - locked<0x00002aaab54caa88>   (a java.util.TaskQueue)
>>>>       at java.util.TimerThread.run(Timer.java:462)
>>>>
>>>>      Locked ownable synchronizers:
>>>>       - None
>>>>
>>>> "Piped Channel Sender" daemon prio=10 tid=0x0000000048403800
>>>> nid=0x2951 in Object.wait() [0x0000000043bba000]
>>>>      java.lang.Thread.State: WAITING (on object monitor)
>>>>       at java.lang.Object.wait(Native Method)
>>>>       at java.lang.Object.wait(Object.java:485)
>>>>       at edu.emory.mathcs.backport.java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:315)
>>>>       - locked<0x00002aaab54acd38>   (a
>>>> edu.emory.mathcs.backport.java.util.concurrent.LinkedBlockingQueue$SerializableLock)
>>>>       at org.globus.cog.karajan.workflow.service.channels.AbstractPipedChannel$Sender.run(AbstractPipedChannel.java:113)
>>>>
>>>>      Locked ownable synchronizers:
>>>>       - None
>>>>
>>>> "Piped Channel Sender" daemon prio=10 tid=0x0000000047a04800
>>>> nid=0x2950 in Object.wait() [0x0000000043ab9000]
>>>>      java.lang.Thread.State: WAITING (on object monitor)
>>>>       at java.lang.Object.wait(Native Method)
>>>>       at java.lang.Object.wait(Object.java:485)
>>>>       at edu.emory.mathcs.backport.java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:315)
>>>>       - locked<0x00002aaab54ac848>   (a
>>>> edu.emory.mathcs.backport.java.util.concurrent.LinkedBlockingQueue$SerializableLock)
>>>>       at org.globus.cog.karajan.workflow.service.channels.AbstractPipedChannel$Sender.run(AbstractPipedChannel.java:113)
>>>>
>>>>      Locked ownable synchronizers:
>>>>       - None
>>>>
>>>> "Local Queue Processor" daemon prio=10 tid=0x0000000048407000
>>>> nid=0x294f in Object.wait() [0x00000000439b8000]
>>>>      java.lang.Thread.State: WAITING (on object monitor)
>>>>       at java.lang.Object.wait(Native Method)
>>>>       - waiting on<0x00002aaab548bed8>   (a org.globus.cog.karajan.util.Queue)
>>>>       at java.lang.Object.wait(Object.java:485)
>>>>       at org.globus.cog.karajan.util.Queue.take(Queue.java:46)
>>>>       - locked<0x00002aaab548bed8>   (a org.globus.cog.karajan.util.Queue)
>>>>       at org.globus.cog.abstraction.coaster.service.job.manager.AbstractQueueProcessor.take(AbstractQueueProcessor.java:51)
>>>>       at org.globus.cog.abstraction.coaster.service.job.manager.LocalQueueProcessor.run(LocalQueueProcessor.java:37)
>>>>
>>>>      Locked ownable synchronizers:
>>>>       - None
>>>>
>>>> "Server: http://192.5.86.6:46247" daemon prio=10
>>>> tid=0x0000000047b66000 nid=0x294e runnable [0x00000000438b7000]
>>>>      java.lang.Thread.State: RUNNABLE
>>>>       at java.net.PlainSocketImpl.socketAccept(Native Method)
>>>>       at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:390)
>>>>       - locked<0x00002aaab5492b68>   (a java.net.SocksSocketImpl)
>>>>       at java.net.ServerSocket.implAccept(ServerSocket.java:453)
>>>>       at java.net.ServerSocket.accept(ServerSocket.java:421)
>>>>       at org.globus.net.BaseServer.run(BaseServer.java:226)
>>>>       at java.lang.Thread.run(Thread.java:662)
>>>>
>>>>      Locked ownable synchronizers:
>>>>       - None
>>>>
>>>> "Timer-1" daemon prio=10 tid=0x00000000487ab000 nid=0x294c in
>>>> Object.wait() [0x00000000436b5000]
>>>>      java.lang.Thread.State: TIMED_WAITING (on object monitor)
>>>>       at java.lang.Object.wait(Native Method)
>>>>       at java.util.TimerThread.mainLoop(Timer.java:509)
>>>>       - locked<0x00002aaab5518710>   (a java.util.TaskQueue)
>>>>       at java.util.TimerThread.run(Timer.java:462)
>>>>
>>>>      Locked ownable synchronizers:
>>>>       - None
>>>>
>>>> "Coaster Bootstrap Service Connection Processor" daemon prio=10
>>>> tid=0x0000000047c99000 nid=0x294a runnable [0x00000000435b4000]
>>>>      java.lang.Thread.State: RUNNABLE
>>>>       at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
>>>>       at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:210)
>>>>       at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
>>>>       at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
>>>>       - locked<0x00002aaab5474d40>   (a sun.nio.ch.Util$1)
>>>>       - locked<0x00002aaab5474d28>   (a java.util.Collections$UnmodifiableSet)
>>>>       - locked<0x00002aaab5474998>   (a sun.nio.ch.EPollSelectorImpl)
>>>>       at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
>>>>       at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:84)
>>>>       at org.globus.cog.abstraction.impl.execution.coaster.BootstrapService$ConnectionProcessor.run(BootstrapService.java:231)
>>>>       at java.lang.Thread.run(Thread.java:662)
>>>>
>>>>      Locked ownable synchronizers:
>>>>       - None
>>>>
>>>> "Coaster Bootstrap Service Thread" daemon prio=10
>>>> tid=0x0000000047c30800 nid=0x2949 runnable [0x00000000434b3000]
>>>>      java.lang.Thread.State: RUNNABLE
>>>>       at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
>>>>       at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:145)
>>>>       - locked<0x00002aaab54746f8>   (a java.lang.Object)
>>>>       at org.globus.cog.abstraction.impl.execution.coaster.BootstrapService.run(BootstrapService.java:184)
>>>>       at java.lang.Thread.run(Thread.java:662)
>>>>
>>>>      Locked ownable synchronizers:
>>>>       - None
>>>>
>>>> "Local service" daemon prio=10 tid=0x0000000047c49800 nid=0x2948
>>>> runnable [0x00000000433b2000]
>>>>      java.lang.Thread.State: RUNNABLE
>>>>       at java.net.PlainSocketImpl.socketAccept(Native Method)
>>>>       at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:390)
>>>>       - locked<0x00002aaab5489de8>   (a java.net.SocksSocketImpl)
>>>>       at java.net.ServerSocket.implAccept(ServerSocket.java:453)
>>>>       at java.net.ServerSocket.accept(ServerSocket.java:421)
>>>>       at org.globus.net.BaseServer.run(BaseServer.java:226)
>>>>       at java.lang.Thread.run(Thread.java:662)
>>>>
>>>>      Locked ownable synchronizers:
>>>>       - None
>>>>
>>>> "Scheduler" prio=10 tid=0x00002aacc01ae800 nid=0x28b5 in Object.wait()
>>>> [0x0000000042dac000]
>>>>      java.lang.Thread.State: TIMED_WAITING (on object monitor)
>>>>       at java.lang.Object.wait(Native Method)
>>>>       at org.globus.cog.karajan.scheduler.LateBindingScheduler.sleep(LateBindingScheduler.java:305)
>>>>       at org.globus.cog.karajan.scheduler.LateBindingScheduler.run(LateBindingScheduler.java:289)
>>>>       - locked<0x00002aaab500e070>   (a
>>>> org.griphyn.vdl.karajan.VDSAdaptiveScheduler)
>>>>
>>>>      Locked ownable synchronizers:
>>>>       - None
>>>>
>>>> "Progress ticker" daemon prio=10 tid=0x000000004821d000 nid=0x281e
>>>> waiting on condition [0x0000000042cab000]
>>>>      java.lang.Thread.State: TIMED_WAITING (sleeping)
>>>>       at java.lang.Thread.sleep(Native Method)
>>>>       at org.griphyn.vdl.karajan.lib.RuntimeStats$ProgressTicker.run(RuntimeStats.java:137)
>>>>
>>>>      Locked ownable synchronizers:
>>>>       - None
>>>>
>>>> "Restart Log Sync" daemon prio=10 tid=0x0000000048219800 nid=0x281d in
>>>> Object.wait() [0x0000000042baa000]
>>>>      java.lang.Thread.State: WAITING (on object monitor)
>>>>       at java.lang.Object.wait(Native Method)
>>>>       at java.lang.Object.wait(Object.java:485)
>>>>       at org.globus.cog.karajan.workflow.nodes.restartLog.SyncThread.run(SyncThread.java:45)
>>>>       - locked<0x00002aaab4b71708>   (a
>>>> org.globus.cog.karajan.workflow.nodes.restartLog.SyncThread)
>>>>
>>>>      Locked ownable synchronizers:
>>>>       - None
>>>>
>>>> "Overloaded Host Monitor" daemon prio=10 tid=0x00000000486b7000
>>>> nid=0x2819 waiting on condition [0x0000000042aa9000]
>>>>      java.lang.Thread.State: TIMED_WAITING (sleeping)
>>>>       at java.lang.Thread.sleep(Native Method)
>>>>       at org.globus.cog.karajan.scheduler.OverloadedHostMonitor.run(OverloadedHostMonitor.java:47)
>>>>
>>>>      Locked ownable synchronizers:
>>>>       - None
>>>>
>>>> "Timer-0" daemon prio=10 tid=0x0000000048632000 nid=0x2816 in
>>>> Object.wait() [0x00000000429a8000]
>>>>      java.lang.Thread.State: TIMED_WAITING (on object monitor)
>>>>       at java.lang.Object.wait(Native Method)
>>>>       at java.util.TimerThread.mainLoop(Timer.java:509)
>>>>       - locked<0x00002aaab51e2ea0>   (a java.util.TaskQueue)
>>>>       at java.util.TimerThread.run(Timer.java:462)
>>>>
>>>>      Locked ownable synchronizers:
>>>>       - None
>>>>
>>>> "pool-1-thread-8" prio=10 tid=0x000000004849c000 nid=0x2814 in
>>>> Object.wait() [0x00000000427a6000]
>>>>      java.lang.Thread.State: WAITING (on object monitor)
>>>>       at java.lang.Object.wait(Native Method)
>>>>       at java.lang.Object.wait(Object.java:485)
>>>>       at edu.emory.mathcs.backport.java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:315)
>>>>       - locked<0x00002aaab405e590>   (a
>>>> edu.emory.mathcs.backport.java.util.concurrent.LinkedBlockingQueue$SerializableLock)
>>>>       at edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:470)
>>>>       at edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:667)
>>>>       at java.lang.Thread.run(Thread.java:662)
>>>>
>>>>      Locked ownable synchronizers:
>>>>       - None
>>>>
>>>> "pool-1-thread-7" prio=10 tid=0x000000004807e800 nid=0x2813 in
>>>> Object.wait() [0x00000000426a5000]
>>>>      java.lang.Thread.State: WAITING (on object monitor)
>>>>       at java.lang.Object.wait(Native Method)
>>>>       at java.lang.Object.wait(Object.java:485)
>>>>       at edu.emory.mathcs.backport.java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:315)
>>>>       - locked<0x00002aaab405e590>   (a
>>>> edu.emory.mathcs.backport.java.util.concurrent.LinkedBlockingQueue$SerializableLock)
>>>>       at edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:470)
>>>>       at edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:667)
>>>>       at java.lang.Thread.run(Thread.java:662)
>>>>
>>>>      Locked ownable synchronizers:
>>>>       - None
>>>>
>>>> "pool-1-thread-6" prio=10 tid=0x000000004855f000 nid=0x2812 in
>>>> Object.wait() [0x00000000425a4000]
>>>>      java.lang.Thread.State: WAITING (on object monitor)
>>>>       at java.lang.Object.wait(Native Method)
>>>>       at java.lang.Object.wait(Object.java:485)
>>>>       at edu.emory.mathcs.backport.java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:315)
>>>>       - locked<0x00002aaab405e590>   (a
>>>> edu.emory.mathcs.backport.java.util.concurrent.LinkedBlockingQueue$SerializableLock)
>>>>       at edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:470)
>>>>       at edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:667)
>>>>       at java.lang.Thread.run(Thread.java:662)
>>>>
>>>>      Locked ownable synchronizers:
>>>>       - None
>>>>
>>>> "pool-1-thread-5" prio=10 tid=0x00000000486c9800 nid=0x2811 in
>>>> Object.wait() [0x00000000424a3000]
>>>>      java.lang.Thread.State: WAITING (on object monitor)
>>>>       at java.lang.Object.wait(Native Method)
>>>>       at java.lang.Object.wait(Object.java:485)
>>>>       at edu.emory.mathcs.backport.java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:315)
>>>>       - locked<0x00002aaab405e590>   (a
>>>> edu.emory.mathcs.backport.java.util.concurrent.LinkedBlockingQueue$SerializableLock)
>>>>       at edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:470)
>>>>       at edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:667)
>>>>       at java.lang.Thread.run(Thread.java:662)
>>>>
>>>>      Locked ownable synchronizers:
>>>>       - None
>>>>
>>>> "pool-1-thread-4" prio=10 tid=0x00000000486c8000 nid=0x2810 in
>>>> Object.wait() [0x00000000423a2000]
>>>>      java.lang.Thread.State: WAITING (on object monitor)
>>>>       at java.lang.Object.wait(Native Method)
>>>>       at java.lang.Object.wait(Object.java:485)
>>>>       at edu.emory.mathcs.backport.java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:315)
>>>>       - locked<0x00002aaab405e590>   (a
>>>> edu.emory.mathcs.backport.java.util.concurrent.LinkedBlockingQueue$SerializableLock)
>>>>       at edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:470)
>>>>       at edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:667)
>>>>       at java.lang.Thread.run(Thread.java:662)
>>>>
>>>>      Locked ownable synchronizers:
>>>>       - None
>>>>
>>>> "pool-1-thread-3" prio=10 tid=0x0000000048491800 nid=0x280f in
>>>> Object.wait() [0x00000000422a1000]
>>>>      java.lang.Thread.State: WAITING (on object monitor)
>>>>       at java.lang.Object.wait(Native Method)
>>>>       at java.lang.Object.wait(Object.java:485)
>>>>       at edu.emory.mathcs.backport.java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:315)
>>>>       - locked<0x00002aaab405e590>   (a
>>>> edu.emory.mathcs.backport.java.util.concurrent.LinkedBlockingQueue$SerializableLock)
>>>>       at edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:470)
>>>>       at edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:667)
>>>>       at java.lang.Thread.run(Thread.java:662)
>>>>
>>>>      Locked ownable synchronizers:
>>>>       - None
>>>>
>>>> "pool-1-thread-2" prio=10 tid=0x00000000482d8800 nid=0x280e in
>>>> Object.wait() [0x00000000412dd000]
>>>>      java.lang.Thread.State: WAITING (on object monitor)
>>>>       at java.lang.Object.wait(Native Method)
>>>>       at java.lang.Object.wait(Object.java:485)
>>>>       at edu.emory.mathcs.backport.java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:315)
>>>>       - locked<0x00002aaab405e590>   (a
>>>> edu.emory.mathcs.backport.java.util.concurrent.LinkedBlockingQueue$SerializableLock)
>>>>       at edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:470)
>>>>       at edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:667)
>>>>       at java.lang.Thread.run(Thread.java:662)
>>>>
>>>>      Locked ownable synchronizers:
>>>>       - None
>>>>
>>>> "pool-1-thread-1" prio=10 tid=0x00002aacc0018000 nid=0x280d in
>>>> Object.wait() [0x000000004104d000]
>>>>      java.lang.Thread.State: WAITING (on object monitor)
>>>>       at java.lang.Object.wait(Native Method)
>>>>       at java.lang.Object.wait(Object.java:485)
>>>>       at edu.emory.mathcs.backport.java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:315)
>>>>       - locked<0x00002aaab405e590>   (a
>>>> edu.emory.mathcs.backport.java.util.concurrent.LinkedBlockingQueue$SerializableLock)
>>>>       at edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:470)
>>>>       at edu.emory.mathcs.backport.java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:667)
>>>>       at java.lang.Thread.run(Thread.java:662)
>>>>
>>>>      Locked ownable synchronizers:
>>>>       - None
>>>>
>>>> "Low Memory Detector" daemon prio=10 tid=0x00002aacb8026000 nid=0x2808
>>>> runnable [0x0000000000000000]
>>>>      java.lang.Thread.State: RUNNABLE
>>>>
>>>>      Locked ownable synchronizers:
>>>>       - None
>>>>
>>>> "CompilerThread1" daemon prio=10 tid=0x00002aacb8023800 nid=0x2807
>>>> waiting on condition [0x0000000000000000]
>>>>      java.lang.Thread.State: RUNNABLE
>>>>
>>>>      Locked ownable synchronizers:
>>>>       - None
>>>>
>>>> "CompilerThread0" daemon prio=10 tid=0x00002aacb8020800 nid=0x2806
>>>> waiting on condition [0x0000000000000000]
>>>>      java.lang.Thread.State: RUNNABLE
>>>>
>>>>      Locked ownable synchronizers:
>>>>       - None
>>>>
>>>> "Signal Dispatcher" daemon prio=10 tid=0x00002aacb801e000 nid=0x2805
>>>> runnable [0x0000000000000000]
>>>>      java.lang.Thread.State: RUNNABLE
>>>>
>>>>      Locked ownable synchronizers:
>>>>       - None
>>>>
>>>> "Finalizer" daemon prio=10 tid=0x000000004796c000 nid=0x2804 in
>>>> Object.wait() [0x0000000041f9e000]
>>>>      java.lang.Thread.State: WAITING (on object monitor)
>>>>       at java.lang.Object.wait(Native Method)
>>>>       at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:118)
>>>>       - locked<0x00002aaab3e096b8>   (a java.lang.ref.ReferenceQueue$Lock)
>>>>       at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:134)
>>>>       at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159)
>>>>
>>>>      Locked ownable synchronizers:
>>>>       - None
>>>>
>>>> "Reference Handler" daemon prio=10 tid=0x0000000047965000 nid=0x2803
>>>> in Object.wait() [0x0000000041c29000]
>>>>      java.lang.Thread.State: WAITING (on object monitor)
>>>>       at java.lang.Object.wait(Native Method)
>>>>       at java.lang.Object.wait(Object.java:485)
>>>>       at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116)
>>>>       - locked<0x00002aaab3e09630>   (a java.lang.ref.Reference$Lock)
>>>>
>>>>      Locked ownable synchronizers:
>>>>       - None
>>>>
>>>> "main" prio=10 tid=0x00000000478fb800 nid=0x27f9 in Object.wait()
>>>> [0x0000000040dd9000]
>>>>      java.lang.Thread.State: WAITING (on object monitor)
>>>>       at java.lang.Object.wait(Native Method)
>>>>       - waiting on<0x00002aaab47b9dc0>   (a
>>>> org.griphyn.vdl.karajan.VDL2ExecutionContext)
>>>>       at java.lang.Object.wait(Object.java:485)
>>>>       at org.globus.cog.karajan.workflow.ExecutionContext.waitFor(ExecutionContext.java:261)
>>>>       - locked<0x00002aaab47b9dc0>   (a
>>>> org.griphyn.vdl.karajan.VDL2ExecutionContext)
>>>>       at org.griphyn.vdl.karajan.Loader.main(Loader.java:197)
>>>>
>>>>      Locked ownable synchronizers:
>>>>       - None
>>>>
>>>> "VM Thread" prio=10 tid=0x0000000047960800 nid=0x2802 runnable
>>>>
>>>> "GC task thread#0 (ParallelGC)" prio=10 tid=0x000000004790e800
>>>> nid=0x27fa runnable
>>>>
>>>> "GC task thread#1 (ParallelGC)" prio=10 tid=0x0000000047910000
>>>> nid=0x27fb runnable
>>>>
>>>> "GC task thread#2 (ParallelGC)" prio=10 tid=0x0000000047912000
>>>> nid=0x27fc runnable
>>>>
>>>> "GC task thread#3 (ParallelGC)" prio=10 tid=0x0000000047914000
>>>> nid=0x27fd runnable
>>>>
>>>> "GC task thread#4 (ParallelGC)" prio=10 tid=0x0000000047915800
>>>> nid=0x27fe runnable
>>>>
>>>> "GC task thread#5 (ParallelGC)" prio=10 tid=0x0000000047917800
>>>> nid=0x27ff runnable
>>>>
>>>> "GC task thread#6 (ParallelGC)" prio=10 tid=0x0000000047919800
>>>> nid=0x2800 runnable
>>>>
>>>> "GC task thread#7 (ParallelGC)" prio=10 tid=0x000000004791b000
>>>> nid=0x2801 runnable
>>>>
>>>> "VM Periodic Task Thread" prio=10 tid=0x00002aacb8038800 nid=0x2809
>>>> waiting on condition
>>>>
>>>> JNI global references: 1451
>>>>
>>>>
>>>>
>>>> On 1/5/11 2:06 PM, Allan Espinosa wrote:
>>>>
>>>> Hi Jon,
>>>>
>>>> Could you post a jstack trace? It should indicate if the code has deadlocks.
>>>>
>>>> -Allan (mobile)
>>>>
>>>> On Jan 5, 2011 4:50 PM, "Jonathan Monette"<jon.monette at gmail.com>   wrote:
>>>>> Hello,
>>>>>     I have encountered Swift hanging.  The deadlock appears to be in the same place every time.  This deadlock does seem to be intermittent, since smaller work sizes do complete.  This job involves approximately 1200 files.  The logs show that the files needed for job submission are staged in but no jobs are submitted.  The Coaster heartbeat that appears in the Swift logs shows that the job queue is empty.  The logs for the runs are in ~jonmon/Workspace/Swift/Montage/m101_j_6x6/run.000[1,2,3].  I will try to recreate the problem using simple cat jobs.
>>>>>
>>>>> --
>>>>> Jon
>>>>>
>>>>> Computers are incredibly fast, accurate, and stupid. Human beings are incredibly slow, inaccurate, and brilliant. Together they are powerful beyond imagination.
>>>>> - Albert Einstein
>>>>>
>>>>> _______________________________________________
>>>>> Swift-devel mailing list
>>>>> Swift-devel at ci.uchicago.edu
>>>>> http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel
>>>>>
>>>> -- 
>>>> Jon
>>>>
>>>> Computers are incredibly fast, accurate, and stupid. Human beings are
>>>> incredibly slow, inaccurate, and brilliant. Together they are powerful
>>>> beyond imagination.
>>>> - Albert Einstein
>>>>
>>>>
>>>>
>

-- 
Jon

Computers are incredibly fast, accurate, and stupid. Human beings are incredibly slow, inaccurate, and brilliant. Together they are powerful beyond imagination.
- Albert Einstein



