[Swift-user] CPU was not in the block (Swift 0.96)

Jason James Pitt pittjj at uchicago.edu
Tue Mar 10 09:08:39 CDT 2015


Hi Mike,

Thanks for a quick reply! I am preparing the files now, and will point you guys to them once everything is ready.

Regarding ill effects, I can't say for sure, but it looks like tasks are staying in the submitted state and never becoming active. In addition, I think the process itself is in a zombie state: the few jobs that are active should be finishing, but they are not. qstat shows 10 active workers, but according to Swift's on-screen log I have only 6 active tasks (these are one-node jobs). Ctrl-C has no effect, so I'm going to have to kill the swift/java process by PID.
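
For reference, a minimal sketch of a PID-based shutdown, assuming a POSIX shell. The helper name stop_pid and the 10-second grace period are illustrative assumptions, not part of Swift. It sends SIGTERM first and escalates to SIGKILL only if the process is still present after the grace period. Running the JDK's jstack tool against the PID before killing it would capture a thread dump of the Java process for later diagnosis.

```shell
# Hypothetical helper (not part of Swift): stop a process by PID,
# escalating from SIGTERM to SIGKILL if it does not exit in time.
stop_pid() {
    pid=$1
    grace=${2:-10}   # seconds to wait after SIGTERM (assumed default)

    # Ask the process to exit; if the signal cannot be sent, it is already gone.
    kill -TERM "$pid" 2>/dev/null || return 0

    i=0
    while [ "$i" -lt "$grace" ]; do
        # kill -0 sends no signal; it only checks that the PID still exists.
        kill -0 "$pid" 2>/dev/null || return 0
        sleep 1
        i=$((i + 1))
    done

    # Still present after the grace period: force termination.
    kill -KILL "$pid" 2>/dev/null
}
```

Usage would look like `stop_pid 12345`, where 12345 stands in for the PID of the swift/java process as reported by ps.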

Best,

Jason
________________________________________
From: swift-user-bounces at ci.uchicago.edu [swift-user-bounces at ci.uchicago.edu] on behalf of Michael Wilde [wilde at anl.gov]
Sent: Tuesday, March 10, 2015 8:56 AM
To: swift-user at ci.uchicago.edu
Subject: Re: [Swift-user] CPU was not in the block (Swift 0.96)

Jason, we'll almost certainly need a log for this; feel free to send a
pointer, off-list.
Ideally for the run that produced the traceback below.

What happens when this occurs?  Does the run continue?  Any observable
ill effects?

Thanks,

- Mike

On 3/10/15 8:47 AM, Jason James Pitt wrote:
> Hi Everyone,
>
> I've sporadically seen the following exception (or a similar one) in some of the runs I've been performing recently. Any sense of what it may mean, and is there something I can do on my end to prevent it? I can pass along the logs if that would be helpful (though this particular run is still active). Thanks!
>
> Jason
>
> CoasterService fatal error:
> CPU was not in the block
> java.lang.Throwable
>       at org.globus.cog.abstraction.coaster.service.job.manager.Block.remove(Block.java:209)
>       at org.globus.cog.abstraction.coaster.service.job.manager.Cpu.jobTerminated(Cpu.java:115)
>       at org.globus.cog.abstraction.coaster.service.job.manager.Cpu.statusChanged(Cpu.java:433)
>       at org.globus.cog.abstraction.impl.execution.coaster.NotificationManager.notificationReceived(NotificationManager.java:117)
>       at org.globus.cog.abstraction.coaster.service.local.JobStatusHandler.requestComplete(JobStatusHandler.java:81)
>       at org.globus.cog.coaster.handlers.RequestHandler.receiveCompleted(RequestHandler.java:112)
>       at org.globus.cog.coaster.channels.AbstractCoasterChannel.handleRequest(AbstractCoasterChannel.java:590)
>       at org.globus.cog.coaster.channels.AbstractStreamCoasterChannel.stepNIO(AbstractStreamCoasterChannel.java:240)
>       at org.globus.cog.coaster.channels.NIOMultiplexer.loop(NIOMultiplexer.java:116)
>       at org.globus.cog.coaster.channels.NIOMultiplexer.run(NIOMultiplexer.java:75)
> CPU was not in the block
> java.lang.Throwable
>       at org.globus.cog.abstraction.coaster.service.job.manager.Block.remove(Block.java:209)
>       at org.globus.cog.abstraction.coaster.service.job.manager.Cpu.jobTerminated(Cpu.java:115)
>       at org.globus.cog.abstraction.coaster.service.job.manager.Cpu.statusChanged(Cpu.java:433)
>       at org.globus.cog.abstraction.impl.execution.coaster.NotificationManager.notificationReceived(NotificationManager.java:117)
>       at org.globus.cog.abstraction.coaster.service.local.JobStatusHandler.requestComplete(JobStatusHandler.java:81)
>       at org.globus.cog.coaster.handlers.RequestHandler.receiveCompleted(RequestHandler.java:112)
>       at org.globus.cog.coaster.channels.AbstractCoasterChannel.handleRequest(AbstractCoasterChannel.java:590)
>       at org.globus.cog.coaster.channels.AbstractStreamCoasterChannel.stepNIO(AbstractStreamCoasterChannel.java:240)
>       at org.globus.cog.coaster.channels.NIOMultiplexer.loop(NIOMultiplexer.java:116)
>       at org.globus.cog.coaster.channels.NIOMultiplexer.run(NIOMultiplexer.java:75)
> _______________________________________________
> Swift-user mailing list
> Swift-user at ci.uchicago.edu
> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user

--
Michael Wilde
Mathematics and Computer Science          Computation Institute
Argonne National Laboratory               The University of Chicago



