[Swift-devel] Unexpected messages from coasters on BG/P

Mihael Hategan hategan at mcs.anl.gov
Wed Nov 11 12:40:21 CST 2009


On Wed, 2009-11-11 at 10:38 -0600, Michael Wilde wrote:
> Mihael, can you tell me what the messages below mean?
> 
> - the block ended prematurely message

That says that the block job completed before being commanded to shut
down. It's very likely that workers didn't even get started. It usually
indicates a problem with the queue parameters (maybe you forgot
kernel=zeptoos), but it's hard to tell without looking at cobalt logs.
It is also not a problem that cqsub would complain about, since this
only happens when the job is successfully queued.

> - the long java tracebacks (seems like one per each of 256 jobs?

That tells you that the coaster provider doesn't yet implement job
canceling. Normally, this doesn't come up. But if you have replication
enabled, and jobs get a chance to be replicated, you will see these
when the copies start to run.

You should disable replication. It's useless if you are only running on
the BG/P. In fact, the system should disable it automatically for
applications that are only present on one site.
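
For reference, a minimal sketch of turning replication off. This assumes
the setting is the replication.enabled property in your swift.properties
file; the property name and file are an assumption here, so check them
against the user guide for the Swift version you are running.

```
# swift.properties (assumed location of the setting)
# Turn off job replication; it gains nothing on a single site.
replication.enabled=false
```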



