[Swift-devel] [Bug 243] New: Block Submitter error when using SGE and coasters
bugzilla-daemon at mcs.anl.gov
Mon Jan 10 21:37:11 CST 2011
https://bugzilla.mcs.anl.gov/swift/show_bug.cgi?id=243
Summary: Block Submitter error when using SGE and coasters
Product: Swift
Version: unspecified
Platform: PC
OS/Version: Linux
Status: NEW
Severity: normal
Priority: P2
Component: General
AssignedTo: hategan at mcs.anl.gov
ReportedBy: dk0966 at cs.ship.edu
When using a configuration with SGE and coasters, a NullPointerException is
thrown in the "Block Submitter" thread, followed by a failure to shut down the
block.
RunID: 20110110-2105-nuakb162
Progress:
Exception in thread "Block Submitter" java.lang.NullPointerException
    at org.globus.cog.abstraction.coaster.service.job.manager.Cpu.taskFailed(Cpu.java:302)
    at org.globus.cog.abstraction.coaster.service.job.manager.Block.taskFailed(Block.java:330)
    at org.globus.cog.abstraction.coaster.service.job.manager.BlockTaskSubmitter.run(BlockTaskSubmitter.java:76)
Failed to shut down block: Block 0110-050925-000000 (4x3600.000s)
org.globus.cog.abstraction.impl.common.task.TaskSubmissionException: Can only cancel an active task
    at org.globus.cog.abstraction.impl.scheduler.common.AbstractExecutor.cancel(AbstractExecutor.java:179)
    at org.globus.cog.abstraction.impl.scheduler.common.AbstractJobSubmissionTaskHandler.cancel(AbstractJobSubmissionTaskHandler.java:85)
    at org.globus.cog.abstraction.impl.common.AbstractTaskHandler.cancel(AbstractTaskHandler.java:69)
    at org.globus.cog.abstraction.impl.common.task.ExecutionTaskHandler.cancel(ExecutionTaskHandler.java:102)
    at org.globus.cog.abstraction.impl.common.task.ExecutionTaskHandler.cancel(ExecutionTaskHandler.java:91)
    at org.globus.cog.abstraction.coaster.service.job.manager.BlockTaskSubmitter.cancel(BlockTaskSubmitter.java:45)
    at org.globus.cog.abstraction.coaster.service.job.manager.Block.forceShutdown(Block.java:302)
    at org.globus.cog.abstraction.coaster.service.job.manager.Block.shutdown(Block.java:282)
    at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.cleanDoneBlocks(BlockQueueProcessor.java:177)
    at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.updatePlan(BlockQueueProcessor.java:496)
    at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.run(BlockQueueProcessor.java:100)
Previous versions of Swift also sometimes threw exceptions with this
configuration. They were usually related to changes in the formatting of qstat
output or to the "pe" settings not being interpreted correctly. Two patches
exist that should fix these: my patch for reading qstat information as XML, and
Mike's patch for the pe settings. This particular error seems unrelated, since
it shows up both before and after those patches are applied.
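To illustrate why reading qstat information as XML is less fragile than
scraping the column-formatted text, here is a minimal sketch. It is not the
patch itself. The sample document, the function name parse_qstat_xml, and the
element names (job_list, JB_job_number, state) are assumptions modeled on the
general shape of `qstat -xml` output and should be checked against the local
SGE installation.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample, modeled on the general shape of `qstat -xml` output.
# Element names are assumptions; verify them against your SGE version.
SAMPLE_QSTAT_XML = """
<job_info>
  <queue_info>
    <job_list state="running">
      <JB_job_number>101</JB_job_number>
      <JB_name>coaster-block</JB_name>
      <state>r</state>
      <slots>4</slots>
    </job_list>
  </queue_info>
  <job_info>
    <job_list state="pending">
      <JB_job_number>102</JB_job_number>
      <JB_name>coaster-block</JB_name>
      <state>qw</state>
      <slots>4</slots>
    </job_list>
  </job_info>
</job_info>
"""


def parse_qstat_xml(text):
    """Return a dict mapping job id -> state code for every job_list entry."""
    root = ET.fromstring(text)
    jobs = {}
    # iter() walks the whole tree, so running and pending sections are both
    # covered without depending on column widths or header text.
    for job in root.iter("job_list"):
        job_id = job.findtext("JB_job_number")
        if job_id is None:
            continue  # skip entries without a job number
        jobs[job_id.strip()] = (job.findtext("state") or "").strip()
    return jobs


if __name__ == "__main__":
    for job_id, state in parse_qstat_xml(SAMPLE_QSTAT_XML).items():
        print(job_id, state)
```

Because fields are looked up by element name, a change in column order or
spacing in the plain-text qstat listing would not affect this kind of parser.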
Tested with Swift 0.92 on ibicluster using the following configuration files.
sites.xml:
<config>
  <pool handle="sge-coasters">
    <execution provider="coaster" url="none" jobmanager="local:sge"/>
    <profile namespace="globus" key="pe">threaded</profile>
    <profile namespace="globus" key="workersPerNode">4</profile>
    <profile namespace="globus" key="slots">128</profile>
    <profile namespace="globus" key="nodeGranularity">1</profile>
    <profile namespace="globus" key="maxnodes">1</profile>
    <profile namespace="karajan" key="jobThrottle">5.11</profile>
    <profile namespace="karajan" key="initialScore">10000</profile>
    <filesystem provider="local" url="none"/>
    <workdirectory>/cchome/dkelly/swiftwork</workdirectory>
  </pool>
</config>
tc.data:
sge-coasters echo /bin/echo INSTALLED INTEL32::LINUX
sge-coasters cat /bin/cat INSTALLED INTEL32::LINUX
sge-coasters ls /bin/ls INSTALLED INTEL32::LINUX
sge-coasters grep /bin/grep INSTALLED INTEL32::LINUX
sge-coasters sort /bin/sort INSTALLED INTEL32::LINUX
sge-coasters paste /bin/paste INSTALLED INTEL32::LINUX
sge-coasters wc /usr/bin/wc INSTALLED INTEL32::LINUX
catsn.swift:
type file;

app (file o) cat (file i)
{
    cat @i stdout=@o;
}

string t = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
string char[] = @strsplit(t, "");
file out[]<simple_mapper; location=".", prefix="catsn.", suffix=".out">;

foreach j in [1:@toint(@arg("n","10"))] {
    file data<"data.txt">;
    out[j] = cat(data);
}