[Swift-devel] [Bug 243] New: Block Submitter error when using SGE and coasters

bugzilla-daemon at mcs.anl.gov bugzilla-daemon at mcs.anl.gov
Mon Jan 10 21:37:11 CST 2011


https://bugzilla.mcs.anl.gov/swift/show_bug.cgi?id=243

           Summary: Block Submitter error when using SGE and coasters
           Product: Swift
           Version: unspecified
          Platform: PC
        OS/Version: Linux
            Status: NEW
          Severity: normal
          Priority: P2
         Component: General
        AssignedTo: hategan at mcs.anl.gov
        ReportedBy: dk0966 at cs.ship.edu


When using a configuration with SGE and coasters, a NullPointerException is
thrown in the "Block Submitter" thread, followed by a TaskSubmissionException
when the block is shut down.

RunID: 20110110-2105-nuakb162
Progress:
Exception in thread "Block Submitter" java.lang.NullPointerException
    at org.globus.cog.abstraction.coaster.service.job.manager.Cpu.taskFailed(Cpu.java:302)
    at org.globus.cog.abstraction.coaster.service.job.manager.Block.taskFailed(Block.java:330)
    at org.globus.cog.abstraction.coaster.service.job.manager.BlockTaskSubmitter.run(BlockTaskSubmitter.java:76)
Failed to shut down block: Block 0110-050925-000000 (4x3600.000s)
org.globus.cog.abstraction.impl.common.task.TaskSubmissionException: Can only cancel an active task
    at org.globus.cog.abstraction.impl.scheduler.common.AbstractExecutor.cancel(AbstractExecutor.java:179)
    at org.globus.cog.abstraction.impl.scheduler.common.AbstractJobSubmissionTaskHandler.cancel(AbstractJobSubmissionTaskHandler.java:85)
    at org.globus.cog.abstraction.impl.common.AbstractTaskHandler.cancel(AbstractTaskHandler.java:69)
    at org.globus.cog.abstraction.impl.common.task.ExecutionTaskHandler.cancel(ExecutionTaskHandler.java:102)
    at org.globus.cog.abstraction.impl.common.task.ExecutionTaskHandler.cancel(ExecutionTaskHandler.java:91)
    at org.globus.cog.abstraction.coaster.service.job.manager.BlockTaskSubmitter.cancel(BlockTaskSubmitter.java:45)
    at org.globus.cog.abstraction.coaster.service.job.manager.Block.forceShutdown(Block.java:302)
    at org.globus.cog.abstraction.coaster.service.job.manager.Block.shutdown(Block.java:282)
    at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.cleanDoneBlocks(BlockQueueProcessor.java:177)
    at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.updatePlan(BlockQueueProcessor.java:496)
    at org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.run(BlockQueueProcessor.java:100)

Previous versions of Swift also sometimes threw exceptions with this
configuration. Those were usually caused by changes in the qstat output format
or by the "pe" setting not being interpreted correctly. Two patches exist that
should fix them: my patch for reading qstat information as XML, and Mike's
patch for the pe settings. This particular error seems unrelated, since it
shows up both before and after the other patches are applied.
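
For reference, here is a minimal sketch of what reading qstat information as
XML can look like. It is not the patch itself. The class name QstatXmlSketch
and the sample document are hypothetical, and the element names (job_list,
JB_job_number, state) are assumed to match what `qstat -xml` emits on the SGE
version in use. The idea is that looking fields up by element name is less
sensitive to column layout changes than parsing the plain-text table.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper for illustration; not the class used in the actual patch.
class QstatXmlSketch {

    // Hand-written, abridged sample in the assumed shape of `qstat -xml` output.
    static final String SAMPLE =
        "<?xml version='1.0'?>\n"
        + "<job_info>\n"
        + "  <queue_info>\n"
        + "    <job_list state=\"running\">\n"
        + "      <JB_job_number>4711</JB_job_number>\n"
        + "      <state>r</state>\n"
        + "    </job_list>\n"
        + "  </queue_info>\n"
        + "  <job_info>\n"
        + "    <job_list state=\"pending\">\n"
        + "      <JB_job_number>4712</JB_job_number>\n"
        + "      <state>qw</state>\n"
        + "    </job_list>\n"
        + "  </job_info>\n"
        + "</job_info>\n";

    // Returns a map from job number to SGE state code, in document order.
    static Map<String, String> parseStates(String xml) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(
            new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

        Map<String, String> states = new LinkedHashMap<>();
        NodeList jobs = doc.getElementsByTagName("job_list");
        for (int i = 0; i < jobs.getLength(); i++) {
            Element job = (Element) jobs.item(i);
            String id = childText(job, "JB_job_number");
            String state = childText(job, "state");
            // Skip entries that lack either field instead of failing.
            if (id != null && state != null) {
                states.put(id, state);
            }
        }
        return states;
    }

    // Text of the first descendant element with the given tag, or null if absent.
    private static String childText(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0
            ? null
            : nodes.item(0).getTextContent().trim();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(parseStates(SAMPLE));
    }
}
```

In a real provider the XML would come from the output of the qstat process
rather than from a string constant; the sample is inlined only to keep the
sketch self-contained.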

Tested with Swift 0.92 on ibicluster using the following configuration files:

sites.xml:
<config>
 <pool handle="sge-coasters">
  <execution provider="coaster" url="none" jobmanager="local:sge"/>
  <profile namespace="globus" key="pe">threaded</profile>
  <profile namespace="globus" key="workersPerNode">4</profile>
  <profile namespace="globus" key="slots">128</profile>
  <profile namespace="globus" key="nodeGranularity">1</profile>
  <profile namespace="globus" key="maxnodes">1</profile>
  <profile namespace="karajan" key="jobThrottle">5.11</profile>
  <profile namespace="karajan" key="initialScore">10000</profile>
  <filesystem provider="local" url="none"/>
  <workdirectory>/cchome/dkelly/swiftwork</workdirectory>
 </pool>
</config>

tc.data:
sge-coasters    echo     /bin/echo      INSTALLED    INTEL32::LINUX
sge-coasters    cat      /bin/cat       INSTALLED    INTEL32::LINUX
sge-coasters    ls       /bin/ls        INSTALLED    INTEL32::LINUX
sge-coasters    grep     /bin/grep      INSTALLED    INTEL32::LINUX
sge-coasters    sort     /bin/sort      INSTALLED    INTEL32::LINUX
sge-coasters    paste    /bin/paste     INSTALLED    INTEL32::LINUX
sge-coasters    wc       /usr/bin/wc    INSTALLED    INTEL32::LINUX

catsn.swift:
type file;

app (file o) cat (file i)
{
  cat @i stdout=@o;
}

string t = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
string char[] = @strsplit(t, "");  

file out[]<simple_mapper; location=".", prefix="catsn.",suffix=".out">;
foreach j in [1:@toint(@arg("n","10"))] {
  file data<"data.txt">;
  out[j] = cat(data);
}
