[Swift-devel] coasters submit jobs with "count=0" in its globus RSL params
Michael Wilde
wilde at mcs.anl.gov
Mon Jul 20 17:18:31 CDT 2009
Sarah, is this the same error you have been getting? (Invalid RSL count
field?)
- Mike
On 7/20/09 5:11 PM, Allan Espinosa wrote:
> session message:
> Caused by:
> Block task failed: Error submitting block task
> org.globus.cog.abstraction.impl.common.task.TaskSubmissionException:
> Cannot submit job
> at org.globus.cog.abstraction.impl.execution.gt2.JobSubmissionTaskHandler.submitSingleJob(JobSubmissionTaskHandler.java:146)
> at org.globus.cog.abstraction.impl.execution.gt2.JobSubmissionTaskHandler.submit(JobSubmissionTaskHandler.java:100)
> at org.globus.cog.abstraction.impl.common.AbstractTaskHandler.submit(AbstractTaskHandler.java:46)
> at org.globus.cog.abstraction.impl.common.task.ExecutionTaskHandler.submit(ExecutionTaskHandler.java:50)
> at org.globus.cog.abstraction.coaster.service.job.manager.BlockTaskSubmitter.run(BlockTaskSubmitter.java:66)
> Caused by: org.globus.gram.GramException: The provided RSL 'count'
> value is invalid (not an integer or must be greater than 0)
> at org.globus.gram.Gram.request(Gram.java:358)
> at org.globus.gram.GramJob.request(GramJob.java:262)
> at org.globus.cog.abstraction.impl.execution.gt2.JobSubmissionTaskHandler.submitSingleJob(JobSubmissionTaskHandler.java:134)
> ... 4 more
>
> Cleaning up...
> Shutting down service at https://129.114.50.163:45035
>
> snippet of coasters.log:
> 2009-07-20 17:02:02,344-0500 INFO BlockQueueProcessor
> Settings {
> slots = 2
> workersPerNode = 16
> nodeGranularity = 1
> allocationStepSize = 0.1
> maxNodes = 2
> lowOverallocation = 10.0
> highOverallocation = 1.0
> overallocationDecayFactor = 0.0010
> spread = 0.9
> reserve = 10.000s
> maxtime = 86400
> project = TG-CCR080022N
> queue = normal
> remoteMonitorEnabled = false
> }
>
> 2009-07-20 17:02:02,345-0500 INFO BlockQueueProcessor Required size:
> 230400 for 16 jobs
> 2009-07-20 17:02:02,345-0500 INFO BlockQueueProcessor h: 28800, jj:
> 14400, x-last: , r: 1
> 2009-07-20 17:02:02,345-0500 INFO BlockQueueProcessor h: 43200, w: 2,
> size: 230400, msz: 230400, w*h: 86400
> 2009-07-20 17:02:02,355-0500 INFO BlockQueueProcessor Added: 0 - 5
> 2009-07-20 17:02:02,355-0500 INFO Block Starting block: workers=2,
> walltime=43200.000s
> 2009-07-20 17:02:02,358-0500 INFO BlockTaskSubmitter Queuing block
> Block 0720-010553-000000 (2x43200.000s) for submission
> 2009-07-20 17:02:02,359-0500 INFO BlockQueueProcessor Added 6 jobs to
> new blocks
> 2009-07-20 17:02:02,359-0500 INFO BlockQueueProcessor Plan time: 55
> 2009-07-20 17:02:02,359-0500 INFO BlockTaskSubmitter Submitting block
> Block 0720-010553-000000 (2x43200.000s)
> 2009-07-20 17:02:02,379-0500 DEBUG TaskImpl Task(type=JOB_SUBMISSION,
> identity=urn:cog-1248127320448) setting status to Submitting
> 2009-07-20 17:02:02,381-0500 INFO Block Block task status changed: Submitting
> ---end--
>
> With w=2 and workersPerNode=16, count = 2 / 16 = 0 (integer division)
> when a Block is instantiated.
>
> sites.xml:
> <config>
> <pool handle="RANGER" >
> <execution provider="coaster"
> url="gatekeeper.ranger.tacc.teragrid.org" jobManager="gt2:gt2:SGE"/>
> <profile namespace="globus" key="project">TG-CCR080022N</profile>
> <profile namespace="globus" key="workersPerNode">16</profile>
> <profile namespace="globus" key="queue">normal</profile>
> <profile namespace="karajan" key="initialScore">10000</profile>
> <profile namespace="karajan" key="jobThrottle">0.32</profile>
> <profile namespace="globus" key="slots">2</profile>
> <profile namespace="globus" key="maxNodes">2</profile>
> <profile namespace="globus" key="maxwalltime">4:00:00</profile>
> <profile namespace="globus" key="maxtime">86400</profile>
>
> <filesystem provider="coaster"
> url="gt2://gatekeeper.ranger.tacc.teragrid.org" />
> <workdirectory >/scratch/01035/tg802895/see_runs</workdirectory>
> </pool>
> </config>
>
> Obviously I need to get the right mix of overallocation parameters,
> but an invalid RSL entry should at least be caught.
>
> I'll try to understand BlockQueueProcessor.allocateBlocks better so I
> can at least make an informed guess at what these values should be.
>
>
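For illustration, here is a minimal Java sketch of the truncation described in the quoted message (w=2 with workersPerNode=16 giving count=0), alongside one possible guard. The class and method names are hypothetical and this is not the actual coaster Block code; rounding up to whole nodes is only an assumption about what the intended behavior might be.

```java
// Hypothetical sketch; names are illustrative and not taken from the coaster source.
public class BlockCountSketch {

    // Reproduces the reported behavior: Java integer division truncates,
    // so 2 workers / 16 workers per node yields a node count of 0.
    static int nodeCountTruncating(int workers, int workersPerNode) {
        return workers / workersPerNode;
    }

    // One possible guard (an assumption, not the project's fix):
    // reject non-positive inputs and round up to whole nodes,
    // so the value handed to the RSL 'count' field is always >= 1.
    static int nodeCountCeiling(int workers, int workersPerNode) {
        if (workers <= 0 || workersPerNode <= 0) {
            throw new IllegalArgumentException(
                "workers and workersPerNode must be positive");
        }
        return (workers + workersPerNode - 1) / workersPerNode;
    }

    public static void main(String[] args) {
        System.out.println(nodeCountTruncating(2, 16)); // prints 0
        System.out.println(nodeCountCeiling(2, 16));    // prints 1
    }
}
```

Whether a block should round up to a full node or be rejected before submission is a design choice for the block allocator; the sketch only shows that the check can be made before the job reaches GRAM.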