You're right, Ketan. If I change it to: <profile namespace="globus" key="jobsPerNode">16</profile> the warning message goes away. However, there are times I don't want to run 16 jobs per node, e.g. because a single job needs all the available memory, so even though the node has 16 processors I can't actually use them all. So perhaps this is just a scheduling issue with Ranger/SGE, in that they don't want you to submit a job that's going to leave processors idle? That seems a bit restrictive though...<br>
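<br>For reference, a minimal sketch of the relevant profile entries with only that one value changed, assuming every other setting in the sites file quoted further down stays as it was. It only reflects what was observed in this thread (the qsub rejection goes away); it does not address the memory-bound case.<br>

```xml
<!-- Sketch only: jobsPerNode is the single value changed (it was 1). -->
<!-- All other entries are copied unchanged from the quoted sites file. -->
<profile namespace="globus" key="jobsPerNode">16</profile>
<profile namespace="globus" key="nodeGranularity">64</profile>
<profile namespace="globus" key="pe">16way</profile>
```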
<br><div class="gmail_quote">On Wed, Dec 21, 2011 at 7:58 AM, Ketan Maheshwari <span dir="ltr"><<a href="mailto:ketancmaheshwari@gmail.com">ketancmaheshwari@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Sarah,<br><br>I checked my sites.xml. The only difference between yours and mine is the value of jobsPerNode, which is 16 in my case. I have also set this value to other multiples of 16, which has worked fine for me.<br><br>
<br>
<div class="gmail_quote"><div><div></div><div class="h5">On Wed, Dec 21, 2011 at 6:57 AM, Sarah Kenny <span dir="ltr"><<a href="mailto:skenny@uci.edu" target="_blank">skenny@uci.edu</a>></span> wrote:<br></div></div>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div></div><div class="h5">
getting this when submitting to ranger with both the latest and our previous version of swift (swift-r5259 cog-r3313)<br><br>Final status: time: Wed, 21 Dec 2011 04:49:15 -0800 Finished successfully:100<br>The following warnings have occurred:<br>
1. org.globus.cog.abstraction.impl.common.task.TaskSubmissionException: Cannot submit job: Could not submit job (qsub reported an exit code of 1).<br>
--------------------------------------------------------------------------<br>
Welcome to TACC's Ranger System, an NSF XD Resource<br>
--------------------------------------------------------------------------<br>
--> Checking that you specified -V...<br>
--> Checking that you specified a time limit...<br>
--> Checking that you specified a queue...<br>
--> Setting project...<br>
--> Checking that you specified a parallel environment...<br>
--> Checking that you specified a valid parallel environment name...<br>
--> Checking that the minimum and maximum PE counts are the same...<br>
--> Checking that the number of PEs requested is valid...<br>
------------------> Rejecting job <------------------<br>
Your slot (or core) request is not a multiple of 16.<br>
Syntax: -pe <pe_name> <n><br>
where <n> is a multiple of 16.<br>
-----------------------------------------------------<br>
Unable to run job: JSV rejected job.<br>Exiting.<br><br> at org.globus.cog.abstraction.impl.scheduler.common.AbstractJobSubmissionTaskHandler.submit(AbstractJobSubmissionTaskHandler.java:63)<br> at org.globus.cog.abstraction.impl.common.AbstractTaskHandler.submit(AbstractTaskHandler.java:45)<br>
at org.globus.cog.abstraction.impl.common.task.ExecutionTaskHandler.submit(ExecutionTaskHandler.java:57)<br> at org.globus.cog.abstraction.coaster.service.job.manager.LocalQueueProcessor.run(LocalQueueProcessor.java:40)<br>
Caused by: org.globus.cog.abstraction.impl.scheduler.common.ProcessException: Could not submit job (qsub reported an exit code of 1).<br>
--------------------------------------------------------------------------<br>
Welcome to TACC's Ranger System, an NSF XD Resource<br>
--------------------------------------------------------------------------<br>
--> Checking that you specified -V...<br>
--> Checking that you specified a time limit...<br>
--> Checking that you specified a queue...<br>
--> Setting project...<br>
--> Checking that you specified a parallel environment...<br>
--> Checking that you specified a valid parallel environment name...<br>
--> Checking that the minimum and maximum PE counts are the same...<br>
--> Checking that the number of PEs requested is valid...<br>
------------------> Rejecting job <------------------<br>
Your slot (or core) request is not a multiple of 16.<br>
Syntax: -pe <pe_name> <n><br>
where <n> is a multiple of 16.<br>
-----------------------------------------------------<br>
Unable to run job: JSV rejected job.<br>Exiting.<br><br> at org.globus.cog.abstraction.impl.scheduler.common.AbstractExecutor.start(AbstractExecutor.java:108)<br> at org.globus.cog.abstraction.impl.scheduler.common.AbstractJobSubmissionTaskHandler.submit(AbstractJobSubmissionTaskHandler.java:53)<br>
... 3 more<br><br>################### sites file<br><br><config><br><pool handle="RANGER"><br> <execution provider="coaster" jobManager="gt2:SGE" url="gatekeeper.ranger.tacc.teragrid.org"/><br>
<filesystem provider="gsiftp" url="gsiftp://gridftp.ranger.tacc.teragrid.org"/><br> <profile namespace="globus" key="maxtime">86400</profile><br>
<profile namespace="globus" key="maxWallTime">02:00:00</profile><br> <profile namespace="globus" key="jobsPerNode">1</profile><br> <profile namespace="globus" key="nodeGranularity">64</profile><br>
<profile namespace="globus" key="maxNodes">4096</profile><br> <profile namespace="globus" key="queue">normal</profile><br> <profile namespace="karajan" key="jobThrottle">1.28</profile><br>
<profile namespace="globus" key="project">TG-DBS080004N</profile><br> <profile namespace="globus" key="pe">16way</profile><br> <profile namespace="karajan" key="initialScore">10000</profile><br>
<workdirectory>/work/00043/tg457040/swiftwork</workdirectory><br></pool><br></config><br><br>These are the same settings we've been using for a while, so I'm not sure why this is popping up now, but it's quite consistent. All jobs are finishing successfully, which makes the warning rather confusing. Any idea what I might be missing here? <br>
<br>thanks<br><font color="#888888">~sk<br><br><br> <br><br>
</font><br></div></div><div class="im">_______________________________________________<br>
Swift-devel mailing list<br>
<a href="mailto:Swift-devel@ci.uchicago.edu" target="_blank">Swift-devel@ci.uchicago.edu</a><br>
<a href="https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel" target="_blank">https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel</a><br>
<br></div></blockquote></div><font color="#888888"><br><br clear="all"><br>-- <br>Ketan<br><br><br>
</font><br></blockquote></div><br><br clear="all"><br>-- <br>Sarah Kenny<br>
Programmer ~ Brain Circuits Laboratory ~ Rm 2224 Bio Sci III<br>University of California Irvine, Dept. of Neurology ~ 773-818-8300<br><br>