hi all, one of our users, anjali (cc'd here) is trying to submit this ~400k job workflow to ranger...thought i'd see if you felt like having a look :)<br><br>log is here: /home/skenny/swift_logs/corr_multisubj-20111018-1321-ihf8hz5g.log<br>
<br>sites file:<br><br><config><br><pool handle="RANGER"><br>     <execution provider="coaster" jobManager="gt2:SGE" url="gatekeeper.ranger.tacc.teragrid.org"/><br>
     <filesystem provider="gsiftp" url="gsiftp://gridftp.ranger.tacc.teragrid.org"/><br>     <profile namespace="globus" key="maxtime">7200</profile><br>
     <profile namespace="globus" key="maxWallTime">00:20:00</profile><br>     <profile namespace="globus" key="jobsPerNode">1</profile><br>     <profile namespace="globus" key="nodeGranularity">64</profile><br>
     <profile namespace="globus" key="maxNodes">256</profile><br>     <profile namespace="globus" key="queue">development</profile><br>     <profile namespace="karajan" key="jobThrottle">1.28</profile><br>
     <profile namespace="globus" key="project">TG-DBS080004N</profile><br>     <profile namespace="globus" key="pe">16way</profile><br>     <profile namespace="karajan" key="initialScore">10000</profile><br>
     <workdirectory>/work/00926/tg459516/swiftwork</workdirectory><br></pool><br></config><br><br><div class="gmail_quote">On Wed, Oct 12, 2011 at 12:13 PM, Mihael Hategan <span dir="ltr"><<a href="mailto:hategan@mcs.anl.gov">hategan@mcs.anl.gov</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;"><div class="im">On Tue, 2011-10-11 at 17:13 -0700, Sarah Kenny wrote:<br>
><br>
><br>
> On Tue, Oct 11, 2011 at 4:23 PM, Mihael Hategan <<a href="mailto:hategan@mcs.anl.gov">hategan@mcs.anl.gov</a>> wrote:<br>
>         Is this with a persistent coaster service?<br>
><br>
> admittedly i have not used persistent coaster service...should i?<br>
<br>
</div>No. I was just trying to figure out whether it might be something<br>
related to the persistent version.<br>
<div><div></div><div class="h5"><br>
>  i feel like it's documented *somewhere* (?)<br>
><br>
> for now i've tried setting 'sitedir.keep=true' in the config so maybe<br>
> it won't try to run the cleanup job...we'll see (waiting in q)<br>
><br>
><br>
><br>
>         On Tue, 2011-10-11 at 12:05 -0700, Sarah Kenny wrote:<br>
>         ><br>
>         ><br>
>         > On Tue, Oct 11, 2011 at 11:49 AM, David Kelly <<a href="mailto:davidk@ci.uchicago.edu">davidk@ci.uchicago.edu</a>> wrote:<br>
>         ><br>
>         >         That could be it.. maybe a cleanup script is not getting the<br>
>         >         right parameters and failing. Do you happen to have a copy of<br>
>         >         the coaster log?<br>
>         ><br>
>         > just put it in /home/skenny/swift_logs<br>
>         ><br>
>         ><br>
>         >         Maybe there will be some clues in there.<br>
>         ><br>
>         >         ----- Original Message -----<br>
>         >         > From: "Sarah Kenny" <<a href="mailto:skenny@uchicago.edu">skenny@uchicago.edu</a>><br>
>         >         > To: "David Kelly" <<a href="mailto:davidk@ci.uchicago.edu">davidk@ci.uchicago.edu</a>><br>
>         >         > Cc: "Swift Devel" <<a href="mailto:swift-devel@ci.uchicago.edu">swift-devel@ci.uchicago.edu</a>>, "Swift User" <<a href="mailto:swift-user@ci.uchicago.edu">swift-user@ci.uchicago.edu</a>>, "Justin M Wozniak" <<a href="mailto:wozniak@mcs.anl.gov">wozniak@mcs.anl.gov</a>><br>
>         >         > Sent: Tuesday, October 11, 2011 1:32:37 PM<br>
>         >         > Subject: Re: [Swift-user] gram on ranger<br>
>         ><br>
>         >         > so, this workflow completes all the jobs but then just hangs<br>
>         >         > indefinitely at the end...maybe a stray cleanup job?<br>
>         >         ><br>
>         >         > log is here:<br>
>         >         > /home/skenny/swift_logs/corr-20111010-2104-fl5yngd9.log<br>
>         >         ><br>
>         >         > just tweaked the sites file a bit from what david sent me:<br>
>         >         ><br>
>         >         > <config><br>
>         >         > <pool handle="RANGER"><br>
>         >         > <execution provider="coaster" jobManager="gt2:SGE" url="gatekeeper.ranger.tacc.teragrid.org"/><br>
>         >         > <filesystem provider="gsiftp" url="gsiftp://gridftp.ranger.tacc.teragrid.org"/><br>
>         >         > <profile namespace="globus" key="maxtime">28800</profile><br>
>         >         > <profile namespace="globus" key="maxWallTime">00:15:00</profile><br>
>         >         > <profile namespace="globus" key="jobsPerNode">1</profile><br>
>         >         > <profile namespace="globus" key="nodeGranularity">64</profile><br>
>         >         > <profile namespace="globus" key="maxNodes">256</profile><br>
>         >         > <profile namespace="globus" key="queue">normal</profile><br>
>         >         > <profile namespace="karajan" key="jobThrottle">1</profile><br>
>         >         > <profile namespace="globus" key="project">TG-DBS080004N</profile><br>
>         >         > <profile namespace="globus" key="pe">16way</profile><br>
>         >         > <profile namespace="karajan" key="initialScore">10000</profile><br>
>         >         > <workdirectory>/work/00043/tg457040/sidgrid_out/skenny</workdirectory><br>
>         >         > </pool><br>
>         >         > </config><br>
>         >         ><br>
>         >         ><br>
>         >         ><br>
>         >         > On Mon, Oct 10, 2011 at 3:43 PM, Sarah Kenny <<a href="mailto:skenny@uchicago.edu">skenny@uchicago.edu</a>> wrote:<br>
>         >         ><br>
>         >         ><br>
>         >         > ok, thanks, got in the queue now...also, realized my last run may have<br>
>         >         > been using the old swift. apparently i had SWIFT_HOME set in my env<br>
>         >         > and that overrides the newer swift i had set in my PATH.<br>
>         >         ><br>
>         >         > ~sk<br>
>         >         ><br>
>         >         ><br>
>         >         ><br>
>         >         > On Mon, Oct 10, 2011 at 12:28 PM, David Kelly <<a href="mailto:davidk@ci.uchicago.edu">davidk@ci.uchicago.edu</a>> wrote:<br>
>         >         ><br>
>         >         ><br>
>         >         ><br>
>         >         ><br>
>         >         ><br>
>         >         > Sarah,<br>
>         >         ><br>
>         >         > Can you give this another try with the latest 0.93? I made some<br>
>         >         > changes to the coaster and sge providers and was able to get it<br>
>         >         > working with a simple catns script. Here is the configuration file I<br>
>         >         > was using:<br>
>         >         ><br>
>         >         > <config><br>
>         >         > <pool handle="ranger"><br>
>         >         > <execution provider="coaster" jobManager="gt2:SGE" url="gatekeeper.ranger.tacc.teragrid.org"/><br>
>         >         > <filesystem provider="gsiftp" url="gsiftp://gridftp.ranger.tacc.teragrid.org"/><br>
>         >         > <profile namespace="globus" key="maxtime">3600</profile><br>
>         >         > <profile namespace="globus" key="maxWallTime">00:00:03</profile><br>
>         >         > <profile namespace="globus" key="jobsPerNode">1</profile><br>
>         >         > <profile namespace="globus" key="nodeGranularity">16</profile><br>
>         >         > <profile namespace="globus" key="maxNodes">16</profile><br>
>         >         > <profile namespace="globus" key="queue">development</profile><br>
>         >         > <profile namespace="karajan" key="jobThrottle">0.9</profile><br>
>         >         > <profile namespace="globus" key="project">TG-DBS080004N</profile><br>
>         >         > <profile namespace="globus" key="pe">16way</profile><br>
>         >         > <workdirectory>/share/home/01503/davidkel/swiftwork</workdirectory><br>
>         >         > </pool><br>
>         >         > </config><br>
>         >         ><br>
>         >         > Thanks,<br>
>         >         ><br>
>         >         > David<br>
>         >         ><br>
>         >         > ----- Original Message -----<br>
>         >         ><br>
>         >         > > From: "Sarah Kenny" <<a href="mailto:skenny@uchicago.edu">skenny@uchicago.edu</a>><br>
>         >         > > To: "Justin M Wozniak" <<a href="mailto:wozniak@mcs.anl.gov">wozniak@mcs.anl.gov</a>><br>
>         >         > > Cc: "Swift Devel" <<a href="mailto:swift-devel@ci.uchicago.edu">swift-devel@ci.uchicago.edu</a>>, "Swift User" <<a href="mailto:swift-user@ci.uchicago.edu">swift-user@ci.uchicago.edu</a>><br>
>         >         > > Sent: Friday, October 7, 2011 3:13:57 PM<br>
>         >         > > Subject: Re: [Swift-user] gram on ranger<br>
>         >         > ><br>
>         >         > > /home/skenny/swift_logs/dummy-20111005-0126-6575n7x5.log<br>
>         >         > ><br>
>         >         > > on ci<br>
>         >         > ><br>
>         >         > ><br>
>         >         > > On Fri, Oct 7, 2011 at 8:16 AM, Justin M Wozniak <<a href="mailto:wozniak@mcs.anl.gov">wozniak@mcs.anl.gov</a>> wrote:<br>
>         >         > ><br>
>         >         > ><br>
>         >         > ><br>
>         >         > > Can I take a look at the log?<br>
>         >         > ><br>
>         >         > ><br>
>         >         > ><br>
>         >         > ><br>
>         >         > > On Thu, 6 Oct 2011, Sarah Kenny wrote:<br>
>         >         > ><br>
>         >         > ><br>
>         >         > ><br>
>         >         > > hey all, i'm trying to submit to gram on ranger using the latest swift<br>
>         >         > > (built from trunk). it fails like so:<br>
>         >         > ><br>
>         >         > > Cannot submit job<br>
>         >         > > Caused by: org.globus.cog.abstraction.impl.common.task.TaskSubmissionException: Cannot submit job<br>
>         >         > > Caused by: org.globus.gram.GramException: Parameter not supported<br>
>         >         > > Cannot submit job<br>
>         >         > ><br>
>         >         > > the gram log was saying first that 'jobsPerNode' is not supported so i<br>
>         >         > > changed it to workersPerNode and then it was saying 'maxnodes' is not<br>
>         >         > > supported. here's my sites file:<br>
>         >         > ><br>
>         >         > > <config><br>
>         >         > > <pool handle="RANGER"><br>
>         >         > > <profile namespace="karajan" key="initialScore">10000</profile><br>
>         >         > > <profile namespace="karajan" key="jobThrottle">1</profile><br>
>         >         > > <profile namespace="globus" key="maxWallTime">00:15:00</profile><br>
>         >         > > <profile namespace="globus" key="maxTime">86400</profile><br>
>         >         > > <profile namespace="globus" key="slots">1</profile><br>
>         >         > > <profile namespace="globus" key="maxNodes">256</profile><br>
>         >         > > <profile namespace="globus" key="pe">16way</profile><br>
>         >         > > <profile namespace="globus" key="workersPerNode">1</profile><br>
>         >         > > <profile namespace="globus" key="nodeGranularity">64</profile><br>
>         >         > > <profile namespace="globus" key="queue">normal</profile><br>
>         >         > > <profile namespace="globus" key="project">TG-DBS080004N</profile><br>
>         >         > > <filesystem provider="gsiftp" url="gsiftp://gridftp.ranger.tacc.teragrid.org"/><br>
>         >         > > <execution provider="coaster" jobManager="gt2:gt2:SGE" url="gatekeeper.ranger.tacc.teragrid.org"/><br>
>         >         > > <execution provider="gt2" jobManager="SGE" url="gatekeeper.ranger.tacc.teragrid.org"/><br>
>         >         > > <workdirectory>/work/00043/tg457040</workdirectory><br>
>         >         > > </pool><br>
>         >         > > </config><br>
>         >         > ><br>
>         >         > > thoughts? ideas?<br>
>         >         > ><br>
>         >         > > --<br>
>         >         > > Justin M Wozniak<br>
>         >         > ><br>
>         >         > ><br>
>         >         > ><br>
>         >         > > --<br>
>         >         > > Sarah Kenny<br>
>         >         > > Programmer ~ Brain Circuits Laboratory ~ Rm 2224 Bio Sci III<br>
>         >         > > University of California Irvine, Dept. of Neurology ~ 773-818-8300<br>
>         >         > ><br>
>         >         > ><br>
>         >         > > _______________________________________________<br>
>         >         > > Swift-user mailing list<br>
>         >         > > <a href="mailto:Swift-user@ci.uchicago.edu">Swift-user@ci.uchicago.edu</a><br>
>         >         > > <a href="https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user" target="_blank">https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user</a><br>
>         >         ><br>
>         >         ><br>
>         >         ><br>
>         >         ><br>
>         >         ><br>
>         >         ><br>
>         ><br>
>         ><br>
>         ><br>
>         ><br>
><br>
><br>
><br>
><br>
><br>
><br>
><br>
<br>
<br>
</div></div></blockquote></div><br><br clear="all"><br>-- <br>Sarah Kenny<br>Programmer ~ Brain Circuits Laboratory ~ Rm 2224 Bio Sci III<br>University of California Irvine, Dept. of Neurology ~ 773-818-8300<br><br>