<html><head><style type='text/css'>p { margin: 0; }</style></head><body><div style='font-family: Times New Roman; font-size: 12pt; color: #000000'>Can you try this on PADS using small jobs in the fast queue?<br><br><div>I have not thought this all the way through, but perhaps coasters will honor maxtime and maxwalltime on any coaster block, even if it's not running on a batch scheduler. In that case, perhaps you can replicate the problem on the MCS pool or, better yet, on localhost.</div><div><br></div><div>In these runs, what were the values of the execution.retries and lazy.errors properties? Mihael, do those properties need to be set to &gt;0 and true, respectively, in order for coasters to start new blocks correctly, assuming that in some cases a job will run longer than its maxwalltime?</div><div><br></div><div>- Mike</div><div><div><br></div><hr id="zwchr"><blockquote style="border-left:2px solid rgb(16, 16, 255);margin-left:5px;padding-left:5px;"><b>From: </b>"Ketan Maheshwari" &lt;ketancmaheshwari@gmail.com&gt;<br><b>To: </b>"Michael Wilde" &lt;wilde@mcs.anl.gov&gt;<br><b>Cc: </b>"Papia Rizwan" &lt;papia.rizwan@gmail.com&gt;, "swift-devel Devel" &lt;swift-devel@ci.uchicago.edu&gt;<br><b>Sent: </b>Monday, August 22, 2011 10:32:31 AM<br><b>Subject: </b>Re: Blocker issue for 0.93: DSSAT script does not complete, 2nd coaster blocks dont start?<br><br>Mike,<div><br></div><div>If I recall correctly, Papia has always been running her DSSAT app with 0.92. She has not yet tried with 0.93. I too tried with 0.92, using her sites file settings.</div><div><br></div><div>I once tried it with 0.93 on PADS, but the job never got out of the queue and into the running state.</div>
<div><br></div><div>I will give it another try today, as PADS may simply have been too busy last week. As I recall, Jon was also struggling to get access.<br><br></div><div>Regards,</div><div>Ketan</div><div><br><div class="gmail_quote">
On Mon, Aug 22, 2011 at 10:24 AM, Michael Wilde <span dir="ltr">&lt;<a href="mailto:wilde@mcs.anl.gov" target="_blank">wilde@mcs.anl.gov</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
Papia, Ketan,<br>
<br>
In reviewing 0.93 work remaining with David, I remembered this issue.<br>
<br>
You both reported that the DSSAT application script doesn't finish on PADS: it seems not to start the second round of coaster blocks that it needs to complete (as I recall, but this may not be correct). This needs to be researched and filed as a bug (or, if an error in the sites spec turns out to be the problem, that error needs to be identified and made clear in the site guide).<br>
<br>
Possibly there is an issue with jobs failing at the end of the coaster blocks, and you don't have the necessary retry values set for the PADS site?<br>
<br>
We need an example run with logs and full details. Can you try to re-create this with a much smaller initial allocation, and see if coasters is transitioning from its initial blocks to the next blocks?<br>
<br>
Can you give this high priority for today?<br>
<br>
Thanks,<br>
<br>
- Mike<br>
</blockquote></div><br><br clear="all"><div><br></div>-- <br>Ketan<br><br><br>
</div>
</blockquote><br><span><br><br>-- <br><span name="x"></span>Michael Wilde<br>Computation Institute, University of Chicago<br>Mathematics and Computer Science Division<br>Argonne National Laboratory<br><span name="x"></span><br></span></div></div></body></html>