Emalayan,<div><br></div><div>I would check all the mappers and the resulting paths in the Swift source. </div><div><br></div><div>Also try running the failed job by hand, something like this: </div><div><br></div><div>cd <swift.workdir>/<span style="font-family:'times new roman','new york',times,serif;font-size:16px;font-style:italic;background-color:rgb(255,255,255)">SwiftMontage-20120119-1749-</span><span style="font-family:'times new roman','new york',times,serif;font-size:16px;font-style:italic;background-color:rgb(255,255,255)">rjshh1r9/jobs/b/mConcatFit-</span><span style="font-family:'times new roman','new york',times,serif;font-size:16px;font-style:italic;background-color:rgb(255,255,255)">b1sa4vlk</span></div>
<div><font face="'times new roman', 'new york', times, serif" size="3"><i><br></i></font></div><div><span style="font-family:'times new roman','new york',times,serif;font-size:16px;font-style:italic;background-color:rgb(255,255,255)">mConcatFit </span><span style="background-color:rgb(255,255,255);font-family:'times new roman','new york',times,serif;font-size:16px;font-style:italic">_concurrent/status_tbl-</span><span style="background-color:rgb(255,255,255);font-family:'times new roman','new york',times,serif;font-size:16px;font-style:italic">7a8340c2-045d-4039-a77c-</span><span style="background-color:rgb(255,255,255);font-family:'times new roman','new york',times,serif;font-size:16px;font-style:italic">00429b78d9c9-5 fits.tbl stat_dir</span></div>
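<div><br></div><div>Before re-running the job by hand, it may be worth confirming that the job directory exists and is writable from the node you are on. A minimal sketch, assuming a POSIX shell; check_dir is a hypothetical helper name, not part of Swift, and the argument is whatever your own swift.workdir resolves to:</div>
<pre>
```shell
# Hypothetical helper (not part of Swift): report whether a directory
# exists and is writable by the current user on the current node.
check_dir() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
    elif [ ! -w "$dir" ]; then
        echo "not writable: $dir"
    else
        echo "ok: $dir"
    fi
}

# Usage: pass your own work directory (SWIFT_WORKDIR is a placeholder
# variable you would set yourself), e.g.
#   check_dir "$SWIFT_WORKDIR"
```
</pre>
<div>Running it on a worker node rather than on the submit host gives a more direct answer, since the two may not see the directory the same way.</div>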
<div><br></div><div>Error 520 indicates that the workers are not able to reach the data.</div><div><br></div><div>Also check that swift.workdir on the site is writable by the worker nodes.</div><div><br><div class="gmail_quote">On Thu, Jan 19, 2012 at 7:55 PM, Emalayan Vairavanathan <span dir="ltr"><<a href="mailto:svemalayan@yahoo.com">svemalayan@yahoo.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div style="color:#000;background-color:#fff;font-family:times new roman,new york,times,serif;font-size:12pt"><div>
<span>Hi Ketan,</span></div><div><br><span></span></div><div><span>This was with </span><span style="font-weight:bold">swift-0.92.1.</span><span><span> Now I have downloaded the latest Swift 0.93 and </span>am getting totally different error messages. I can ask Jon about these messages. (These scripts were working well with Swift alone.)<br>
</span></div><div><br><span></span></div><div><span>Please let me know if you have any idea. <br></span></div><div><br><span></span></div><div><span>Regards</span></div><div><span>Emalayan<br></span></div><div><br><span></span></div>
<div><span>===============================================================================================<br></span></div><div><span><span style="font-style:italic">Swift 0.93 swift-r5501 cog-r3350</span><br style="font-style:italic">
<br style="font-style:italic"><span style="font-style:italic">RunID: 20120119-1749-rjshh1r9</span><br style="font-style:italic"><span style="font-style:italic"> (input): found 10 files</span><br style="font-style:italic">
<span style="font-style:italic">Progress:  time: Thu, 19 Jan 2012 17:49:20 -0800</span><br style="font-style:italic"><span style="font-style:italic">Find: <a href="http://localhost:1984" target="_blank">http://localhost:1984</a></span><br style="font-style:italic">
<span style="font-style:italic">Find:  keepalive(120), reconnect - <a href="http://localhost:1984" target="_blank">http://localhost:1984</a></span><br style="font-style:italic"><span style="font-style:italic">Progress:  time: Thu, 19 Jan 2012 17:49:22 -0800  Stage in:1  Submitted:9</span><br style="font-style:italic">
<span style="font-style:italic">Progress:  time: Thu, 19 Jan 2012 17:49:25 -0800  Active:9  Stage out:1</span><br style="font-style:italic"><span style="font-style:italic">Progress:  time:
 Thu, 19 Jan 2012 17:49:26 -0800  Stage out:3  Finished successfully:7</span><br style="font-style:italic"><span style="font-style:italic">Progress:  time: Thu, 19 Jan 2012 17:49:28 -0800  Active:1  Finished successfully:10</span><br style="font-style:italic">
<span style="font-style:italic">Progress:  time: Thu, 19 Jan 2012 17:49:29 -0800  Stage in:1  Submitting:11  Submitted:6  Finished successfully:12</span><br style="font-style:italic"><span style="font-style:italic">Progress:  time: Thu, 19 Jan 2012 17:49:30 -0800  Stage in:4  Submitted:1  Active:6  Stage out:2  Finished successfully:17</span><br style="font-style:italic">
<span style="font-style:italic">Progress:  time: Thu, 19 Jan 2012 17:49:31 -0800  Active:1  Finished successfully:30</span><br style="font-style:italic"><span style="font-style:italic">Exception in mConcatFit:</span><br style="font-style:italic">
<span style="font-style:italic">Arguments: [_concurrent/status_tbl-7a8340c2-045d-4039-a77c-00429b78d9c9-5, fits.tbl, stat_dir]</span><br style="font-style:italic"><span style="font-style:italic">Host: localhost</span><br style="font-style:italic">
<span style="font-style:italic">Directory: SwiftMontage-20120119-1749-rjshh1r9/jobs/b/mConcatFit-b1sa4vlk</span><br style="font-style:italic"><span style="font-style:italic">- - -</span><br style="font-style:italic"><br style="font-style:italic">
<span style="font-style:italic">Caused by: null</span><br style="font-style:italic"><span style="font-style:italic">Caused by: org.globus.cog.abstraction.impl.common.execution.JobException: Job failed with an exit code of 520</span><br style="font-style:italic">
<span style="font-style:italic">Execution failed:</span><br style="font-style:italic"><span style="font-style:italic">    back_list:Table
 = org.griphyn.vdl.mapping.DataDependentException - Closed not derived due to errors in data dependencies</span><br></span></div><div class="hm"><div><br></div>  </div><div style="font-family:times new roman,new york,times,serif;font-size:12pt">
<div class="hm"> </div><div style="font-family:times new roman,new york,times,serif;font-size:12pt"><div class="hm"> <div dir="ltr"> <font face="Arial"> <hr size="1">  <b><span style="font-weight:bold">From:</span></b> Ketan Maheshwari <<a href="mailto:ketancmaheshwari@gmail.com" target="_blank">ketancmaheshwari@gmail.com</a>><br>
 <b><span style="font-weight:bold">To:</span></b> Emalayan Vairavanathan <<a href="mailto:svemalayan@yahoo.com" target="_blank">svemalayan@yahoo.com</a>> <br><b><span style="font-weight:bold">Cc:</span></b> swift user <<a href="mailto:swift-user@ci.uchicago.edu" target="_blank">swift-user@ci.uchicago.edu</a>> <br>
 <b><span style="font-weight:bold">Sent:</span></b> Thursday, 19 January 2012 4:49 PM<br> <b><span style="font-weight:bold">Subject:</span></b> Re: [Swift-user] Montage+Swift+Coasters<br> </font> </div></div><div><div></div>
<div class="h5"> <br><div>Emalayan,<div><br></div><div>From your symptoms, it seems you are facing the same issue I have been facing. Could you tell me more about the amount of data that needs to be staged to run the Montage stages during which these warnings turn up? How long after the start of your workflow do these messages appear?<br>

<br>Also, what version of Swift is this?</div><div><br></div><div>Regards,</div><div>Ketan</div><div><br><div>On Thu, Jan 19, 2012 at 5:51 PM, Emalayan Vairavanathan <span dir="ltr"><<a rel="nofollow" href="mailto:svemalayan@yahoo.com" target="_blank">svemalayan@yahoo.com</a>></span> wrote:<br>

<blockquote style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div style="color:#000;background-color:#fff;font-family:times new roman,new york,times,serif;font-size:12pt"><div>
<span>Dear All,</span></div>

<div><br>
  <span></span></div>

<div><span>I have a problem running Montage with Coasters (<span style="font-style:italic">in our local cluster - no batch schedulers</span>). After a few stages the Swift runtime continuously prints the warnings below. Any ideas? Should I increase the heartbeat count?<br>

</span></div><div><span><br></span></div><div><span>Everything works fine when I try to run the same montage-scripts with swift on a single machine.<br></span></div><div><br><span></span></div><div><span>Thank you</span></div>

<div><span>Emalayan<br></span></div><div><span><br>
  </span></div>

<div><br>
  <span></span></div>

<div style="font-style:italic"><span>2012-01-19 15:38:09,207-0800 WARN  Command Command(119, HEARTBEAT): handling reply timeout; sendReqTime=120119-153609.206, sendTime=120119-153609.206, now=120119-153809.207<br>
2012-01-19 15:38:09,207-0800 INFO  Command Command(119, HEARTBEAT): re-sending<br>
2012-01-19 15:38:09,209-0800 WARN  Command Command(119, HEARTBEAT)fault was: Reply timeout<br>
org.globus.cog.karajan.workflow.service.ReplyTimeoutException<br>
        at org.globus.cog.karajan.workflow.service.commands.Command.handleReplyTimeout(Command.java:288)<br>
        at org.globus.cog.karajan.workflow.service.commands.Command$Timeout.run(Command.java:293)<br>
        at java.util.TimerThread.mainLoop(Timer.java:534)<br>
        at java.util.TimerThread.run(Timer.java:484)</span></div>
</div></div><br>_______________________________________________<br>
Swift-user mailing list<br>
<a rel="nofollow" href="mailto:Swift-user@ci.uchicago.edu" target="_blank">Swift-user@ci.uchicago.edu</a><br>
<a rel="nofollow" href="https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user" target="_blank">https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user</a><br></blockquote></div><br><br clear="all"><div>
<br></div>-- <br>
Ketan<br><br><br>
</div>
</div><br><br> </div></div></div> </div>  </div></div></blockquote></div><br><br clear="all"><div><br></div>-- <br>Ketan<br><br><br>
</div>