<html>
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
Hi Jonathan,<br>
<br>
If your config file is named swift.conf and is in the current
directory, it is selected automatically and you needn't specify
it on the command line. Otherwise, pass the config file with
the -config option:<br>
swift -config &lt;path_to_config&gt; &lt;your_script.swift&gt;<br>
<br>
To resume from a restart log, such as the restart.log in your run001
folder, pass it with the -resume option:<br>
swift -resume run001/restart.log ...<br>
The restart log is from a 0.95 run, and I'm not sure it
will work correctly with trunk.<br>
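<br>
As a rough sketch only (this is not part of Swift), the two invocations
above could be wrapped in a small shell helper that prints the command
line it would run. The function name swift_cmd, the script name
sim.swift and the path conf/midway.conf are made up for illustration,
and passing -config and -resume in the same call is an assumption:<br>
<pre>
# Sketch only: prints the swift command line instead of running it.
# Usage: swift_cmd SCRIPT [CONFIG] [RESTART_LOG]
# swift_cmd, sim.swift and conf/midway.conf are hypothetical names.
swift_cmd() {
    script="$1"
    config="${2:-}"
    restart_log="${3:-}"
    cmd="swift"
    # No explicit config: rely on swift.conf in the current directory.
    if [ -n "$config" ]; then
        cmd="$cmd -config $config"
    fi
    # Resume from a restart log if one is given
    # (combining it with -config is an assumption).
    if [ -n "$restart_log" ]; then
        cmd="$cmd -resume $restart_log"
    fi
    echo "$cmd $script"
}

swift_cmd sim.swift
swift_cmd sim.swift conf/midway.conf
swift_cmd sim.swift "" run001/restart.log
</pre>
The first call leaves out -config and so depends on a swift.conf in the
current directory; the last one passes an empty config argument so that
only -resume is added.<br>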
<br>
There is no trunk module available on Midway, since we rebuild from
source to keep up to date with changes in the codebase.<br>
<br>
You can always get the latest trunk build here (at most a
week older than the last commit):<br>
<a class="moz-txt-link-freetext" href="http://users.rcc.uchicago.edu/~yadunand/swift-trunk-latest.tar.gz">http://users.rcc.uchicago.edu/~yadunand/swift-trunk-latest.tar.gz</a><br>
<br>
Thanks,<br>
Yadu<br>
<br>
<div class="moz-cite-prefix">On 12/04/2014 10:48 AM, Jonathan Ozik
wrote:<br>
</div>
<blockquote
cite="mid:F179939B-E2C7-45FA-9A98-6A0A46E6591E@gmail.com"
type="cite">
Thanks Yadu,
<div class=""><br class="">
</div>
<div class="">I have a few questions.</div>
<div class="">- How do I invoke swift and pass it the new
swift.conf?</div>
<div class="">- What is the “restart” procedure?</div>
<div class="">- Is there a module I can load to use the latest
swift trunk?</div>
<div class=""><br class="">
</div>
<div class="">Jonathan</div>
<div class=""><br class="">
<div>
<blockquote type="cite" class="">
<div class="">On Dec 3, 2014, at 7:03 PM, Yadu Nand Babuji
&lt;<a moz-do-not-send="true"
href="mailto:yadunand@uchicago.edu" class="">yadunand@uchicago.edu</a>&gt;
wrote:</div>
<br class="Apple-interchange-newline">
<div class="">
<div bgcolor="#FFFFFF" text="#000000" class=""> Hi
Jonathan,<br class="">
<br class="">
I believe some of the timeout issues seen in
your logs are fixed, or at least less likely, in trunk,
and I would recommend that you try a run with it. I've
also converted your swift.properties to
the new swift.conf format. You can get a tested .conf
file along with a small test case from here:<br class="">
<br class="">
<a moz-do-not-send="true" class="moz-txt-link-freetext"
href="http://users.rcc.uchicago.edu/%7Eyadunand/test_configs_package.tar.gz">http://users.rcc.uchicago.edu/~yadunand/test_configs_package.tar.gz</a><br
class="">
<br class="">
Here are some changes I've made to the conf:<br class="">
Set lazyErrors: true and executionRetries: 0 so that
long-running jobs are not retried.<br class="">
Set staging to direct, since you are running on the
shared filesystem.<br class="">
Added worker logging and an app definition for debugging.<br
class="">
<br class="">
You can get the latest trunk build from here: <a
moz-do-not-send="true" class="moz-txt-link-freetext"
href="http://users.rcc.uchicago.edu/%7Eyadunand/swift-trunk-latest.tar.gz">http://users.rcc.uchicago.edu/~yadunand/swift-trunk-latest.tar.gz</a><br
class="">
<br class="">
Thanks,<br class="">
Yadu<br class="">
<br class="">
<div class="moz-cite-prefix">On 12/03/2014 01:16 PM,
Jonathan Ozik wrote:<br class="">
</div>
<blockquote
cite="mid:040074E2-ADC1-45C0-8580-9926B8E64535@gmail.com"
type="cite" class="">
<div class="" style="word-wrap:break-word">Hi Yadu,
<div class=""><br class="">
</div>
<div class="">The tar.gz archive is here: <a
moz-do-not-send="true"
href="https://www.dropbox.com/s/tt3ewapzaf0ygac/run001.tar.gz?dl=0"
class="">https://www.dropbox.com/s/tt3ewapzaf0ygac/run001.tar.gz?dl=0</a></div>
<div class="">I’m also attaching the
swift.properties file that I used below.</div>
<div class=""><br class="">
</div>
<div class="">Thank you,</div>
<div class=""><br class="">
</div>
<div class="">Jonathan</div>
</div>
<div class="" style="word-wrap:break-word">
<div class=""><br class="">
<div class="">
<blockquote type="cite" class="">
<div class="">On Dec 3, 2014, at 11:04 AM,
Yadu Nand Babuji &lt;<a
moz-do-not-send="true"
href="mailto:yadunand@uchicago.edu"
class="">yadunand@uchicago.edu</a>&gt;
wrote:</div>
<br class="x_Apple-interchange-newline">
<div class="">Hi Jonathan,<br class="">
<br class="">
The issue you are seeing sounds pretty close
to what David reported a while back.<br class="">
Could you send us a tarball of your run
directory from a failed run?<br class="">
<br class="">
Could you also check whether you've set
lowOverAllocation and highOverAllocation in
your sites definition?<br class="">
<br class="">
Thanks,<br class="">
Yadu<br class="">
<br class="">
On 12/03/2014 10:50 AM, Ozik, Jonathan
wrote:<br class="">
<blockquote type="cite" class="">Hi all,<br
class="">
<br class="">
I’m trying to run a large set of
simulations on Midway using Swift
0.95-RC5.<br class="">
768 of the 2187 tasks completed
successfully and then I got the exception:<br
class="">
<br class="">
<span class="x_Apple-tab-span"
style="white-space:pre"></span>exception
@ swift-int.k, line: 530<br class="">
Caused by: Block task failed: Connection
to worker lost<br class="">
org.globus.cog.coaster.TimeoutException:
Channel timed out.
lastTime=141203-145449.325,
now=141203-145649.844, channel=TCPChannel
[type: server, contact:
1202-5410010-000072-000000]<br class="">
<span class="x_Apple-tab-span"
style="white-space:pre"></span>at
org.globus.cog.coaster.channels.AbstractCoasterChannel.checkTimeouts(AbstractCoasterChannel.java:133)<br
class="">
<span class="x_Apple-tab-span"
style="white-space:pre"></span>at
org.globus.cog.coaster.channels.AbstractCoasterChannel$1.run(AbstractCoasterChannel.java:124)<br
class="">
<span class="x_Apple-tab-span"
style="white-space:pre"></span>at
java.util.TimerThread.mainLoop(Timer.java:555)<br
class="">
<span class="x_Apple-tab-span"
style="white-space:pre"></span>at
java.util.TimerThread.run(Timer.java:505)<br
class="">
<br class="">
Progress: Wed, 03 Dec 2014 14:59:51+0000
Submitted:651 Failed:6 Finished
successfully:768 Failed but can retry:762<br
class="">
Progress: Wed, 03 Dec 2014 14:59:52+0000
Submitted:651 Failed:44 Finished
successfully:768 Failed but can retry:724<br
class="">
<br class="">
And the process seems to have stopped.<br
class="">
<br class="">
What log file would be helpful for
diagnosing this?<br class="">
<br class="">
Jonathan<br class="">
<br class="">
<br class="">
_______________________________________________<br class="">
Swift-user mailing list<br class="">
<a moz-do-not-send="true"
href="mailto:Swift-user@ci.uchicago.edu"
class="">Swift-user@ci.uchicago.edu</a><br
class="">
<a moz-do-not-send="true"
class="moz-txt-link-freetext"
href="https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user">https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user</a><br
class="">
</blockquote>
<br class="">
</div>
</blockquote>
</div>
<br class="">
</div>
</div>
</blockquote>
<br class="">
</div>
</div>
</blockquote>
</div>
<br class="">
</div>
</blockquote>
<br>
</body>
</html>