<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class="">
I’ve looked more closely at the differences between the staging options and chose the “local” option for now, even though it is probably not the most efficient: it creates an unnecessarily large number of copies of the input files needed
for each app invocation.
<div class="">Speaking of which, in the User Guide (<a href="http://swift-lang.org/guides/trunk/userguide/userguide.html" class="">http://swift-lang.org/guides/trunk/userguide/userguide.html</a>), there is a section that states “The wrapper script creates the
application workspace directory; places the input files for that job into the application workspace directory
<b class="">using either cp or ln -s</b> (depending on a configuration option)…,” but I couldn’t find any more information on enabling the symlinking of input files. Is this associated with a specific type of staging or configuration?</div>
<div class=""><br class="">
</div>
<div class="">Jonathan</div>
<div class=""><br class="">
</div>
<div class="">
<div>
<blockquote type="cite" class="">
<div class="">On Dec 4, 2014, at 5:57 PM, Ozik, Jonathan <<a href="mailto:jozik@anl.gov" class="">jozik@anl.gov</a>> wrote:</div>
<br class="Apple-interchange-newline">
<div class="">I don’t see a definition in the user guide for the “staging: direct” option that’s included in the swift.conf file Yadu provided. I’m having a path name issue and I suspect it could have something to do with the staging, but I’m not
sure.<br class="">
<br class="">
Suppose I pass a “-upf=filename.txt” command line argument to a swift script that includes the lines:<br class="">
<br class="">
string upf_str = @arg("upf","unrolledParamFile.txt");<br class="">
file params_file <single_file_mapper;file=upf_str>;<br class="">
<br class="">
If I then call filename(params_file), would I get “filename.txt” with the default staging and the full path of the filename.txt file with the “direct” staging? Or is this a change between 0.95 RC5 and trunk?<br class="">
<br class="">
Jonathan<br class="">
<br class="">
<blockquote type="cite" class="">On Dec 4, 2014, at 2:58 PM, Ozik, Jonathan <<a href="mailto:jozik@anl.gov" class="">jozik@anl.gov</a>> wrote:<br class="">
<br class="">
Thank you all,<br class="">
<br class="">
The job is queued up now. I’ll update on the results.<br class="">
<br class="">
Jonathan<br class="">
<br class="">
<blockquote type="cite" class="">On Dec 4, 2014, at 1:33 PM, Michael Wilde <<a href="mailto:wilde@anl.gov" class="">wilde@anl.gov</a>> wrote:<br class="">
<br class="">
We should (and will) add a getcwd( ) library function to eliminate this <br class="">
particular need for java( ), though.<br class="">
<br class="">
- Mike<br class="">
<br class="">
<br class="">
On 12/4/14 1:23 PM, Yadu Nand Babuji wrote:<br class="">
<blockquote type="cite" class="">Hi Jonathan,<br class="">
<br class="">
I rebuilt the trunk package with Mihael's fixes, and you can get it from<br class="">
here :<br class="">
<a href="http://users.rcc.uchicago.edu/~yadunand/swift-trunk-latest.tar.gz" class="">http://users.rcc.uchicago.edu/~yadunand/swift-trunk-latest.tar.gz</a><br class="">
<br class="">
-Yadu<br class="">
<br class="">
On 12/04/2014 01:01 PM, Mihael Hategan wrote:<br class="">
<blockquote type="cite" class="">Hi Jonathan,<br class="">
<br class="">
I fixed this in GIT. Yadu, can you compile the latest GIT please?<br class="">
<br class="">
Mihael<br class="">
<br class="">
On Thu, 2014-12-04 at 18:33 +0000, Ozik, Jonathan wrote:<br class="">
<blockquote type="cite" class="">Hi Yadu,<br class="">
<br class="">
I’ve tried running with trunk and am getting a strange Java error this time:<br class="">
No method: getProperty in java.lang.System with parameter types[class java.lang.String]<br class="">
swiftscript:java @ repast, line: 267<br class="">
<br class="">
at org.griphyn.vdl.karajan.lib.swiftscript.Java.getMethod(Java.java:192)<br class="">
at org.griphyn.vdl.karajan.lib.swiftscript.Java.function(Java.java:162)<br class="">
at org.griphyn.vdl.karajan.lib.SwiftFunction.runBody(SwiftFunction.java:77)<br class="">
at org.globus.cog.karajan.compiled.nodes.InternalFunction.run(InternalFunction.java:175)<br class="">
at org.globus.cog.karajan.compiled.nodes.CompoundNode.runChild(CompoundNode.java:110)<br class="">
at org.globus.cog.karajan.compiled.nodes.InternalFunction.run(InternalFunction.java:165)<br class="">
at org.globus.cog.karajan.compiled.nodes.CompoundNode.runChild(CompoundNode.java:110)<br class="">
at org.globus.cog.karajan.compiled.nodes.InternalFunction.run(InternalFunction.java:165)<br class="">
at org.globus.cog.karajan.compiled.nodes.CompoundNode.runChild(CompoundNode.java:110)<br class="">
at org.globus.cog.karajan.compiled.nodes.Sequential.run(Sequential.java:41)<br class="">
at org.globus.cog.karajan.compiled.nodes.CompoundNode.runChild(CompoundNode.java:110)<br class="">
at org.globus.cog.karajan.compiled.nodes.UParallel$1.run(UParallel.java:91)<br class="">
at k.thr.LWThread.run(LWThread.java:247)<br class="">
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)<br class="">
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)<br class="">
at java.lang.Thread.run(Thread.java:745)<br class="">
<br class="">
Execution failed:<br class="">
Error attempting to use: java.lang.System<br class="">
swiftscript:java @ repast, line: 267<br class="">
<br class="">
I think this is being triggered by the call:<br class="">
string s = strcat(java("java.lang.System","getProperty","user.dir"),"/");<br class="">
<br class="">
Which worked just fine with 0.95 RC5.<br class="">
<br class="">
Any thoughts?<br class="">
<br class="">
Jonathan<br class="">
<br class="">
On Dec 4, 2014, at 11:14 AM, Yadu Nand Babuji <yadunand@uchicago.edu> wrote:<br class="">
<br class="">
Hi Jonathan,<br class="">
<br class="">
If your config file is named swift.conf and is in the current directory, it will be selected automatically and you needn't specify<br class="">
the file on the command line. Otherwise, specify the config file using the -config option:<br class="">
swift -config <path_to_config> <your_script.swift><br class="">
<br class="">
To resume from a log, say the restart.log in your run001 folder, pass it with the -resume option:<br class="">
swift -resume run001/restart.log ...<br class="">
The restart log is from an 0.95 run, and I'm not quite sure if it will work correctly with trunk.<br class="">
<br class="">
There is no trunk module available on Midway, since we rebuild from source to keep up to date with changes in the codebase.<br class="">
<br class="">
You can always get the latest trunk build here (at most a week older than the last commit):<br class="">
http://users.rcc.uchicago.edu/~yadunand/swift-trunk-latest.tar.gz<br class="">
<br class="">
Thanks,<br class="">
Yadu<br class="">
<br class="">
On 12/04/2014 10:48 AM, Jonathan Ozik wrote:<br class="">
Thanks Yadu,<br class="">
<br class="">
I have a few questions.<br class="">
- How do I invoke swift and pass it the new swift.conf?<br class="">
- What is the “restart” procedure?<br class="">
- Is there a module I can load to use the latest swift trunk?<br class="">
<br class="">
Jonathan<br class="">
<br class="">
On Dec 3, 2014, at 7:03 PM, Yadu Nand Babuji <yadunand@uchicago.edu> wrote:<br class="">
<br class="">
Hi Jonathan,<br class="">
<br class="">
I believe some of the timeout issues seen in your logs are fixed or less likely in trunk,<br class="">
and I would recommend that you try a run with it. I've also converted your swift.properties to<br class="">
the new swift.conf format. You can get a tested .conf file along with a small test case from here:<br class="">
<br class="">
http://users.rcc.uchicago.edu/~yadunand/test_configs_package.tar.gz<br class="">
<br class="">
Here are some changes I've made to the conf:<br class="">
lazyErrors: true and executionRetries: 0 so that long running jobs are not retried.<br class="">
staging set to direct, since you are running on the shared FS.<br class="">
added worker logging and an app definition for debug.<br class="">
<br class="">
You can get the latest trunk build from here: http://users.rcc.uchicago.edu/~yadunand/swift-trunk-latest.tar.gz<br class="">
<br class="">
Thanks,<br class="">
Yadu<br class="">
<br class="">
On 12/03/2014 01:16 PM, Jonathan Ozik wrote:<br class="">
Hi Yadu,<br class="">
<br class="">
The tar.gz archive is here: https://www.dropbox.com/s/tt3ewapzaf0ygac/run001.tar.gz?dl=0<br class="">
I’m also attaching the swift.properties file that I used below.<br class="">
<br class="">
Thank you,<br class="">
<br class="">
Jonathan<br class="">
<br class="">
On Dec 3, 2014, at 11:04 AM, Yadu Nand Babuji <yadunand@uchicago.edu> wrote:<br class="">
<br class="">
Hi Jonathan,<br class="">
<br class="">
The issue you are seeing sounds pretty close to what David reported a<br class="">
while back.<br class="">
Could you send us a tar ball of your run directory from a failed run ?<br class="">
<br class="">
Could you also check if you've set lowOverAllocation and<br class="">
highOverAllocation in your sites definition ?<br class="">
<br class="">
Thanks,<br class="">
Yadu<br class="">
<br class="">
On 12/03/2014 10:50 AM, Ozik, Jonathan wrote:<br class="">
Hi all,<br class="">
<br class="">
I’m trying to run a large set of simulations on Midway using Swift 0.95-RC5.<br class="">
768 of the 2187 tasks completed successfully and then I got the exception:<br class="">
<br class="">
exception @ swift-int.k, line: 530<br class="">
Caused by: Block task failed: Connection to worker lost<br class="">
org.globus.cog.coaster.TimeoutException: Channel timed out. lastTime=141203-145449.325, now=141203-145649.844, channel=TCPChannel [type: server, contact: 1202-5410010-000072-000000]<br class="">
at org.globus.cog.coaster.channels.AbstractCoasterChannel.checkTimeouts(AbstractCoasterChannel.java:133)<br class="">
at org.globus.cog.coaster.channels.AbstractCoasterChannel$1.run(AbstractCoasterChannel.java:124)<br class="">
at java.util.TimerThread.mainLoop(Timer.java:555)<br class="">
at java.util.TimerThread.run(Timer.java:505)<br class="">
<br class="">
Progress: Wed, 03 Dec 2014 14:59:51+0000 Submitted:651 Failed:6 Finished successfully:768 Failed but can retry:762<br class="">
Progress: Wed, 03 Dec 2014 14:59:52+0000 Submitted:651 Failed:44 Finished successfully:768 Failed but can retry:724<br class="">
<br class="">
And the process seems to have stopped.<br class="">
<br class="">
What log file would be helpful for diagnosing this?<br class="">
<br class="">
Jonathan<br class="">
<br class="">
<br class="">
_______________________________________________<br class="">
Swift-user mailing list<br class="">
Swift-user@ci.uchicago.edu<br class="">
https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user<br class="">
<br class="">
</blockquote>
</blockquote>
</blockquote>
<br class="">
-- <br class="">
Michael Wilde<br class="">
Mathematics and Computer Science Computation Institute<br class="">
Argonne National Laboratory The University of Chicago<br class="">
<br class="">
</blockquote>
<br class="">
</blockquote>
<br class="">
</div>
</blockquote>
</div>
<br class="">
</div>
</body>
</html>