<p>Maybe try increasing the value in the .timeout file? I usually see something similar when the job exceeds the timeout value, and your output shows the monitor killing the test after it exceeded 500 seconds.</p>
<div class="gmail_quote">On Jan 10, 2011 6:17 PM, "Sarah Kenny" <<a href="mailto:skenny@uchicago.edu">skenny@uchicago.edu</a>> wrote:<br type="attribution">> so, i'm trying to get nightly.sh to run on pads with coasters and i'm not<br>
> quite sure where this is falling apart. so far the only thing i've edited is<br>> providers/ssh-pbs-coasters/sites.template.xml (allowing it to take the<br>> PROJECT and QUEUE variables). from what i can tell the sites.xml file does<br>
> get generated correctly but then according to the test output it times out<br>> during submission:<br>> <br>> [skenny@login1 tests]$ ./nightly.sh -c -g -s groups/group-ssh.sh<br>> RUNNING_IN:<br>> /ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/run-2011-01-10<br>
> HTML_OUTPUT:<br>> /ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/run-2011-01-10/tests-2011-01-10.html<br>> which: no ifconfig in<br>> (/ci/projects/cnari/apps/freesurfer64/bin:/ci/projects/cnari/apps/freesurfer64/fsfast/bin:/ci/projects/cnari/apps/freesurfer64/mni/bin:/ci/projects/cnari/usr/bin:/ci/projects/cnari/apps/afni:/ci/projects/cnari/apps/swift/bin:/soft/java-1.6.0_11-sun-r1/bin:/soft/java-1.6.0_11-sun-r1/jre/bin:/software/common/gx-map-0.5.3.3-r1/bin:/soft/apache-ant-1.7.1-r1/bin:/soft/condor-7.0.5-r1/bin:/soft/globus-4.2.1-r2/bin:/soft/globus-4.2.1-r2/sbin:/usr/kerberos/bin:/bin:/usr/bin:/usr/X11R6/bin:/usr/local/bin:/software/common/softenv-1.6.0-r1/bin:/home/skenny/bin/linux-rhel5-x86_64:/home/skenny/bin:/soft/maui-3.2.6p21-r1/bin:/soft/maui-3.2.6p21-r1/sbin:/soft/openmpi-1.4.2-gcc4.1-r1/bin)<br>
> GROUPLISTFILE: groups/group-ssh.sh<br>> <br>> Prolog: Build<br>> <br>> Executing (part 1)<br>> /ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests<br>> Executing (part 2)<br>> /ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift<br>
> 14815 pts/27 00:00:00 nightly.sh<br>> monitor(1): killing test process...<br>> touch: cannot touch `killed_test': Stale NFS file handle<br>> monitor(1): killed process_exec (TERM)<br>> process_exec_trap()<br>
> killing all swifts...<br>> ++ echo 13685<br>> 13685<br>> ++ ps -f<br>> UID PID PPID C STIME TTY TIME CMD<br>> skenny 14815 1 0 15:49 pts/27 00:00:00 /bin/bash ./nightly.sh -c -g<br>
> -s groups/group-ssh.sh<br>> skenny 14816 1 0 15:49 pts/27 00:00:00 /bin/bash ./nightly.sh -c -g<br>> -s groups/group-ssh.sh<br>> skenny 14879 1 0 15:49 pts/27 00:00:04 java -Xmx2048M<br>> -Djava.endorsed.dirs=/ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/bin/../lib/<br>
> skenny 15473 23767 0 15:55 pts/27 00:00:00 /bin/bash ./nightly.sh -c -g<br>> -s groups/group-ssh.sh<br>> skenny 15503 15473 7 15:55 pts/27 00:00:08<br>> /soft/java-1.6.0_11-sun-r1/jre/bin/java -classpath<br>
> /soft/apache-ant-1.7.1-r1/lib/ant-launcher.jar<br>> -Dant.home=/soft/apache-ant-1.7.1-r1 -Dant.<br>> skenny 15890 14815 0 15:57 pts/27 00:00:00 ps -f<br>> skenny 23767 23760 0 13:53 pts/27 00:00:00 -bash<br>
> ./nightly.sh: line 588: 14819 Killed "$@" > $OUTPUT 2>&1<br>> +++ ps -f<br>> +++ grep '.*java'<br>> +++ grep -v grep<br>> ++ kill_this skenny 14879 1 0 15:49 pts/27 00:00:04 java -Xmx2048M<br>
> -Djava.endorsed.dirs=/ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/bin/../lib/endorsed<br>> -DUID=1195 -DGLOBUS_TCP_PORT_RANGE=50000,51000 -DGLOBUS_HOSTNAME=<br>> login1.pads.ci.uchicago.edu-DCOG_INSTALL_PATH=/ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/bin/..<br>
> -Dswift.home=/ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/bin/..<br>> -Djava.security.egd=file:///dev/urandom -classpath<br>> /ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/bin/../etc:/ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/bin/../libexec:/ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/bin/../lib/addressing-1.0.jar:/ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/bin/../lib/antlr-2.7.5.jar:/ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/bin/../lib/axis.jar:/ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/bin/../lib/axis-url.jar:/ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/bin/../lib/backport-util-concurrent.jar:/ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/bin/../lib/coaster-bootstrap.jar:/ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/bin/../lib/cog-abstraction-common-2.4.jar:/ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/bin/../lib/cog-axis.jar:/ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/bin/../lib/cog-grapheditor-0.47.jar:/ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/bin/../lib/cog-jglobus-1.7.0.jar:/ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/bin/../lib/cog-karajan-0.36-dev.jar:/ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/bin/../lib/cog-provider-clref-gt4_0_0.jar:/ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/
bin/../lib/cog-provider-coaster-0.3.jar:/ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/bin/../lib/cog-provider-dcache-0.1.jar:/ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/bin/../lib/cog-provider-gt2-2.4.jar:/ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/bin/../lib/cog-provider-gt4_0_0-2.5.jar:/ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/bin/../lib/cog-provider-local-2.2.jar:/ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/bin/../lib/cog-provider-localscheduler-0.4.jar:/ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/bin/../lib/cog-provider-ssh-2.4.jar:/ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/bin/../lib/cog-provider-webdav-2.1.jar:/ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/bin/../lib/cog-resources-1.0.jar:/ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/bin/../lib/cog-swift-svn.jar:/ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/bin/../lib/cog-trap-1.0.jar:/ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/bin/../lib/cog-url.jar:/ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/bin/../lib/cog-util-0.92.jar:/ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/bin/../lib/commonj.jar:/ci/projects/cnari/soft/swift_latest/cog/mo<br>
> skenny 15503 15473 7 15:55 pts/27 00:00:08<br>> /soft/java-1.6.0_11-sun-r1/jre/bin/java -classpath<br>> /soft/apache-ant-1.7.1-r1/lib/ant-launcher.jar<br>> -Dant.home=/soft/apache-ant-1.7.1-r1<br>> -Dant.library.dir=/soft/apache-ant-1.7.1-r1/lib<br>
> org.apache.tools.ant.launch.Launcher -cp :./ -quiet dist<br>> ++ '[' -n 14879 ']'<br>> ++ /bin/kill -KILL 14879<br>> ++ set +x<br>> Executing Package (part 3)<br>> /ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift<br>
> Executing Package (part 4)<br>> /ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/lib<br>> Executing Package (part 5)<br>> /ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/lib<br>
> Executing Package (part 6)<br>> /ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests<br>> /ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/bin/swift<br>> <br>
> Part 1: SSH with PBS and Coasters Configuration Test<br>> <br>> Using:<br>> /ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/tests/providers/ssh-pbs-coasters/sites.template.xml<br>
> Using:<br>> /ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/tests/providers/ssh-pbs-coasters/tc.template.data<br>> `/ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/etc/swift.properties'<br>
> -> `./swift.properties'<br>> `/ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/tests/providers/ssh-pbs-coasters/001-catsn-ssh-pbs-coasters.swift'<br>> -> `./001-catsn-ssh-pbs-coasters.swift'<br>
> Executing<br>> /ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/tests/providers/ssh-pbs-coasters/001-catsn-ssh-pbs-coasters.swift<br>> (part 1)<br>> 16623 pts/27 00:00:00 nightly.sh<br>
> monitor(1): killing test process...<br>> monitor(1): killed process_exec (TERM)<br>> process_exec_trap()<br>> killing all swifts...<br>> ++ echo 15473<br>> 15473<br>> ++ ps -f<br>> UID PID PPID C STIME TTY TIME CMD<br>
> skenny 15473 23767 0 15:55 pts/27 00:00:00 /bin/bash ./nightly.sh -c -g<br>> -s groups/group-ssh.sh<br>> skenny 16623 15473 0 15:58 pts/27 00:00:00 /bin/bash ./nightly.sh -c -g<br>> -s groups/group-ssh.sh<br>
> skenny 16624 15473 0 15:58 pts/27 00:00:00 /bin/bash ./nightly.sh -c -g<br>> -s groups/group-ssh.sh<br>> skenny 16687 1 0 15:58 pts/27 00:00:04 java -Xmx2048M<br>> -Djava.endorsed.dirs=/ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/bin/../lib/<br>
> skenny 17414 16623 0 16:06 pts/27 00:00:00 ps -f<br>> skenny 23767 23760 0 13:53 pts/27 00:00:00 -bash<br>> ./nightly.sh: line 588: 16627 Killed "$@" > $OUTPUT 2>&1<br>
> +++ ps -f<br>> +++ grep '.*java'<br>> +++ grep -v grep<br>> ++ kill_this skenny 16687 1 0 15:58 pts/27 00:00:04 java -Xmx2048M<br>> -Djava.endorsed.dirs=/ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/bin/../lib/endorsed<br>
> -DUID=1195 -DGLOBUS_TCP_PORT_RANGE=50000,51000 -DGLOBUS_HOSTNAME=<br>> login1.pads.ci.uchicago.edu-DCOG_INSTALL_PATH=/ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/bin/..<br>
> -Dswift.home=/ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/bin/..<br>> -Djava.security.egd=file:///dev/urandom -classpath<br>> /ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/bin/../etc:/ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/bin/../libexec:/ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/bin/../lib/addressing-1.0.jar:/ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/bin/../lib/antlr-2.7.5.jar:/ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/bin/../lib/axis.jar:/ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/bin/../lib/axis-url.jar:/ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/bin/../lib/backport-util-concurrent.jar:/ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/bin/../lib/coaster-bootstrap.jar:/ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/bin/../lib/cog-abstraction-common-2.4.jar:/ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/bin/../lib/cog-axis.jar:/ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/bin/../lib/cog-grapheditor-0.47.jar:/ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/bin/../lib/cog-jglobus-1.7.0.jar:/ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/bin/../lib/cog-karajan-0.36-dev.jar:/ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/bin/../lib/cog-provider-clref-gt4_0_0.jar:/ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/
bin/../lib/cog-provider-coaster-0.3.jar:/ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/bin/../lib/cog-provider-dcache-0.1.jar:/ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/bin/../lib/cog-provider-gt2-2.4.jar:/ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/bin/../lib/cog-provider-gt4_0_0-2.5.jar:/ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/bin/../lib/cog-provider-local-2.2.jar:/ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/bin/../lib/cog-provider-localscheduler-0.4.jar:/ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/bin/../lib/cog-provider-ssh-2.4.jar:/ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/bin/../lib/cog-provider-webdav-2.1.jar:/ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/bin/../lib/cog-resources-1.0.jar:/ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/bin/../lib/cog-swift-svn.jar:/ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/bin/../lib/cog-trap-1.0.jar:/ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/bin/../lib/cog-url.jar:/ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/bin/../lib/cog-util-0.92.jar:/ci/projects/cnari/soft/swift_latest/cog/modules/swift/tests/cog/modules/swift/dist/swift-svn/bin/../lib/commonj.jar:/ci/projects/cnari/soft/swift_latest/cog/mo<br>
> ++ '[' -n 16687 ']'<br>> ++ /bin/kill -KILL 16687<br>> ++ set +x<br>> kill 16624: No such process<br>> TOOK: 500<br>> FAILED<br>> Swift svn swift-r3921 (swift modified locally) cog-r3013<br>
> <br>> RunID: 20110110-1558-ojtlnxfb<br>> Progress:<br>> Progress: Selecting site:9 Initializing site shared directory:1<br>> Progress: Selecting site:9 Initializing site shared directory:1<br>> Progress: Selecting site:9 Initializing site shared directory:1<br>
> Progress: Selecting site:9 Initializing site shared directory:1<br>> Progress: Selecting site:9 Initializing site shared directory:1<br>> Progress: Selecting site:9 Initializing site shared directory:1<br>
> Progress: Selecting site:9 Initializing site shared directory:1<br>> Progress: Selecting site:9 Initializing site shared directory:1<br>> Progress: Selecting site:9 Initializing site shared directory:1<br>
> Progress: Selecting site:9 Initializing site shared directory:1<br>> Progress: Selecting site:9 Initializing site shared directory:1<br>> Progress: Selecting site:9 Initializing site shared directory:1<br>
> Progress: Selecting site:9 Initializing site shared directory:1<br>> Progress: Selecting site:9 Initializing site shared directory:1<br>> Progress: Selecting site:9 Initializing site shared directory:1<br>
> Progress: Selecting site:9 Initializing site shared directory:1<br>> nightly.sh: monitor(1): killed: exceeded 500 seconds<br>> FAILED<br>> ++ seq --format %04.f 1 1 10<br>> + for count in '`seq --format "%04.f" 1 1 10`'<br>
> + '[' -f catsn.0001.out ']'<br>> + exit 1<br>> <br>> ----------------------------------------------------------------<br>> <br>> i'm running this directly on the pads login and seeing this in the swift<br>
> log:<br>> <br>> 2011-01-10 16:33:18,539-0600 INFO TransportProtocolCommon The Transport<br>> Protocol thread<br>> failed<br>> <br>> java.io.IOException: The socket is<br>> EOF<br>> <br>> at<br>
> com.sshtools.j2ssh.transport.TransportProtocolInputStream.readBufferedData(TransportProtocolInputStream.java:183)<br>> <br>> at<br>> com.sshtools.j2ssh.transport.TransportProtocolInputStream.readMessage(TransportProtocolInputStream.java:226)<br>
> <br>> at<br>> com.sshtools.j2ssh.transport.TransportProtocolCommon.processMessages(TransportProtocolCommon.java:1440)<br>> <br>> at<br>> com.sshtools.j2ssh.transport.TransportProtocolCommon.startBinaryPacketProtocol(TransportProtocolCommon.java:1034)<br>
> <br>> at<br>> com.sshtools.j2ssh.transport.TransportProtocolCommon.run(TransportProtocolCommon.java:393)<br>> <br>> at<br>> java.lang.Thread.run(Thread.java:619)<br>> <br>> <br>> <br>
> you can view the test output here:<br>> <br>> <a href="http://www.ci.uchicago.edu/~skenny/swift_tests/run-2011-01-10/tests-2011-01-10.html">http://www.ci.uchicago.edu/~skenny/swift_tests/run-2011-01-10/tests-2011-01-10.html</a><br>
> <br>> anyway, thought i'd post this in case there's something that might jump out<br>> at any of you that i can tweak...<br>> <br>> ~sk<br></div>