[Swift-devel] Re: coaster on EC2 (error log)
Yi Zhu
yizhu at cs.uchicago.edu
Tue Apr 27 13:18:33 CDT 2010
Hi,
I ran into a problem when I set the provider to coaster. I got the
following error:
Progress:
Progress: Stage in:1
Progress: Submitted:1
Failed to transfer wrapper log from first-20100427-1308-o7o8r3c1/info/d
on ec2
Execution failed:
Exception in echo:
Arguments: [Hello, world!]
Host: ec2
Directory: first-20100427-1308-o7o8r3c1/jobs/d/echo-dfktx5rj
stderr.txt:
stdout.txt:
----
Caused by:
Could not submit job
Caused by:
Could not start coaster service
Caused by:
Task ended before registration was received.
STDOUT:
STDERR:
Caused by:
Job failed with an exit code of 1
Cleaning up...
Done
-bash-3.2$
I also checked the coaster log on the server node. It shows that it needs a
binary called /gmd5sum/. I searched Google and found that gmd5sum is a
Windows-based jar file; maybe coaster needs /md5sum/ instead of
/gmd5sum/? (md5sum is installed on the host server by default.)
[torqueuser at ip-10-251-214-179 ~]$ cat coaster-bootstrap-11894108087.log
using plain mode
BS: http://tp-login2.ci.uchicago.edu:37470
which: no gmd5sum in
(/opt/vdt-1.10.1/gums/scripts:/opt/vdt-1.10.1/prima/bin:/opt/vdt-1.10.1/cert-scripts/bin:/opt/vdt-1.10.1/glite/sbin:/opt/vdt-1.10.1/glite/bin:/opt/vdt-1.10.1/jdk1.5/bin:/opt/vdt-1.10.1/edg/sbin:/opt/vdt-1.10.1/gip/bin:/opt/vdt-1.10.1/gpt/sbin:/opt/vdt-1.10.1/globus/bin:/opt/vdt-1.10.1/globus/sbin:/opt/vdt-1.10.1/wget/bin:/opt/vdt-1.10.1/logrotate/sbin:/opt/vdt-1.10.1/perl/bin:/opt/pacman-3.26/bin:/opt/vdt-1.10.1/vdt/sbin:/opt/vdt-1.10.1/vdt/bin:/opt/pacman-3.26/bin:/usr/local/bin:/bin:/usr/bin)
Expected checksum: acab90e149a0188fbc963803a42156c5
Computed checksum: acab90e149a0188fbc963803a42156c5
JAVA=/opt/vdt-1.10.1/jdk1.5/bin/java
plain /opt/vdt-1.10.1/jdk1.5/bin/java
-Djava=/opt/vdt-1.10.1/jdk1.5/bin/java -DGLOBUS_TCP_PORT_RANGE=
-DX509_USER_PROXY= -DX509_CERT_DIR=/opt/vdt-1.10.1/globus/TRUSTED_CA
-DGLOBUS_HOSTNAME=ec2-204-236-204-71.compute-1.amazonaws.com -jar
/tmp/bootstrap.Y19911 http://tp-login2.ci.uchicago.edu:37470
https://128.135.125.117:35183 11894108087
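To make the question concrete, here is a minimal sketch of how a bootstrap script might choose its checksum tool. This is only my guess, not the actual coaster bootstrap code: the function name pick_checksum_tool and the preference order (gmd5sum first, then md5sum) are assumptions. If the real script works like this, the "which: no gmd5sum" line would be harmless, which would fit the matching Expected/Computed checksum lines above.

```shell
#!/bin/sh
# Hypothetical helper (not taken from the coaster bootstrap script):
# print the first command from the argument list that exists on PATH,
# or return 1 if none of them is found.
pick_checksum_tool() {
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
            return 0
        fi
    done
    return 1
}

# Assumed preference order: gmd5sum first, then plain md5sum.
if tool=$(pick_checksum_tool gmd5sum md5sum); then
    echo "checksum tool: $tool"
else
    echo "no checksum tool found" >&2
fi
```

Running this on the EC2 head node should show which of the two names is actually available there.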
The sites.xml file I used:
<pool handle="ec2">
<execution provider="coaster" url="ec2-204-236-204-71.compute-1.amazonaws.com" jobmanager="ssh:pbs"/>
<profile namespace="globus" key="workersPerNode">1</profile>
<profile namespace="globus" key="slots">1</profile>
<profile namespace="globus" key="nodeGranularity">5</profile>
<profile namespace="globus" key="maxNodes">5</profile>
<profile namespace="karajan" key="jobThrottle">1</profile>
<profile namespace="karajan" key="initialScore">10000</profile>
<filesystem provider="ssh" url="ec2-204-236-204-71.compute-1.amazonaws.com"/>
<workdirectory>/home/torqueuser/swiftwork</workdirectory>
</pool>
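Since this pool uses jobmanager="ssh:pbs", the head node presumably needs java and the PBS client tools reachable from a non-interactive ssh session. Below is a rough preflight sketch for checking that. The helper name missing_tools and the tool list (java, md5sum, qsub) are my assumptions, not something from the Swift documentation.

```shell
#!/bin/sh
# Hypothetical preflight helper: print each named command that is
# NOT found on PATH, one per line. Prints nothing if all are found.
missing_tools() {
    for t in "$@"; do
        command -v "$t" >/dev/null 2>&1 || printf '%s\n' "$t"
    done
}

# Assumed requirements: java for the coaster service, md5sum for the
# bootstrap checksum, qsub for PBS job submission.
missing=$(missing_tools java md5sum qsub)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "missing on this host:" $missing
else
    echo "all required tools found"
fi
```

Saved as, say, preflight.sh (a hypothetical file name), it could be run remotely with something like `ssh torqueuser@ec2-204-236-204-71.compute-1.amazonaws.com 'sh -s' < preflight.sh`. Note that the PATH in a non-interactive ssh session may differ from the one in a login shell.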
-Yi
On 4/26/2010 7:42 PM, Michael Wilde wrote:
> Sounds great, Yi - thanks for the update, and David for the assistance.
>
> Yes, we can meet tomorrow. Is 3:30 - 4:30 OK?
>
> In the meantime, can you send me the sites.xml file you used for coasters, and point me to the directory that contains stdout/err and the swift .log file for the failing run? (Please send this to swift-devel with a description of what you did. That way Mihael and others can help as well.)
>
> Thanks,
>
> Mike
>
>
> ----- "Yi Zhu"<yizhu at cs.uchicago.edu> wrote: