[Swift-devel] Re: coaster on EC2 (error log)

Yi Zhu yizhu at cs.uchicago.edu
Tue Apr 27 13:18:33 CDT 2010


Hi,

I ran into a problem when I set the provider to coaster. I got the 
following error:

Progress:
Progress:  Stage in:1
Progress:  Submitted:1
Failed to transfer wrapper log from first-20100427-1308-o7o8r3c1/info/d 
on ec2
Execution failed:
         Exception in echo:
Arguments: [Hello, world!]
Host: ec2
Directory: first-20100427-1308-o7o8r3c1/jobs/d/echo-dfktx5rj
stderr.txt:

stdout.txt:

----

Caused by:
         Could not submit job
Caused by:
         Could not start coaster service
Caused by:
         Task ended before registration was received.
STDOUT:
STDERR:
Caused by:
         Job failed with an exit code of 1
Cleaning up...
  Done
-bash-3.2$

I also checked the coaster log on the server node. It shows that the 
bootstrap looks for a binary called /gmd5sum/. I searched Google and found 
that gmd5sum is a Windows-based jar file. Maybe coaster needs /md5sum/ 
instead of /gmd5sum/? (md5sum is installed on the host server by default.)

[torqueuser at ip-10-251-214-179 ~]$ cat coaster-bootstrap-11894108087.log
using plain mode
BS: http://tp-login2.ci.uchicago.edu:37470
which: no gmd5sum in 
(/opt/vdt-1.10.1/gums/scripts:/opt/vdt-1.10.1/prima/bin:/opt/vdt-1.10.1/cert-scripts/bin:/opt/vdt-1.10.1/glite/sbin:/opt/vdt-1.10.1/glite/bin:/opt/vdt-1.10.1/jdk1.5/bin:/opt/vdt-1.10.1/edg/sbin:/opt/vdt-1.10.1/gip/bin:/opt/vdt-1.10.1/gpt/sbin:/opt/vdt-1.10.1/globus/bin:/opt/vdt-1.10.1/globus/sbin:/opt/vdt-1.10.1/wget/bin:/opt/vdt-1.10.1/logrotate/sbin:/opt/vdt-1.10.1/perl/bin:/opt/pacman-3.26/bin:/opt/vdt-1.10.1/vdt/sbin:/opt/vdt-1.10.1/vdt/bin:/opt/pacman-3.26/bin:/usr/local/bin:/bin:/usr/bin)
Expected checksum: acab90e149a0188fbc963803a42156c5
Computed checksum: acab90e149a0188fbc963803a42156c5
JAVA=/opt/vdt-1.10.1/jdk1.5/bin/java
plain /opt/vdt-1.10.1/jdk1.5/bin/java 
-Djava=/opt/vdt-1.10.1/jdk1.5/bin/java -DGLOBUS_TCP_PORT_RANGE= 
-DX509_USER_PROXY= -DX509_CERT_DIR=/opt/vdt-1.10.1/globus/TRUSTED_CA 
-DGLOBUS_HOSTNAME=ec2-204-236-204-71.compute-1.amazonaws.com -jar 
/tmp/bootstrap.Y19911 http://tp-login2.ci.uchicago.edu:37470 
https://128.135.125.117:35183 11894108087
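
One way to confirm which checksum tool the server node actually provides is to probe the PATH directly. The snippet below is only a sketch under stated assumptions: the helper name `first_available` is hypothetical and is not part of Swift or the coaster bootstrap script, and a POSIX-compatible shell is assumed. Note that the log above reports identical expected and computed checksums, so the missing gmd5sum may not be the reason the service fails to start.

```shell
# Hypothetical helper (not part of Swift or coasters): print the first
# command name from the argument list that can be found on PATH.
first_available() {
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
            return 0
        fi
    done
    # None of the candidates exist on PATH.
    return 1
}

# Probe for the two names mentioned in this message.
if md5tool=$(first_available gmd5sum md5sum); then
    echo "checksum tool found: $md5tool"
else
    echo "neither gmd5sum nor md5sum is on PATH"
fi
```

Running this as the same user that starts the coaster service (torqueuser here) shows which of the two names, if any, resolves in that environment.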

The sites.xml file I used:


<pool handle="ec2">
<execution provider="coaster" url="ec2-204-236-204-71.compute-1.amazonaws.com" jobmanager="ssh:pbs"/>

<profile namespace="globus" key="workersPerNode">1</profile>
<profile namespace="globus" key="slots">1</profile>
<profile namespace="globus" key="nodeGranularity">5</profile>
<profile namespace="globus" key="maxNodes">5</profile>
<profile namespace="karajan" key="jobThrottle">1</profile>
<profile namespace="karajan" key="initialScore">10000</profile>

<filesystem provider="ssh" url="ec2-204-236-204-71.compute-1.amazonaws.com"/>
<workdirectory>/home/torqueuser/swiftwork</workdirectory>
</pool>




-Yi


On 4/26/2010 7:42 PM, Michael Wilde wrote:
> Sounds great, Yi - thanks for the update, and David for the assistance.
>
> Yes, we can meet tomorrow. Is 3:30 - 4:30 OK?
>
> In the meantime, can you send me the sites.xml file you used for coasters, and point me to the directory that contains stdout/err and the swift .log file for the failing run? (Please send this to swift-devel with a description of what you did. That way Mihael and others can help as well.)
>
> Thanks,
>
> Mike
>
>
> ----- "Yi Zhu"<yizhu at cs.uchicago.edu>  wrote:
