[Swift-user] running swift k on multiple instances on EC2, using coaster-service

Ketan Maheshwari ketancmaheshwari at gmail.com
Tue Oct 23 18:01:15 CDT 2012


Hi Iman,

On the worker nodes, do you see worker.pl running? It must be running for
any work to happen on those nodes.
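
One way to run that check is to scan the process list on each worker node
for worker.pl. The following is only a minimal sketch, assuming a Linux node
where `ps -eo pid,args` is available. The helper names find_processes and
worker_running are hypothetical and are not part of Swift.

```python
import subprocess


def find_processes(ps_output, needle):
    """Return (pid, command line) pairs whose command line contains needle."""
    matches = []
    for line in ps_output.splitlines():
        parts = line.strip().split(None, 1)
        # Skip the header row and any malformed lines.
        if len(parts) != 2 or not parts[0].isdigit():
            continue
        pid, args = int(parts[0]), parts[1]
        if needle in args:
            matches.append((pid, args))
    return matches


def worker_running(needle="worker.pl"):
    """Return matching processes on this node, using `ps -eo pid,args`."""
    ps_output = subprocess.run(
        ["ps", "-eo", "pid,args"], capture_output=True, text=True, check=True
    ).stdout
    return find_processes(ps_output, needle)
```

Calling worker_running() on a worker node returns an empty list when no
worker.pl process is found, which would explain jobs staying in the
"Submitted" state.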

Another possibility is that the workers on the nodes cannot reach the service
at its 10.x.y.z IP address. If the service is running on an EC2 node, that
node will also have another IP address, which you could try by putting it in
the service URL in your sites file.
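
To test reachability from a worker node, you can try opening a TCP
connection to the address in the service URL. This is a minimal sketch using
only the Python standard library. The helper names endpoint_from_url and
can_connect are hypothetical. Which of the ports printed in the Swift output
the workers actually connect to is an assumption you should verify against
your own setup.

```python
import socket
from urllib.parse import urlsplit


def endpoint_from_url(url):
    """Split a URL such as http://host:port into a (host, port) pair."""
    parts = urlsplit(url)
    return parts.hostname, parts.port


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) succeeds in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, run can_connect(*endpoint_from_url(url)) on a worker node with
the URL from your sites file. A result of False suggests the workers cannot
see the service at that address, so the other IP address is worth trying.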


On Tue, Oct 23, 2012 at 6:46 PM, Iman Sadooghi <isadoogh at iit.edu> wrote:

> Hi everyone,
>
> I am trying to run a Montage application workflow with Swift on multiple
> instances of Amazon EC2.
> So far I have been able to set up a cluster and a PVFS file system shared
> among the nodes (using FUSE, so I have a POSIX interface on my *swift
> work directory*).
> I have tried running a simple hello.swift example on multiple nodes with
> the coaster service. The working directory is the shared folder (backed by
> PVFS). When I run the code using my own tc.data and sites.xml, this happens:
>
> (my command) ubuntu@ip-10-244-4-101:~/coaster$ swift -tc.file tc.data
> -sites.file sites.xml  ~/swift-0.93/examples/swift/tutorial/hello.swift
> (results:)
> Swift 0.93 swift-r5483 cog-r3339
>
> RunID: 20121023-2200-4d3knr72
> Progress:  time: Tue, 23 Oct 2012 22:00:50 +0000
> Find: http://10.244.4.101:1213
> Find:  keepalive(120), reconnect - http://10.244.4.101:1213
> Passive queue processor initialized. Callback URI is
> http://10.244.4.101:1212
> Progress:  time: Tue, 23 Oct 2012 22:01:20 +0000  Submitted:1
> Progress:  time: Tue, 23 Oct 2012 22:01:50 +0000  Submitted:1
> Progress:  time: Tue, 23 Oct 2012 22:02:20 +0000  Submitted:1
> Progress:  time: Tue, 23 Oct 2012 22:02:50 +0000  Submitted:1
> Progress:  time: Tue, 23 Oct 2012 22:03:20 +0000  Submitted:1
> Progress:  time: Tue, 23 Oct 2012 22:03:50 +0000  Submitted:1
> Progress:  time: Tue, 23 Oct 2012 22:04:20 +0000  Submitted:1
>
> and it keeps doing this forever, meaning that there is no answer from the
> worker nodes.
> When I checked the worker nodes, the working files had been created in the
> shared folder, and when I check the running applications, there is a Java
> application running, but nothing happens.
> I have also attached the log file of my hello.swift run in case you
> need to take a look at it.
> Should I consider using PBS or Condor? I have no idea how they
> work, though.
>
> I would appreciate it if anyone can help me with this. Thank you so much.
>
> Best,
> --
> Iman Sadooghi
> Illinois Institute of Technology (IIT)
> Data-Intensive Distributed Systems Laboratory
>
>
> _______________________________________________
> Swift-user mailing list
> Swift-user at ci.uchicago.edu
> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user
>



-- 
Ketan