[Swift-user] running jobs on cluster or cloud

Yadu Nand yadudoc1729 at gmail.com
Fri Aug 29 10:43:00 CDT 2014


Hi Justin,

Did you do the following steps:
export WORKER_LOCATION="/home/ubuntu"
export WORKER_HOSTS="<IP of machine 1> <IP of machine 2>"
export WORKER_USERNAME=ubuntu

and then run "source setup.sh"?
When you source setup.sh, you should get a sites.xml and a
start-coaster-service.log in your scs folder. Could you send us those?
The setup script should start a persistent coaster service, connect to
the nodes on Amazon, start workers, and generate a sites.xml file
that lets your Swift scripts run across the Amazon nodes. You
shouldn't have to make any changes to the sites.xml.
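
A quick way to confirm that both files were generated is sketched below. This is only an illustration: the function name check_scs_outputs is made up for this example and is not part of Swift or the tutorial, and it assumes you pass it the path to your scs folder (or run it from inside that folder).

```shell
# Hypothetical helper (not part of Swift): report which of the two files
# that setup.sh is expected to generate are present in a given folder.
check_scs_outputs() {
    dir="${1:-.}"      # folder to inspect; defaults to the current directory
    missing=0
    for f in sites.xml start-coaster-service.log; do
        if [ -f "$dir/$f" ]; then
            echo "found:   $dir/$f"
        else
            echo "missing: $dir/$f"
            missing=1
        fi
    done
    # Return 0 only when both files exist.
    return "$missing"
}

# Example, run from the scs folder after "source setup.sh":
#   check_scs_outputs . && tail -n 20 start-coaster-service.log
```

If either file is reported missing, the setup script most likely stopped before the coaster service came up, and its terminal output would be the next thing to look at.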

Alternatively, you could try the beta release of Swift, Swift 0.95 RC6,
with the new cloud mechanism:
https://github.com/swift-lang/swift-on-cloud/tree/master/aws

That will set you up with a headnode on AWS with a few worker nodes that
you define, with everything set up to run Swift.


Thanks,
Yadu


On Thu, Aug 28, 2014 at 6:57 PM, Justin bbt <justinbbt at gmail.com> wrote:

>
>
>
> Hi all,
>>
>> I could successfully run Swift on my local system.
>> Next, I want to use Swift to run some jobs on a cluster.
>>
>> I followed this tutorial. (I am using just a simple cluster, and I could
>> not even run the job on one remote node of the cluster.)
>> http://swift-lang.org/tutorials/cloud/tutorial.html
>>
>> But I get this when I run "swift p1.swift" or any other Swift script:
>>
>> Swift 0.94.1 swift-r7114 cog-r3803
>>
>> RunID: 20140828-1758-ea4phzag
>> Progress:  time: Thu, 28 Aug 2014 17:58:15 -0400
>> Progress:  time: Thu, 28 Aug 2014 17:58:24 -0400  Submitted:1
>> Execution failed:
>> Exception in simulate:
>>     Arguments: []
>>     Host: remotehost2
>>     Directory: p1-20140828-1758-ea4phzag/jobs/7/simulate-7k2fxlvl
>>
>> Caused by:
>> Job failed with an exit code of 127
>> simulation, p1.swift, line 9
>>
>>
>> --- this is my site.xml file setting
>>
>>    <pool handle="remotehost2">
>>       <execution provider="ssh" jobmanager="ssh:local"
>> url="myclusteturl"/>
>>       <filesystem provider="ssh" url="myclusteturl"/>
>>       <profile namespace="karajan" key="jobThrottle">0</profile>
>>       <profile namespace="karajan" key="initialScore">10000</profile>
>>       <workdirectory>/path/to/remote/workdirectory</workdirectory>
>>    </pool>
>>
>> --- if I use this one
>> <pool handle="persistent-coasters">
>>     <execution provider="coaster-persistent"
>>                url="myclusterurl"
>>                jobmanager="local:local"/>
>>     <profile namespace="globus" key="workerManager">passive</profile>
>>     <profile namespace="globus" key="jobsPerNode">1</profile>
>>     <profile key="jobThrottle" namespace="karajan">10</profile>
>>     <profile namespace="karajan" key="initialScore">10000</profile>
>>     <filesystem provider="local" url="none" />
>>     <workdirectory>.l</workdirectory>
>>   </pool>
>> --- then it loops back to my localhost and just keeps resubmitting the jobs
>>
>> 1. Is this a correct setting?
>> 2. Should I use coasters? I could not understand the descriptions in the
>> user guide and documentation of the coaster concepts and the required
>> settings. Is there a better tutorial that describes coasters?
>> 3. I plan to use Swift later on the cloud (Microsoft Azure). What
>> settings are required for that, in site.xml and in any other file?
>>
>> Thanks in Advance.
>>
>>



-- 
Yadu Nand B