[Swift-user] running jobs on cluster or cloud

Justin bbt justinbbt at gmail.com
Sat Aug 30 00:28:30 CDT 2014


For cluster:

When I run start-coaster-service, I get the following output. It asks for
a password and then reports that permission is denied:

Start-coaster-service...
Configuration: /home/lenovo/swift-cloud-tutorial/scs/coaster-service.conf
Service address: localhost
Starting coaster-service
Service port: 52809
Local port: 58460
Generating sites.xml
username at ipadress's password:
username at ipadress's password:
Starting worker on username@
lenovo at lenovo-laptop:~/swift-cloud-tutorial/scs$username at ipadress's
password:
Permission denied, please try again.
username at ipadress's password:
Permission denied, please try again.
username at ipadress's password:
Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).

This happens even though I have created my keys with ssh-keygen. (The only
change I made was to create RSA keys rather than DSA keys, because my
cluster did not accept DSA.) With regular ssh I can connect using the RSA
key and my passphrase.
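
The repeated password prompts in the log above suggest that the ssh calls
made by the script are not authenticating with the key on their own. One
possible cause is that the passphrase-protected key is not loaded in an
ssh-agent, but that is an assumption, not something the log confirms. A
minimal sketch of a check that can be run before start-coaster-service
follows. check_worker_ssh and the SSH_CMD override are hypothetical names
introduced here for illustration and are not part of Swift.

```shell
# Hypothetical helper (not part of Swift): check that each worker host
# accepts key-based, non-interactive SSH logins.
# SSH_CMD is an assumed override hook so the loop can be dry-run.
check_worker_ssh() {
    user="$1"; shift
    failed=0
    for host in "$@"; do
        # BatchMode=yes makes ssh fail instead of prompting for a password
        if ${SSH_CMD:-ssh} -o BatchMode=yes -o ConnectTimeout=10 \
                "$user@$host" true; then
            echo "OK   $user@$host"
        else
            echo "FAIL $user@$host"
            failed=1
        fi
    done
    return "$failed"
}

# Possible usage, assuming the key is the default ~/.ssh/id_rsa:
#   eval "$(ssh-agent -s)"      # start an agent for this shell
#   ssh-add ~/.ssh/id_rsa       # enter the passphrase once
#   check_worker_ssh "$WORKER_USERNAME" $WORKER_HOSTS
```

If a host reports FAIL here, the service script would presumably hit the
same prompts, so it makes sense to fix that first.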

The sites.xml generated by this partial run of start-coaster-service is:

 <pool handle="persistent-coasters">
    <execution provider="coaster-persistent"
               url="http://localhost:37584"
               jobmanager="local:local"/>
    <profile namespace="globus" key="workerManager">passive</profile>
    <profile namespace="globus" key="jobsPerNode">1</profile>
    <profile key="jobThrottle" namespace="karajan">10</profile>
    <profile namespace="karajan" key="initialScore">10000</profile>
    <filesystem provider="local" url="none" />
    <workdirectory>.</workdirectory>
  </pool>

Using this XML, I just get a job submission every 30 seconds and no
finished jobs.
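
As far as I understand, workerManager set to passive means the coaster
service waits for workers to connect instead of launching them itself, so
jobs would stay queued if the SSH step that starts the workers fails. Also,
the port in this sites.xml (37584) differs from the ports printed in the
log above (52809 and 58460), which may mean the file comes from a different
run. A minimal sketch for pulling the service URL out of sites.xml so it
can be compared with the log follows. coaster_urls is a hypothetical helper
name, and the pattern assumes the provider attribute comes before url, as
it does in the file above.

```shell
# Hypothetical helper: print the url of every coaster-persistent
# execution entry in a sites.xml file.
# Assumes provider="..." appears before url="..." inside the tag.
coaster_urls() {
    # Join lines so attributes split across lines are still matched,
    # then put each <execution ...> element at the start of its own line.
    tr '\n' ' ' < "$1" \
        | sed 's/<execution/\
<execution/g' \
        | sed -n 's/^<execution[^>]*provider="coaster-persistent"[^>]*url="\([^"]*\)".*/\1/p'
}

# Possible usage:
#   coaster_urls sites.xml
```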


BTW, my cluster has one public IP, and each compute node has a
local/private IP.
In
 export WORKER_HOSTS="<IP of machine 1> <IP of machine 2>"
I currently set only the public IP address, and I am still not successful
even with this one node. How should I set the other IPs?
Does this mean that I have to install swift on the cluster?


I will look at the new release of swift for AWS.


Thanks,
J.





On Fri, Aug 29, 2014 at 11:43 AM, Yadu Nand <yadudoc1729 at gmail.com> wrote:

> Hi Justin,
>
> Did you do the following steps:
> export WORKER_LOCATION="/home/ubuntu"
> export WORKER_HOSTS="<IP of machine 1> <IP of machine 2>"
> export WORKER_USERNAME=ubuntu
>
> and then run "source setup.sh" ?
> When you source the setup.sh scripts you must've gotten a sites.xml and a
> start-coaster-service.log in your scs folder, could you send us those ?
> The setup script should start a persistent coaster service and connect to
> the nodes on amazon, start workers, and generate a sites.xml file
> that would let your swift scripts run across the amazon nodes. You
> shouldn't have to make changes to the sites.xml.
>
>  Alternatively, you could try using the beta release of swift, Swift 0.95
> RC6 with the new cloud mechanism:
> https://github.com/swift-lang/swift-on-cloud/tree/master/aws
>
> That will set you up with a headnode on AWS with a few worker nodes that
> you define, with everything setup to run swift.
>
>
> Thanks,
> Yadu
>>
>
> On Thu, Aug 28, 2014 at 6:57 PM, Justin bbt <justinbbt at gmail.com> wrote:
>
>>
>>
>>
>> Hi all,
>>>
>>>  I could successfully run swift on my local system.
>>> Next, I want to use the swift to run some jobs on a cluster.
>>>
>>> I followed this tutorial.  (I am using just a simple cluster- I even
>>> could not run the job on one remote node of the cluster)
>>> http://swift-lang.org/tutorials/cloud/tutorial.html
>>>
>>> But, I get this when I run swift p1.swift or other swift
>>>
>>> Swift 0.94.1 swift-r7114 cog-r3803
>>>
>>> RunID: 20140828-1758-ea4phzag
>>> Progress:  time: Thu, 28 Aug 2014 17:58:15 -0400
>>> Progress:  time: Thu, 28 Aug 2014 17:58:24 -0400  Submitted:1
>>> Execution failed:
>>> Exception in simulate:
>>>     Arguments: []
>>>     Host: remotehost2
>>>     Directory: p1-20140828-1758-ea4phzag/jobs/7/simulate-7k2fxlvl
>>>
>>> Caused by:
>>> Job failed with an exit code of 127
>>> simulation, p1.swift, line 9
>>>
>>>
>>> --- this is my site.xml file setting
>>>
>>>    <pool handle="remotehost2">
>>>       <execution provider="ssh" jobmanager="ssh:local"
>>> url="myclusteturl"/>
>>>       <filesystem provider="ssh" url="myclusteturl"/>
>>>       <profile namespace="karajan" key="jobThrottle">0</profile>
>>>       <profile namespace="karajan" key="initialScore">10000</profile>
>>>       <workdirectory>/path/to/remote/workdirectory</workdirectory>
>>>    </pool>
>>>
>>> --- if I use this one
>>> <pool handle="persistent-coasters">
>>>     <execution provider="coaster-persistent"
>>>                url="myclusterurl"
>>>                jobmanager="local:local"/>
>>>     <profile namespace="globus" key="workerManager">passive</profile>
>>>     <profile namespace="globus" key="jobsPerNode">1</profile>
>>>     <profile key="jobThrottle" namespace="karajan">10</profile>
>>>     <profile namespace="karajan" key="initialScore">10000</profile>
>>>     <filesystem provider="local" url="none" />
>>>     <workdirectory>.l</workdirectory>
>>>   </pool>
>>> --- then it loops to my localhost and just repeat submitting the jobs
>>>
>>> 1. Is this a correct setting?
>>> 2. Should I use coaster? I could not understand the description in user
>>> guides and documentation about the concepts of coaster and the required
>>> setting. Is there any better tutorial which would describe the coaster ?
>>> 3. I plan to use the swift later on the cloud (Microsoft Azure). What
>>> are the setting required for that? for site.xml and if any other file
>>>
>>>
>>> Thanks in Advance.
>>>
>>>
>>
>>
>
>
>
> --
> Yadu Nand B
>
>