[Swift-user] Remote SGE cluster
Igor Russo
igor.souza.russo at gmail.com
Mon May 4 07:51:18 CDT 2015
Hi Yadu,
Thanks again.
I tried your suggestion. I no longer get the previous error, but the
jobs aren't being submitted:
RunID: run001
Progress: Seg, 04 Mai 2015 09:32:54-0300
Progress: Seg, 04 Mai 2015 09:32:55-0300 Submitting:1
Progress: Seg, 04 Mai 2015 09:33:25-0300 Submitting:1
Progress: Seg, 04 Mai 2015 09:33:55-0300 Submitting:1
Progress: Seg, 04 Mai 2015 09:34:25-0300 Submitting:1
Progress: Seg, 04 Mai 2015 09:34:55-0300 Submitting:1
Progress: Seg, 04 Mai 2015 09:35:25-0300 Submitting:1
Progress: Seg, 04 Mai 2015 09:35:55-0300 Submitting:1
Progress: Seg, 04 Mai 2015 09:36:25-0300 Submitting:1
In the log file, I noticed the following errors:
2015-05-04 09:24:06,223-0300 INFO ServiceManager Service does not appear
to be registered with this manager
2015-05-04 09:24:06,223-0300 INFO ServiceManager Coaster service ended.
Reason: null
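
For reference, a minimal sketch for confirming that the two variables from Yadu's suggestion (quoted below) are actually visible in the shell that launches swift. The function name check_globus_env is hypothetical and not part of Swift, and the MIN,MAX format check is an assumption based only on the example value 50000,51000 in the quoted message:

```shell
# Hypothetical helper, not part of Swift: reports whether the two variables
# from the quoted advice are set in the current shell.
check_globus_env() {
    rc=0
    if [ -z "${GLOBUS_HOSTNAME:-}" ]; then
        echo "GLOBUS_HOSTNAME is not set"
        rc=1
    else
        echo "GLOBUS_HOSTNAME=${GLOBUS_HOSTNAME}"
    fi
    # Assumption: the range is two comma-separated port numbers, as in 50000,51000.
    case "${GLOBUS_TCP_PORT_RANGE:-}" in
        "")
            echo "GLOBUS_TCP_PORT_RANGE is not set"
            rc=1 ;;
        *[!0-9,]* | *,*,* | ,* | *,)
            echo "GLOBUS_TCP_PORT_RANGE should look like MIN,MAX"
            rc=1 ;;
        *,*)
            echo "GLOBUS_TCP_PORT_RANGE=${GLOBUS_TCP_PORT_RANGE}" ;;
        *)
            echo "GLOBUS_TCP_PORT_RANGE should look like MIN,MAX"
            rc=1 ;;
    esac
    return "$rc"
}
```

Calling check_globus_env in the same terminal, just before running swift, shows what the client will inherit. It only checks that the values are present and well-formed, not that the address is reachable from the cluster.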
Thanks,
Igor
2015-05-01 17:47 GMT-03:00 Yadu Nand Babuji <yadunand at uchicago.edu>:
> Hi Igor,
>
> The remote connection system requires that the local machine you run the
> Swift client on has a public IP address. It looks like Swift was not able
> to guess it and set it to http://igor-ubuntu:51251
>
> Could you retry running part04 after setting the variables below? Please
> make sure your environment has these variables set whenever you run Swift
> against remote systems:
> export GLOBUS_HOSTNAME=<PUBLIC_IP_OF_YOUR_MACHINE>
> export GLOBUS_TCP_PORT_RANGE=50000,51000
>
> Thanks,
> Yadu
>
>
> On 05/01/2015 02:29 PM, Igor Russo wrote:
>
> Hi Yadu,
>
> Thank you very much!
>
> I changed the config file with the data from my cluster.
>
> When executing the 4th part of the swift-tutorial, I'm getting the following
> error:
> "Failed to download bootstrap jar from ..."
>
>
>
> --------------------------------------------------------------------------------
>
> RunID: run031
> Progress: Sex, 01 Mai 2015 15:40:42-0300
> Progress: Sex, 01 Mai 2015 15:40:43-0300 Submitting:1
>
> Execution failed:
> Exception in sort:
> Arguments: [-n, unsorted.txt]
> Host: mmc
> Directory: p4-run031/jobs/s/sort-go28d68m
> exception @ swift-int-staging.k, line: 165
> Caused by:
> exception @ swift-int-staging.k, line: 160
> Caused by: null
> Caused by:
> org.globus.cog.abstraction.impl.common.task.TaskSubmissionException: Could
> not submit job
> Caused by:
> org.globus.cog.abstraction.impl.common.task.TaskSubmissionException: Could
> not start coaster service
> Caused by:
> org.globus.cog.abstraction.impl.common.task.TaskSubmissionException: Task
> ended before registration was received.
> Failed to download bootstrap jar from http://igor-ubuntu:51251
>
> k:assign @ swift.k, line: 174
> Caused by: Exception in sort:
> Arguments: [-n, unsorted.txt]
> Host: mmc
> Directory: p4-run031/jobs/s/sort-go28d68m
> exception @ swift-int-staging.k, line: 165
> Caused by:
> exception @ swift-int-staging.k, line: 160
> Caused by: null
> Caused by:
> org.globus.cog.abstraction.impl.common.task.TaskSubmissionException: Could
> not submit job
> Caused by:
> org.globus.cog.abstraction.impl.common.task.TaskSubmissionException: Could
> not start coaster service
> Caused by:
> org.globus.cog.abstraction.impl.common.task.TaskSubmissionException: Task
> ended before registration was received.
> Failed to download bootstrap jar from http://igor-ubuntu:51251
>
>
> --------------------------------------------------------------------------------
>
> Thanks,
> Igor
>
> 2015-05-01 13:47 GMT-03:00 Yadu Nand Babuji <yadunand at uchicago.edu>:
>
>> Hi Igor,
>>
>> Swift does support SGE clusters, and you can refer to the swift-tutorial
>> for sample code and configurations from this link:
>> https://github.com/swift-lang/swift-tutorial
>>
>> Here's a sample config from our test-suite for Godzilla, an SGE cluster
>> at UChicago:
>>
>> https://github.com/swift-lang/swift-k/blob/master/tests/sites/godzilla/swift.conf
>> You could modify and add this config to the swift.conf file in the
>> swift-tutorial to run
>> Swift on any machine and execute on a remote SGE cluster.
>>
>> SGE is a widely used resource manager and most sites have differences in
>> their setups that make each site unique. If you run into issues with the
>> default swift package, and could provide help in figuring out specifics of
>> your cluster, we will help you adapt the Swift SGE provider to support
>> your cluster.
>>
>> Thanks,
>> Yadu
>>
>>
>>
>> On 04/28/2015 05:09 PM, Igor Russo wrote:
>>
>> Hi All,
>>
>> Is it possible to use Swift with a remote SGE/OGE cluster?
>>
>> Regards,
>> Igor
>>
>>
>> _______________________________________________
>> Swift-user mailing list
>> Swift-user at ci.uchicago.edu
>> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user
>>
>