[Swift-devel] coasters won't start

Ketan Maheshwari ketancmaheshwari at gmail.com
Fri Oct 21 14:07:13 CDT 2011


Jon,

If you are going from a remote host to PADS via ssh, I am even more suspicious
that it is a firewall issue. I am no expert, but check the status of the ports
that the coaster service running on the remote host is using, and of the ports
that the workers are using to connect back to the service.
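
One way to test this from the machine that has to make the connection is a
plain TCP connect. This is only a minimal sketch, not anything coasters
provides: the function names are made up for illustration, and the port
numbers have to come from your own coaster log, since I don't know which
ports the service picked on your run.

```python
import socket


def port_is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout.

    A False result means the connection was refused, timed out, or the host
    could not be resolved; a firewall dropping packets usually shows up as a
    timeout.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def report_ports(host, ports, timeout=3.0):
    """Map each port in `ports` to True (reachable) or False (not reachable)."""
    return {port: port_is_reachable(host, port, timeout) for port in ports}
```

For example, report_ports("login.pads.ci.uchicago.edu", [50000, 50001]) would
try two placeholder ports on the PADS login node; replace them with the ports
the coaster service reports. Run it from the side where the workers run, since
they are the ones connecting back.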

How long have you been seeing this issue?

Ketan

On Fri, Oct 21, 2011 at 2:02 PM, Jonathan Monette <jonmon at mcs.anl.gov> wrote:

> Thanks.  That was the next item on my checklist, but I wasn't sure how to
> check it or how to specify a port range for coasters to use.  Does coasters
> use the GLOBUS_TCP_PORT_RANGE and the GLOBUS_SOURCE_PORT_RANGE environment
> variables for this?
>
> <config>
> <pool handle="localhost">
>    <execution provider="local" />
>    <filesystem provider="local" />
>    <workdirectory>/gpfs/pads/swift/jonmon/Swift/work/localhost</workdirectory>
>
>    <profile namespace="karajan" key="jobThrottle">.05</profile>
>
>    <profile namespace="env" key="SWIFT_GEN_SCRIPTS">KEEP</profile>
>  </pool>
>  <pool handle="pads">
>      <execution provider="coaster" jobmanager="ssh:pbs" url="login.pads.ci.uchicago.edu" />
>      <filesystem provider="local" />
>      <workdirectory>/gpfs/pads/swift/jonmon/Swift/work/pads</workdirectory>
>
>      <profile namespace="globus" key="project">CI-CCR000013</profile>
>      <profile namespace="globus" key="maxtime">3600</profile>
>      <profile namespace="globus" key="jobsPerNode">1</profile>
>      <!-- Max number of jobs for the fast queue on PADS => 192 -->
>      <profile namespace="globus" key="slots">192</profile>
>      <profile namespace="globus" key="nodeGranularity">1</profile>
>      <profile namespace="globus" key="maxNodes">1</profile>
>      <profile namespace="globus" key="queue">fast</profile>
>
>      <profile namespace="karajan" key="jobThrottle">5</profile>
>      <profile namespace="karajan" key="initialScore">10000</profile>
>
>      <profile namespace="env" key="SWIFT_GEN_SCRIPTS">KEEP</profile>
>  </pool>
>      <pool handle="beagle">
>          <execution provider="coaster" jobmanager="ssh:pbs" url="login.beagle.ci.uchicago.edu" />
>          <profile namespace="globus" key="project">CI-CCR000013</profile>
>          <filesystem provider="local" />
>          <workdirectory>/gpfs/pads/swift/jonmon/Swift/work/beagle</workdirectory>
>
>          <profile namespace="globus" key="ppn">24</profile>
>          <profile namespace="globus" key="providerAttributes">pbs.aprun;pbs.mpp;depth=24</profile>
>          <profile namespace="globus" key="jobsPerNode">24</profile>
>          <profile namespace="globus" key="maxTime">1000</profile>
>          <profile namespace="globus" key="slots">1</profile>
>          <profile namespace="globus" key="nodeGranularity">1</profile>
>          <profile namespace="globus" key="maxNodes">1</profile>
>
>          <profile namespace="karajan" key="jobThrottle">.63</profile>
>          <profile namespace="karajan" key="initialScore">10000</profile>
>
>          <profile namespace="env" key="SWIFT_GEN_SCRIPTS">KEEP</profile>
>      </pool>
>
> </config>
> On Oct 21, 2011, at 1:58 PM, Ketan Maheshwari wrote:
>
> Jon,
>
> There were some changes to the firewall rules regarding which ports are
> allowed to be open on various CI machines. It is a long shot, but maybe you
> want to check on that.
>
> Can you paste your sites.xml so I can take a look and see if I find something?
>
> Ketan
>
>
> On Fri, Oct 21, 2011 at 1:50 PM, Jonathan Monette <jonmon at mcs.anl.gov> wrote:
>
>> Anyone have a thought on this?  Not sure what is wrong.  I can't seem to
>> get coasters registered from PADS or Beagle.  The log also specifies a
>> FileNotFoundException when trying to transfer back the wrapper log.  Does
>> this have something to do with the problem?  I have been assuming that this
>> error was being thrown due to the coaster service not connecting.
>>
>> On Oct 20, 2011, at 3:04 PM, Jonathan Monette wrote:
>>
>> Here is a log saying that the coaster service isn't starting, at least
>> that is what the log is saying.  This is on PADS with automatic
>> coasters using 0.93RC3.
>> http://www.ci.uchicago.edu/~jonmon/logs/coasters_wont_start.log
>>
>> And here is the coaster log in zipped form
>> http://www.ci.uchicago.edu/~jonmon/logs/coasters.tar.gz
>>
>> All the files used for this run are located in
>> ~jonmon/PADS/Swift/SwiftMontage/m101_tutorial/run.0039 on the ci network.
>>
>>
>>
>> _______________________________________________
>> Swift-devel mailing list
>> Swift-devel at ci.uchicago.edu
>> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel
>>
>>
>
>
> --
> Ketan
>
>
>
>


-- 
Ketan