[Swift-devel] process hogging memory on ranger login

skenny at uchicago.edu skenny at uchicago.edu
Mon Nov 23 19:53:06 CST 2009


>On Mon, 2009-11-23 at 19:01 -0600, skenny at uchicago.edu wrote:
>> >On Fri, 2009-11-20 at 13:01 -0600, skenny at uchicago.edu wrote:
>> >> so, using the latest swift (swift-r3116 cog-r2482) i submitted
>> >> a 1,179,647-job workflow to ranger...got this far:
>> >> 
>> >> Progress:  Submitted:16378  Active:1  Finished successfully:412482
>> >> 
>> >> was in that state for a while (~24hrs) showing this in the queue:
>> >> 
>> >> 1143906   data       tg457040      Waiting 2944   01:10:00  Thu Nov 19 07:22:48
>> >
>> >Logs would help.
>> 
>> so, i was trying to re-run this workflow with the latest swift
>> (swift-r3191 cog-r2620) to try and replicate the error.
>> however, a new error has surfaced...the environment, as
>> specified in my tc.data file, is no longer being set by swift
>> on the remote end. is it possible this is due to recent
>> changes in swift? i am running the same workflow, same tc &
>> sites files with the newer swift and am getting errors (from
>> the app) due to my LD_LIBRARY_PATH not being set. if i switch
>> back to swift-r3116 cog-r2482, the error goes away.
>
>Yes. That is possible.
>
>I will check.
>
>Can you post your sites file?

sure, i tried on both ranger and mercury:

<pool handle="RANGER">
    <profile namespace="karajan" key="jobThrottle">1000.0</profile>
    <filesystem provider="coaster" url="gt2://gatekeeper.ranger.tacc.teragrid.org"/>
    <profile namespace="globus" key="queue">normal</profile>
    <profile namespace="globus" key="workersPerNode">32</profile>
    <profile namespace="globus" key="nodeGranularity">1</profile>
    <profile namespace="globus" key="slots">16</profile>
    <profile namespace="globus" key="maxNodes">256</profile>
    <profile namespace="globus" key="maxTime">72000</profile>
    <profile namespace="globus" key="project">TG-DBS080004N</profile>
    <execution provider="coaster" url="gatekeeper.ranger.tacc.teragrid.org" jobManager="gt2:gt2:SGE"/>
    <workdirectory>/work/00043/tg457040/sidgrid_out/{username}</workdirectory>
</pool>


<pool handle="NCSAMERCURY">
    <profile namespace="karajan" key="jobThrottle">1000.0</profile>
    <gridftp url="gsiftp://gridftp-hg.ncsa.teragrid.org"/>
    <profile namespace="globus" key="workersPerNode">2</profile>
    <profile namespace="globus" key="nodeGranularity">1</profile>
    <profile namespace="globus" key="slots">2</profile>
    <profile namespace="globus" key="maxNodes">200</profile>
    <profile namespace="globus" key="maxTime">72000</profile>
    <profile namespace="globus" key="project">TG-DBS080005N</profile>
    <execution provider="coaster" url="grid-hg.ncsa.teragrid.org" jobManager="gt2:gt2:PBS"/>
    <workdirectory>/usr/projects/tg-community/SIDGrid/sidgrid_out/{username}</workdirectory>
</pool>
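
note that neither pool above carries an env profile, so LD_LIBRARY_PATH
reaches the app only through tc.data. as a rough sketch of a possible
cross-check (an assumption, not something verified against swift-r3191):
the sites file also accepts profiles in the "env" namespace, so the
variable could be set per site as shown here. the library path is a
hypothetical placeholder, not the real setting:

```xml
<pool handle="RANGER">
    <!-- hypothetical path; substitute the directory the app's libraries live in -->
    <profile namespace="env" key="LD_LIBRARY_PATH">/path/to/app/lib</profile>
    <!-- remaining profile, filesystem, execution and workdirectory entries unchanged -->
</pool>
```

if the app then finds its libraries under the newer swift, that would
suggest the regression is in how tc.data env profiles are passed to the
remote end rather than in env handling as a whole.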



