[Swift-user] Why so few nodes ?

Xueyuan Zhou zhouxy at uchicago.edu
Tue Sep 23 11:30:04 CDT 2008


My sites.xml is:


  <pool handle="localhost" sysinfo="INTEL32::LINUX">
    <gridftp  url="local://localhost" storage="/var/tmp" major="2" />
    <execution provider="pbs" url="none" />
    <workdirectory >/home/zhouxy/swift/working</workdirectory>
    <profile namespace="globus" key="queue">fast</profile>
  </pool>


and



  <pool handle="teraport" >
    <gridftp  url="gsiftp://tp-grid1.ci.uchicago.edu" />
    <jobmanager universe="vanilla" url="tp-grid1.ci.uchicago.edu/jobmanager-pbs" major="2" />
    <workdirectory >/home/zhouxy/swift/working</workdirectory>
    <profile namespace="globus" key="queue">fast</profile>
  </pool>



I ran a longer job last night, in /home/zhouxy/dic_parser/swift_script3.
It has 10 times as many jobs as the run in
/home/zhouxy/dic_parser/swift_script2.

Still, apart from the very beginning and end of the run (when I get fewer
than 4 nodes), the most I can get is about 4 nodes. Since the nodes seem to
be dual core, that is 8 running jobs. I also noticed that there are some
free nodes. I am wondering why I can only have about 8 running jobs (on 4
nodes) and about 20 to 30 in the queue, while dozens of nodes are free.
In this case, the cluster is acting like a slightly faster single machine.


If I could have more running jobs, the "execute2 tasks, coloured by site"
graph would be steeper, I think. But Swift does not seem to do that.
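
To picture the ramp-up Ben describes in the quoted message below (each site
starts with two jobs and is given more as jobs succeed), here is a toy model.
It is only a sketch under assumed numbers: the function name, the growth rule
(one extra slot per successful job) and the cap are hypothetical and are not
Swift's actual scheduler.

```python
def simulate_ramp_up(total_jobs, start_limit=2, growth_per_success=1, max_limit=32):
    """Toy model of a per-site job limit that grows as jobs succeed.

    Returns the number of jobs running in each scheduling round.
    All parameters are illustrative assumptions, not Swift settings.
    """
    limit = start_limit
    remaining = total_jobs
    history = []
    while remaining > 0:
        # Run as many jobs as the current limit allows.
        running = min(limit, remaining)
        history.append(running)
        remaining -= running
        # Assume every job succeeded; each success earns extra slots,
        # up to the assumed cap.
        limit = min(max_limit, limit + growth_per_success * running)
    return history


if __name__ == "__main__":
    # With a cap of 8 (4 dual-core nodes), the run never exceeds 8 jobs.
    print(simulate_ramp_up(100, max_limit=8))
    # With a higher cap, later rounds run many more jobs at once.
    print(simulate_ramp_up(100, max_limit=32))
```

In this model a cap of 8 keeps every round after the first few at 8 jobs,
which is roughly what I see; a higher cap would let the later rounds use more
of the free nodes.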


Thanks.



----- Original Message ----- 
From: "Ben Clifford" <benc at hawaga.org.uk>
To: "Xueyuan Zhou" <zhouxy at uchicago.edu>
Cc: <swift-user at ci.uchicago.edu>
Sent: Tuesday, September 23, 2008 11:02 AM
Subject: Re: [Swift-user] Why so few nodes ?


>
> On Mon, 22 Sep 2008, Xueyuan Zhou wrote:
>
>> /home/zhouxy/dic_parser/swift_script2 has a successful complete log, with a
>> much smaller input file than
>> /home/zhouxy/dic_parser/swift_script
>
> I plotted that log file here:
>
> http://www.ci.uchicago.edu/~benc/tmp/report-test3-20080922-2040-h1383jdc/
>
> The log file seems to suggest that between 5 and 12 jobs are running at
> any one time according to the submit side after about 200 seconds into the
> run.
>
> To begin with, each site will only get two nodes at once; as your run
> progresses successfully on a site, that site will be given more nodes.
>
> Have a look at the graph labelled:
>
>    execute2 tasks, coloured by site
>
> and you can see how many jobs are sent to each site at once.
>
> You have two sites defined - localhost and teraport. The localhost
> definition looks a bit suspicious, though - the usual swift localhost
> definition never allows more than 2 jobs to run at once, but on your run
> you often have more than that.
>
> What does your sites.xml file look like?
>



