[Swift-user] Tuning parameters of coaster execution

Andriy Fedorov fedorov at bwh.harvard.edu
Tue Jan 19 12:01:15 CST 2010


Mihael,

The script is very simple:

iterate cnt {
  doStuff
} until (cnt<100);

I thought this was a parallel construct. Was I wrong?

--
Andriy Fedorov, Ph.D.

Research Fellow
Brigham and Women's Hospital
Harvard Medical School
75 Francis Street
Boston, MA 02115 USA
fedorov at bwh.harvard.edu



On Tue, Jan 19, 2010 at 12:43, Mihael Hategan <hategan at mcs.anl.gov> wrote:
> On Tue, 2010-01-19 at 10:38 -0500, Andriy Fedorov wrote:
>> Hi Mihael,
>>
>> I've been playing with this following your suggestions, but I can't
>> get it to work.
>>
>> Here's my site description:
>>
>> <pool handle="Abe-GT2-coasters">
>>   <gridftp  url="local://localhost" />
>>   <execution provider="coaster" jobmanager="gt2:gt2:pbs"
>> url="grid-abe.ncsa.teragrid.org"/>
>>   <workdirectory>/u/ac/fedorov/scratch-global/scratch</workdirectory>
>>   <profile namespace="karajan" key="jobThrottle">2.55</profile>
>>   <profile namespace="karajan" key="initialScore">10000</profile>
>>   <profile namespace="globus" key="nodeGranularity">10</profile>
>>   <profile namespace="globus" key="remoteMonitorEnabled">false</profile>
>>   <profile namespace="globus" key="parallelism">0.1</profile>
>>   <profile namespace="globus" key="workersPerNode">2</profile>
>>   <profile namespace="globus" key="highOverallocation">10</profile>
>> </pool>
>>
>> My maxWalltime for the job is 2, and I have 100 of them. When I run
>> the script, I see one job in the queue, with 10 nodes and 22 minutes
>> walltime. However, when the script is executing, it appears the jobs
>> are being scheduled one at a time. I have the current checkout of the
>> cog/swift trunk: Swift svn swift-r3202 cog-r2682. I attach the
>> coaster.log file for your reference.
>
> It doesn't look like Swift is sending more than one job at a time. It
> may be helpful to understand what the swift part is doing (i.e. swift
> log, the swift script, etc.).
>
>>
>> Can you help me understand what I am doing wrong?
>>
>> Also, I was trying to look in the code that does allocation, and it
>> seems that the code responsible for determining the block size for
>> allocation is in
>> modules/provider-coaster/src/org/globus/cog/abstraction/coaster/service/job/manager/BlockQueueProcessor.java.
>> Is this correct? And what is the piece of code that decides how to
>> schedule jobs within the allocated block?
>
> Each worker will slurp jobs fitting (walltime <
> worker_remaining_walltime) from the coaster queue if that's not empty.
> So there isn't much in the way of scheduling at that point.
>
>
>
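The worker behaviour described in the quoted reply (a worker takes any queued job whose walltime is less than the worker's remaining walltime) can be modelled roughly as below. This is only an illustrative Python sketch, not the coaster implementation: the names Job, Worker and pull_jobs are hypothetical, and the assumptions that a worker runs its jobs one after another and skips jobs that do not fit are mine.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Job:
    name: str
    walltime: int  # requested walltime, in minutes


@dataclass
class Worker:
    name: str
    remaining: int  # walltime left in the allocated block, in minutes
    ran: list = field(default_factory=list)


def pull_jobs(worker, queue):
    """Let one worker take every queued job that fits, in queue order.

    A job fits when job.walltime < worker.remaining (strict, as in the
    quoted description). Jobs that do not fit are put back on the queue.
    """
    skipped = deque()
    while queue:
        job = queue.popleft()
        if job.walltime < worker.remaining:
            worker.ran.append(job.name)
            # Assumption: jobs run sequentially on the worker, so each
            # one uses up part of the remaining walltime.
            worker.remaining -= job.walltime
        else:
            skipped.append(job)
    queue.extend(skipped)


if __name__ == "__main__":
    queue = deque([Job("a", 2), Job("b", 30), Job("c", 2)])
    worker = Worker("w0", remaining=22)
    pull_jobs(worker, queue)
    print(worker.ran, worker.remaining, [j.name for j in queue])
```

In this model the worker only ever sees what is already in the queue, so if the submitting side hands over one job at a time, the worker has nothing more to pick up, however many nodes the block has.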


