[Swift-devel] submitting jobs to the queue

Mihael Hategan hategan at mcs.anl.gov
Fri Mar 9 11:41:42 CST 2007


On Fri, 2007-03-09 at 11:40 -0600, Veronika V. Nefedova wrote:
> Oh, actually I am curious (-; Is it possible to specify somewhere the
> GridFTP parameters (like the number of parallel streams)

Not at this time. There is some preliminary support for tgcp files that
can be used to set buffer sizes.

> to increase the
> transfer rates? Or does Swift decide on that based on the file's size?

It doesn't. It uses normal transfers.

> Also,
> when Swift is staging files in/out, does it use concurrent transfers
> (all files at once), or is each file transferred separately?

The transfers happen concurrently, but there are throttles (look at
swift.properties) that limit the number of concurrent transfers
globally.
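
To illustrate what "concurrent, but throttled globally" means, here is a
minimal sketch in Python. It is not Swift's implementation: the names
run_throttled, fake_transfer and the limit value are hypothetical, and the
"transfer" is simulated with a short sleep. All transfers are started
together, but a single semaphore caps how many are in flight at once.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_throttled(files, transfer, limit):
    """Start every transfer concurrently, but let at most `limit` run at once.

    Returns the list of results and the peak number of simultaneous transfers.
    """
    gate = threading.Semaphore(limit)      # the global throttle (hypothetical)
    lock = threading.Lock()                # protects the counters below
    state = {"active": 0, "peak": 0}

    def worker(name):
        with gate:                         # blocks while `limit` transfers are active
            with lock:
                state["active"] += 1
                state["peak"] = max(state["peak"], state["active"])
            try:
                return transfer(name)
            finally:
                with lock:
                    state["active"] -= 1

    # One thread per file: everything is submitted at once, the semaphore
    # alone decides how many actually proceed.
    with ThreadPoolExecutor(max_workers=max(1, len(files))) as pool:
        results = list(pool.map(worker, files))
    return results, state["peak"]


def fake_transfer(name):
    """Stand-in for a real file transfer (assumption: takes about 10 ms)."""
    time.sleep(0.01)
    return name + ":done"


if __name__ == "__main__":
    names = ["file%d" % i for i in range(10)]
    done, peak = run_throttled(names, fake_transfer, limit=3)
    print(len(done), "transfers finished, peak concurrency", peak)
```

Raising the limit in this sketch plays the same role as raising a throttle
value in swift.properties: more transfers overlap, at the cost of more load
on the server.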

> 
> Nika
> 
> At 11:27 AM 3/9/2007, Yong Zhao wrote:
> >I have been thinking that the system should be smarter in dealing with
> >such issues, without relying too much on a user's manual intervention. For
> >job submission rate, or transfer rate, if we observe abnormality, for
> >instance: ftp errors due to high transfer rate, the system should be able
> >to slow down automatically. I am not quite sure about how to detect that
> >jobs go through quickly to a scheduler, but if that is the case, the
> >submission rate should be increased automatically.
> >
> >Yong.
> >
> >On Fri, 9 Mar 2007, Tiberiu Stef-Praun wrote:
> >
> > > Knob means "while in progress"
> > > Is that doable? (Probably extending your rudimentary debugger would
> > > do it.)
> > > How about the following extension: can we easily create hooks
> > > (web services) into a running Swift engine that would allow this
> > > manipulation with an external client (the knob driver)?
> > > Having more interactivity with a running workflow is something that
> > > might be appealing for long-running or never-ending workflows, and
> > > would differentiate us from others in a nice way. You would not
> > > believe how many people are working on workflows: everybody and their
> > > brother at the OSG meeting had some offering labeled "workflow". (I'm
> > > exaggerating a bit here)
> > >
> > > Tibi
> > >
> > > On 3/9/07, Mihael Hategan <hategan at mcs.anl.gov> wrote:
> > > > Yes, although we need to come up with a nicer way to do it.
> > > > In libexec/scheduler.xml, change <property name="jobThrottle"
> > > > value="4"/> to value="large number" (not literally).
> > > >
> > > > Mihael
> > > >
> > > > On Fri, 2007-03-09 at 11:06 -0600, Veronika V. Nefedova wrote:
> > > > > Hi, Mihael:
> > > > >
> > > > > > Is it possible to remove this feature in the one-site case? For
> > > > > > example, the queue is now almost empty on TG, but I have to wait
> > > > > > 1.5 hours for the rest of my jobs to be submitted (that's the
> > > > > > average running time of my job), and the queue might be full by
> > > > > > that time...
> > > > >
> > > > > Nika
> > > > >
> > > > > At 04:36 PM 3/7/2007, Mihael Hategan wrote:
> > > > > >On Wed, 2007-03-07 at 16:30 -0600, Veronika V. Nefedova wrote:
> > > > > > > Hi,
> > > > > > >
> > > > > > > > I've noticed one very strange behavior. For example, I have 68
> > > > > > > > jobs to be submitted to the remote host simultaneously. Swift
> > > > > > > > submits just 26 jobs at first. I checked that several times;
> > > > > > > > it's always 26 jobs. Then, when at least one job out of those
> > > > > > > > 26 has finished, Swift goes ahead and submits the rest (all of
> > > > > > > > those left, 42 in my case).
> > > > > > > > Is it a bug or a feature?
> > > > > >
> > > > > > >Feature. Although it should probably be toned down in the
> > > > > > >one-site case.
> > > > > >Each site has a score that changes based on how it behaves. If a site
> > > > > >completes jobs ok, it gets a higher score in time. If jobs fail on it,
> > > > > >it gets a lower score.
> > > > > >
> > > > > > >Now, let's consider the following scenario: 2 sites, one fast, one
> > > > > > >slow. With no scores and no limitations, half of the jobs would go
> > > > > > >to one and half to the other. The workflow finishes only when the
> > > > > > >slow site finishes its half of the jobs.
> > > > > >What happens however, is that Swift limits the number of initial jobs,
> > > > > >and does "probing". This allows it to infer some stuff about the sites
> > > > > >by the time it gets to submit lots of jobs. It should yield better
> > > > > >performance on larger workflows with imbalanced sites, which is, I'm
> > > > > >guessing, our main scenario.
> > > > > >
> > > > > > >
> > > > > > > Nika
> > > > > > >
> > > > > > > _______________________________________________
> > > > > > > Swift-devel mailing list
> > > > > > > Swift-devel at ci.uchicago.edu
> > > > > > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel
> > > > > > >
> > > > >
> > > > >
> > > >
> > > >
> > >
> > >
> > > --
> > > Tiberiu (Tibi) Stef-Praun, PhD
> > > Research Staff, Computation Institute
> > > 5640 S. Ellis Ave, #405
> > > University of Chicago
> > > http://www-unix.mcs.anl.gov/~tiberius/
> > >
> 
> 
> 
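
For reference, the score-based probing described in the quoted 2007-03-07
message can be sketched roughly as below. This is only an illustration under
stated assumptions, not the actual scheduler code: the Site class, the
pick_site function, the starting score of 0, the +1/-1 score steps and the
"base + factor * score" limit are all made up for the example. The idea it
shows is the one from the thread: each site starts with a small job
allowance, successes raise its score (and its allowance), failures lower it.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    score: float = 0.0     # hypothetical starting score
    running: int = 0       # jobs currently submitted to this site

    def limit(self, base=2, factor=4):
        # Hypothetical rule: a small probing allowance plus room that
        # grows with the score; never less than one job.
        return max(1, int(base + factor * self.score))

    def can_submit(self):
        return self.running < self.limit()

    def submit(self):
        self.running += 1

    def finished(self, ok):
        # A completed job frees a slot and moves the score up or down.
        self.running -= 1
        self.score += 1.0 if ok else -1.0


def pick_site(sites):
    """Return the highest-scoring site that still has room, or None."""
    free = [s for s in sites if s.can_submit()]
    return max(free, key=lambda s: s.score) if free else None


if __name__ == "__main__":
    fast, slow = Site("fast"), Site("slow")
    # Probing phase: each site only accepts its small initial allowance.
    for site in (fast, slow):
        while site.can_submit():
            site.submit()
    print("room left after probing:", pick_site([fast, slow]))
    # The fast site completes a job first, so it earns more room.
    fast.finished(ok=True)
    print("next job goes to:", pick_site([fast, slow]).name)
```

With a single site, the same rule produces the behavior reported in the
thread: an initial batch is submitted, and the remaining jobs are released
once the first completion raises the score. Raising jobThrottle in
libexec/scheduler.xml, as suggested in the quoted text, is the current way to
loosen that initial limit.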