[Swift-user] throttle transfers and vdl:stagein graphs

Mihael Hategan hategan at mcs.anl.gov
Wed Dec 8 21:45:44 CST 2010


Throttling happens in the scheduler.

On Wed, 2010-12-08 at 10:44 -0600, Allan Espinosa wrote:
> I see this in doStagein:
> 
> 			uParallelFor(file, files
> 				provider := provider(file)
> 				srchost := hostname(file)
> 				srcdir := vdl:dirname(file)
> 				destdir := dircat(dir, reldirname(file))
> 				filename := basename(file)
> 				size := file:size("{srcdir}/{filename}", host=srchost, provider=provider)
> 
> 				policy := cdm:query(query=file)
> 				log(LOG:DEBUG, "CDM: {file} : {policy}")
> 
> 				doStageinFile(provider=provider, srchost=srchost, srcfile=filename,
> 						srcdir=srcdir, desthost=host, destdir=destdir, size=size, policy=policy)
> 			)
> 			log(LOG:INFO, "END jobid={jobid} - Staging in finished")
> 
> Does this mean that there is actually no throttling going on for
> doStageinFile()? That would make sense, since my 400k-job workflow has
> been stuck for 5 hours staging in 23k files.
> 
> -Allan
> 
> 
> 2010/11/8 Mihael Hategan <hategan at mcs.anl.gov>:
> > On Mon, 2010-11-08 at 20:50 -0600, Allan Espinosa wrote:
> >> Hi,
> >>
> >> In my workflow, I use the default throttle.transfers=4. But my
> >> dostagein-total plot indicates that there are 72 stagein events going
> >> on for around 90 seconds. Shouldn't there be a linear ramp-up or a
> >> saw-tooth pattern at the plateau because the transfers are
> >> throttled?
> >
> > Lies. And statistics.
> >
> > The plot indicates that a number of instances of a certain portion of
> > vdl-int are executing.
> >
> > If you look at that portion of vdl-int (i.e. between setprogress("Stage
> > in") and setprogress("Submitting")) there are a few things happening,
> > including directory creation.
> >
> > Essentially you are dealing with the following pattern:
> >
> > parallelFor(...
> >  a()
> >  throttle(4, b())
> >  c()
> > )
> >
> > The graph would show something like the parallelism in the invocation of
> > the body of parallelFor. And it is quite possible that all a()
> > invocations start well before any of the b() invocations start. The only
> > accurate way to see the effect of the throttle is to trace the b()
> > invocations, which you can probably do by looking at the status of file
> > transfer tasks (by enabling the relevant logging stuff).
> >
> > Mihael
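
To make the parallelFor/throttle pattern quoted above concrete, here is a rough Python sketch. It is not Swift or Karajan code and does not reflect how the scheduler actually applies throttle.transfers; simulate(), the barrier, and the sleep standing in for a transfer are all made up for illustration. The barrier forces the "all a() invocations start before any b()" case, which the real system may or may not hit.

```python
import threading
import time


def simulate(n_instances=72, transfer_limit=4, transfer_time=0.01):
    """Run n_instances copies of the body: a(); throttle(limit, b()); c().

    Returns the peak number of bodies in flight and the peak number of
    b() calls in flight. All names here are hypothetical.
    """
    throttle = threading.Semaphore(transfer_limit)   # stands in for throttle(4, ...)
    all_a_started = threading.Barrier(n_instances)   # assumption: every a() starts before any b()
    lock = threading.Lock()
    active = {"body": 0, "b": 0}
    peak = {"body": 0, "b": 0}

    def enter(name):
        with lock:
            active[name] += 1
            peak[name] = max(peak[name], active[name])

    def leave(name):
        with lock:
            active[name] -= 1

    def body():
        enter("body")                  # what a plot of the whole stage-in section counts
        all_a_started.wait()           # a(): unthrottled work, e.g. directory creation
        with throttle:                 # throttle(transfer_limit, b())
            enter("b")
            time.sleep(transfer_time)  # b(): the transfer itself
            leave("b")
        leave("body")                  # c() would run here, unthrottled

    threads = [threading.Thread(target=body) for _ in range(n_instances)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return peak


if __name__ == "__main__":
    print(simulate())
```

Counting bodies in flight gives a plateau at n_instances, while counting only b() never goes above transfer_limit. That is the difference between the dostagein-total plot and a trace of the transfer tasks themselves.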




