[Swift-user] throttle transfers and vdl:stagein graphs

Allan Espinosa aespinosa at cs.uchicago.edu
Thu Dec 9 13:12:59 CST 2010


Ahhh,

I changed the throttling parameters and saw the behavior of
vdl:stageinfile change.

Thanks!
-Allan

2010/12/8 Mihael Hategan <hategan at mcs.anl.gov>:
> Throttling happens in the scheduler.
>
> On Wed, 2010-12-08 at 10:44 -0600, Allan Espinosa wrote:
>> I see this in doStagein:
>>
>>   uParallelFor(file, files
>>     provider := provider(file)
>>     srchost := hostname(file)
>>     srcdir := vdl:dirname(file)
>>     destdir := dircat(dir, reldirname(file))
>>     filename := basename(file)
>>     size := file:size("{srcdir}/{filename}", host=srchost, provider=provider)
>>
>>     policy := cdm:query(query=file)
>>     log(LOG:DEBUG, "CDM: {file} : {policy}")
>>
>>     doStageinFile(provider=provider, srchost=srchost, srcfile=filename,
>>       srcdir=srcdir, desthost=host, destdir=destdir, size=size, policy=policy)
>>   )
>>   log(LOG:INFO, "END jobid={jobid} - Staging in finished")
>>
>> Does this mean that there is actually no throttling going on for
>> doStageinFile()? That would make sense, since my 400k-job workflow has
>> been stuck for 5 hours staging in 23k files.
>>
>> -Allan
>>
>>
>> 2010/11/8 Mihael Hategan <hategan at mcs.anl.gov>:
>> > On Mon, 2010-11-08 at 20:50 -0600, Allan Espinosa wrote:
>> >> Hi,
>> >>
>> >> In my workflow, I use the default throttle.transfers=4. But my
>> >> dostagein-total plot indicates that there are 72 stage-in events going
>> >> on for around 90 seconds. Shouldn't there be a linear ramp-up or a
>> >> sawtooth pattern at the plateau because the transfers are
>> >> throttled?
>> >
>> > Lies. And statistics.
>> >
>> > The plot indicates how many instances of a certain portion of
>> > vdl-int are executing.
>> >
>> > If you look at that portion of vdl-int (i.e. between setprogress("Stage
>> > in") and setprogress("Submitting")) there are a few things happening,
>> > including directory creation.
>> >
>> > Essentially you are dealing with the following pattern:
>> >
>> > parallelFor(...
>> >  a()
>> >  throttle(4, b())
>> >  c()
>> > )
>> >
>> > The graph would show something like the parallelism in the invocation of
>> > the body of parallelFor. And it is quite possible that all a()
>> > invocations start well before any of the b() invocations start. The only
>> > accurate way to see the effect of the throttle is to trace the b()
>> > invocations, which you can probably do by looking at the status of file
>> > transfer tasks (by enabling the relevant logging stuff).
>> >
>> > Mihael
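
The parallelFor/throttle pattern Mihael describes can be illustrated with
a small simulation. This is only a sketch in Python with asyncio, not
Karajan and not Swift's actual scheduler: a(), b() and c() are the
placeholder steps from his message, the semaphore stands in for
throttle(4, ...), and the counts (72 iterations, limit 4) are taken from
the numbers mentioned in the thread. It records two peaks: how many loop
bodies are in flight at once (roughly what the stage-in plot counts) and
how many b() calls are running at once (what the throttle actually limits).

```python
import asyncio


async def simulate(n_items=72, limit=4):
    """Return (peak concurrent loop bodies, peak concurrent b() calls)."""
    sem = asyncio.Semaphore(limit)  # stand-in for throttle(limit, b())
    active = {"body": 0, "b": 0}
    peak = {"body": 0, "b": 0}

    def enter(name):
        active[name] += 1
        peak[name] = max(peak[name], active[name])

    def leave(name):
        active[name] -= 1

    async def a():
        # Unthrottled work, e.g. directory creation.
        await asyncio.sleep(0)

    async def b():
        # Throttled work, e.g. the file transfer itself.
        async with sem:
            enter("b")
            await asyncio.sleep(0.001)
            leave("b")

    async def c():
        # Unthrottled follow-up work.
        await asyncio.sleep(0)

    async def body():
        # One iteration of the parallel loop: a(), throttled b(), c().
        enter("body")
        await a()
        await b()
        await c()
        leave("body")

    # Start every iteration at once, like parallelFor over all files.
    await asyncio.gather(*(body() for _ in range(n_items)))
    return peak["body"], peak["b"]


if __name__ == "__main__":
    body_peak, b_peak = asyncio.run(simulate())
    print("peak loop bodies in flight:", body_peak)
    print("peak throttled b() calls:", b_peak)
```

In this sketch every iteration enters the loop body before any b() call
finishes, so the body count climbs to the number of files while the b()
count never exceeds the limit. That is why a plot of the enclosing section
can show a plateau far above throttle.transfers even though the transfers
themselves are throttled.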
