[Swift-devel] Re: Another performance comparison of DOCK
Ioan Raicu
iraicu at cs.uchicago.edu
Sun Apr 13 18:22:51 CDT 2008
We are not using GridFTP on the BG/P, where this test was done. Files
are already on GPFS, so the stageins are probably just cp (or ln -s)
from one place to another on GPFS. Is your suggestion still to set that
2000 back down to 100?
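
To be concrete about what those file operations are - with made-up
paths, each stage-in on GPFS is roughly one of:

   # copy-based stage-in: the data actually moves within GPFS
   cp /gpfs/home/user/input/mol0001.in /gpfs/work/shared/mol0001.in

   # symlink-based stage-in: metadata only, no data movement
   ln -s /gpfs/home/user/input/mol0001.in /gpfs/work/shared/mol0001.in
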
Ioan
Mihael Hategan wrote:
> Then my guess is that the system itself (swift + server + FS) cannot
> sustain a much higher rate than 100 things per second. In principle
> setting those throttles to 2000 pretty much means that you're trying to
> start 2000 gridftp connections and hence 2000 gridftp processes on the
> server.
>
> On Sun, 2008-04-13 at 17:58 -0500, Zhao Zhang wrote:
>
>> Hi, Mike
>>
>> It is just a typo in the email. In my property file, it is
>> "throttle.file.operations=2000". Thanks.
>>
>> zhao
>>
>> Michael Wilde wrote:
>>
>>>>> If it's set right, any chance that Swift or Karajan is limiting it
>>>>> somewhere?
>>>>>
>>>> 2000 for sure,
>>>> throttle.submit=off
>>>> throttle.host.submit=off
>>>> throttle.score.job.factor=off
>>>> throttle.transfers=2000
>>>> throttle.file.operation=2000
>>>>
>>> Looks like a typo in your properties, Zhao - if the text above came
>>> from your swift.properties directly:
>>>
>>> throttle.file.operation=2000
>>>
>>> vs operations with an s as per the properties doc:
>>>
>>> throttle.file.operations=8
>>> #throttle.file.operations=off
>>>
>>> Which doesn't explain why we're seeing 100 when the default is 8 ???
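>>>
>>> If so, the fix is just the plural spelling - for the next run the
>>> line should read (value per Zhao's intent, not the doc default):
>>>
>>> throttle.file.operations=2000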
>>>
>>> - Mike
>>>
>>>
>>> On 4/13/08 3:39 PM, Zhao Zhang wrote:
>>>
>>>> Hi, Mike
>>>>
>>>> Michael Wilde wrote:
>>>>
>>>>> Ben, your analysis sounds very good. Some notes below, including
>>>>> questions for Zhao.
>>>>>
>>>>> On 4/13/08 2:57 PM, Ben Clifford wrote:
>>>>>
>>>>>>> Ben, can you point me to the graphs for this run? (Zhao's
>>>>>>> *99cy0z4g.log)
>>>>>>>
>>>>>> http://www.ci.uchicago.edu/~benc/report-dock2-20080412-1609-99cy0z4g
>>>>>>
>>>>>>
>>>>>>> Once stage-ins start to complete, are the corresponding jobs
>>>>>>> initiated quickly, or is Swift doing mostly stage-ins for some
>>>>>>> period?
>>>>>>>
>>>>>> In the run dock2-20080412-1609-99cy0z4g, jobs are submitted (to
>>>>>> falkon) pretty much right as the corresponding stagein completes. I
>>>>>> have no deeper information about when the worker actually starts to
>>>>>> run.
>>>>>>
>>>>>>
>>>>>>> Zhao indicated he saw data showing about a 700-second lag from
>>>>>>> workflow start time until the first Falkon jobs started, if I
>>>>>>> understood correctly. Do the graphs confirm this or say
>>>>>>> something different?
>>>>>>>
>>>>>> There is a period of about 500s until stuff starts to happen; I
>>>>>> haven't looked at it yet. That is before stage-ins start too,
>>>>>> though, which means that I think this...
>>>>>>
>>>>>>
>>>>>>> If the 700-second delay figure is true, and stage-in was
>>>>>>> eliminated by copying input files right to the /tmp workdir
>>>>>>> rather than first to /shared, then we'd have:
>>>>>>>
>>>>>>> 1190260 / ( 1290 * 2048 ) = .45 efficiency
>>>>>>>
>>>>>> calculation is not meaningful.
>>>>>>
>>>>>> I have not looked at what is going on during that 500s startup
>>>>>> time, but I plan to.
>>>>>>
>>>>> Zhao, what SVN rev is your Swift at? Ben fixed an N^2 mapper
>>>>> logging problem a few weeks ago. Could that cause such a delay, Ben?
>>>>> It would be very obvious in the swift log.
>>>>>
>>>> The version is Swift svn swift-r1780 cog-r1956
>>>>
>>>>>>> I assume we're paying the same staging price on the output side?
>>>>>>>
>>>>>> Not really - the output stage-outs go very fast, and because job
>>>>>> endings are staggered, they don't happen all at once.
>>>>>>
>>>>>> This is the same with most of the large runs I've seen (of any
>>>>>> application) - stage-out tends not to be a problem (or at least,
>>>>>> nowhere near the problems of stage-in).
>>>>>>
>>>>>> All stage-ins happen over a period t=400 to t=1100 fairly smoothly.
>>>>>> Rate limiting on file operations (100 max) and file transfers
>>>>>> (2000 max) is still being hit.
>>>>>>
>>>>> I thought Zhao set the file-operations throttle to 2000 as well.
>>>>> Sounds like we can test with the transfers throttle set higher, and
>>>>> find out what's limiting the file operations.
>>>>>
>>>>> Zhao, what are your settings for property throttle.file.operations?
>>>>> I assume you have throttle.transfers set to 2000.
>>>>>
>>>>> If it's set right, any chance that Swift or Karajan is limiting it
>>>>> somewhere?
>>>>>
>>>> 2000 for sure,
>>>> throttle.submit=off
>>>> throttle.host.submit=off
>>>> throttle.score.job.factor=off
>>>> throttle.transfers=2000
>>>> throttle.file.operation=2000
>>>>
>>>>>> I think there are two directions to proceed in here that make sense
>>>>>> for actual use on single clusters running falkon (rather than
>>>>>> trying to cut out stuff randomly to push up numbers):
>>>>>>
>>>>>> i) use some of the data placement features in falkon, rather than
>>>>>> Swift's relatively simple data management that was designed more
>>>>>> for running on the grid.
>>>>>>
>>>>> Long term: we should consider how the Coaster implementation could
>>>>> eventually take a similar data-placement approach. In the meantime
>>>>> (mid-term), examining what interface changes are needed for Falkon
>>>>> data placement might help prepare for that. We need to discuss
>>>>> whether that would be a good step or not.
>>>>>
>>>>>
>>>>>> ii) do stage-ins using symlinks rather than file copying. This makes
>>>>>> sense when everything lives in a single filesystem, which again
>>>>>> is not what Swift's data management was originally optimised for.
>>>>>>
>>>>> I assume you mean symlinks from shared/ back to the user's input files?
>>>>>
>>>>> That sounds worth testing: find out if symlink creation is fast on
>>>>> NFS and GPFS.
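>>>>>
>>>>> A quick check might look something like this (a sketch; the paths
>>>>> and count are made up):
>>>>>
>>>>>   # time creating 10,000 symlinks on the filesystem under test
>>>>>   mkdir -p /gpfs/scratch/linktest && cd /gpfs/scratch/linktest
>>>>>   time for i in $(seq 1 10000); do ln -s /gpfs/input/sample.dat l$i; done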
>>>>>
>>>>> Another approach would be to copy directly from the user's files to
>>>>> the /tmp workdir (i.e. wrapper.sh pulls the data in). Measurement
>>>>> will tell whether symlinks alone give adequate performance; symlinks
>>>>> do seem the easier first step.
>>>>>
>>>>>
>>>>>> I think option ii) is substantially easier to implement (on the
>>>>>> order of days) and is generally useful in the single-cluster,
>>>>>> local-source-data situation that appears to be what people want to
>>>>>> do for running on the BG/P and SiCortex (that is, pretty much
>>>>>> ignoring anything grid-like at all).
>>>>>>
>>>>> Grid-like might mean having the wrapper pull data directly to the
>>>>> /tmp workdir - but that seems like a harder step, and would need
>>>>> measurement and prototyping of such code before we attempt it. Data
>>>>> transfer clients that the wrapper script can count on might be an
>>>>> obstacle.
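>>>>>
>>>>> For the single-filesystem case, though, the wrapper-side pull itself
>>>>> would be simple - a sketch, assuming a hypothetical INPUT_FILES list
>>>>> that wrapper.sh does not actually define today:
>>>>>
>>>>>   # run inside the job's /tmp workdir: pull inputs straight from the
>>>>>   # user's files, skipping the intermediate copy into shared/
>>>>>   for f in $INPUT_FILES; do
>>>>>       cp "$f" . || exit 1
>>>>>   done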
>>>>>
>>>>>
>>>>>> Option i) is much harder (on the order of months), needing a very
>>>>>> different interface between Swift and Falkon than exists at the
>>>>>> moment.
>>>>>>
>
--
===================================================
Ioan Raicu
Ph.D. Candidate
===================================================
Distributed Systems Laboratory
Computer Science Department
University of Chicago
1100 E. 58th Street, Ryerson Hall
Chicago, IL 60637
===================================================
Email: iraicu at cs.uchicago.edu
Web: http://www.cs.uchicago.edu/~iraicu
http://dev.globus.org/wiki/Incubator/Falkon
http://dsl-wiki.cs.uchicago.edu/index.php/Main_Page
===================================================