[Swift-devel] Re: Another performance comparison of DOCK
Zhao Zhang
zhaozhang at uchicago.edu
Sun Apr 13 17:58:55 CDT 2008
Hi, Mike
It is just a typo in the email. In my property file, it is
"throttle.file.operations=2000". Thanks.
zhao
Michael Wilde wrote:
> >> If it's set right, any chance that Swift or Karajan is limiting it
> >> somewhere?
> > 2000 for sure,
> > throttle.submit=off
> > throttle.host.submit=off
> > throttle.score.job.factor=off
> > throttle.transfers=2000
> > throttle.file.operation=2000
>
>
> Looks like a typo in your properties, Zhao - if the text above came
> from your swift.properties directly:
>
> throttle.file.operation=2000
>
> vs. "operations" with an s, as per the properties doc:
>
> throttle.file.operations=8
> #throttle.file.operations=off
>
> Which doesn't explain why we're seeing 100 when the default is 8???
>
> - Mike
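
For reference, the corrected fragment would read as follows in
swift.properties (all values are the ones quoted in this thread, with
"operations" spelled with the s; "off" disables a throttle):

    throttle.submit=off
    throttle.host.submit=off
    throttle.score.job.factor=off
    throttle.transfers=2000
    throttle.file.operations=2000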
>
>
> On 4/13/08 3:39 PM, Zhao Zhang wrote:
>> Hi, Mike
>>
>> Michael Wilde wrote:
>>> Ben, your analysis sounds very good. Some notes below, including
>>> questions for Zhao.
>>>
>>> On 4/13/08 2:57 PM, Ben Clifford wrote:
>>>>
>>>>> Ben, can you point me to the graphs for this run? (Zhao's
>>>>> *99cy0z4g.log)
>>>>
>>>> http://www.ci.uchicago.edu/~benc/report-dock2-20080412-1609-99cy0z4g
>>>>
>>>>> Once stage-ins start to complete, are the corresponding jobs
>>>>> initiated quickly, or is Swift doing mostly stage-ins for some
>>>>> period?
>>>>
>>>> In the run dock2-20080412-1609-99cy0z4g, jobs are submitted (to
>>>> falkon) pretty much right as the corresponding stagein completes. I
>>>> have no deeper information about when the worker actually starts to
>>>> run.
>>>>
>>>>> Zhao indicated he saw data showing about a 700-second lag from
>>>>> workflow start time till the first Falkon jobs started, if I
>>>>> understood correctly. Do the graphs confirm this or say something
>>>>> different?
>>>>
>>>> There is a period of about 500s or so until stuff starts to happen;
>>>> I haven't looked at it. That is before stage-ins start too, though,
>>>> which means that I think this...
>>>>
>>>>> If the 700-second delay figure is true, and stage-in were
>>>>> eliminated by copying input files right to the /tmp workdir
>>>>> rather than first to /shared, then we'd have:
>>>>>
>>>>> 1190260 / ( 1290 * 2048 ) = .45 efficiency
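
Unpacking the arithmetic above: presumably 1190260 is the total useful
CPU-seconds across all jobs, 1290 s the makespan, and 2048 the number
of cores, giving

    efficiency = 1190260 / (1290 * 2048)
               = 1190260 / 2641920
              ~= 0.45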
>>>>
>>>> calculation is not meaningful.
>>>>
>>>> I have not looked at what is going on during that 500s startup
>>>> time, but I plan to.
>>>
>>> Zhao, what SVN rev is your Swift at? Ben fixed an N^2 mapper
>>> logging problem a few weeks ago. Could that cause such a delay, Ben?
>>> It would be very obvious in the swift log.
>> The version is Swift svn swift-r1780 cog-r1956
>>>
>>>>
>>>>> I assume we're paying the same staging price on the output side?
>>>>
>>>> not really - the output stageouts go very fast, and because job
>>>> endings are staggered, they don't happen all at once.
>>>>
>>>> This is the same with most of the large runs I've seen (of any
>>>> application) - stageout tends not to be a problem (or at least,
>>>> nowhere near the problems of stagein).
>>>>
>>>> All stageins happen over a period t=400 to t=1100 fairly smoothly.
>>>> There's still rate limiting on file operations (100 max) and file
>>>> transfers (2000 max), which is being hit.
>>>
>>> I thought Zhao set the file operations throttle to 2000 as well.
>>> Sounds like we can test with the latter set higher, and find out
>>> what's limiting the former.
>>>
>>> Zhao, what are your settings for property throttle.file.operations?
>>> I assume you have throttle.transfers set to 2000.
>>>
>>> If it's set right, any chance that Swift or Karajan is limiting it
>>> somewhere?
>> 2000 for sure,
>> throttle.submit=off
>> throttle.host.submit=off
>> throttle.score.job.factor=off
>> throttle.transfers=2000
>> throttle.file.operation=2000
>>>>
>>>> I think there are two directions to proceed in here that make sense
>>>> for actual use on single clusters running falkon (rather than
>>>> cutting stuff out randomly to push up numbers):
>>>>
>>>> i) use some of the data placement features in falkon, rather than
>>>>    Swift's relatively simple data management that was designed more
>>>>    for running on the grid.
>>>
>>> Long term: we should consider how the Coaster implementation could
>>> eventually do a similar data placement approach. In the meantime
>>> (mid-term), examining what interface changes are needed for Falkon
>>> data placement might help prepare for that. We need to discuss
>>> whether that would be a good step or not.
>>>
>>>>
>>>> ii) do stage-ins using symlinks rather than file copying. This
>>>>     makes sense when everything is living in a single filesystem,
>>>>     which again is not what Swift's data management was originally
>>>>     optimised for.
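
A minimal sketch of option ii, assuming stage-in runs on a filesystem
shared with the user's input data (all paths hypothetical):

    # current copy-based stage-in into the shared job directory:
    #   cp /home/zhao/dock/inputs/mol0001.in shared/dock-run1/mol0001.in
    # symlink-based stage-in: a single metadata operation, no data copied
    ln -s /home/zhao/dock/inputs/mol0001.in shared/dock-run1/mol0001.in

No file data moves until a job actually reads the input through the
link.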
>>>
>>> I assume you mean symlinks from shared/ back to the user's input files?
>>>
>>> That sounds worth testing: find out if symlink creation is fast on
>>> NFS and GPFS.
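
A rough way to measure that, assuming a scratch directory on the
filesystem under test (path hypothetical):

    # time 1000 symlink creations on the target filesystem
    cd /gpfs/scratch/zhao/linktest
    time sh -c 'i=0; while [ $i -lt 1000 ]; do ln -s /etc/hosts l$i; i=$((i+1)); done'

Running the same loop with cp of a typical input file in place of
ln -s would show how much the metadata-only approach saves.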
>>>
>>> Is another approach to copy directly from the user's files to the
>>> /tmp workdir (i.e. wrapper.sh pulls the data in)? Measurement will
>>> tell if symlinks alone get adequate performance. Symlinks do seem an
>>> easier first step.
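
The direct-copy variant would look roughly like this inside the
wrapper (directory layout hypothetical):

    # hypothetical wrapper.sh fragment: stage the input straight into
    # the node-local working directory, skipping the shared/ copy
    mkdir -p /tmp/zhao/dock-run1
    cp /home/zhao/dock/inputs/mol0001.in /tmp/zhao/dock-run1/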
>>>
>>>> I think option ii) is substantially easier to implement (on the
>>>> order of days) and is generally useful in the single-cluster,
>>>> local-source-data situation that appears to be what people want to
>>>> do for running on the BG/P and SiCortex (that is, pretty much
>>>> ignoring anything grid-like at all).
>>>
>>> Grid-like might mean having the wrapper pull data directly to the
>>> /tmp workdir - but that seems like a harder step, and would need
>>> measurement and prototyping of such code before attempting it. Data
>>> transfer clients that the wrapper script can count on might be an
>>> obstacle.
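
In the grid case, for example, the wrapper would need something like
the GridFTP command-line client available on every worker (URL and
paths hypothetical):

    # hypothetical pull of a remote input onto the compute node
    globus-url-copy gsiftp://gridftp.example.edu/home/zhao/mol0001.in \
        file:///tmp/zhao/dock-run1/mol0001.in

which assumes both a client install and valid credentials on the
worker - exactly the dependency flagged above as an obstacle.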
>>>
>>>>
>>>> Option i) is much harder (on the order of months), needing a very
>>>> different interface between Swift and Falkon than exists at the
>>>> moment.