[Swift-devel] Swift and BGP plots

Ioan Raicu iraicu at cs.uchicago.edu
Tue Oct 27 00:41:52 CDT 2009


Where can I get the Swift logs for these runs? I'd like to pass them 
through my Swift-to-Falkon conversion tool :)

Ioan

Mihael Hategan wrote:
> On Mon, 2009-10-26 at 23:19 -0500, Ioan Raicu wrote:
>   
>> Mihael Hategan wrote: 
>>     
>>> On Mon, 2009-10-26 at 16:36 -0500, Ioan Raicu wrote:
>>> [...]
>>>       
> [...]
>   
>>>   
>>>       
>> OK. So which plot/data should I be looking at to get the summary of
>> the per-task performance?
>>     
>
> None. You calculate the efficiency based on the total time, individual
> task time and number of nodes.
>
>   
>>>>> 64k jobs, 4000 workers:
>>>>> http://www.mcs.anl.gov/~hategan/report-dc-4000/
>>>>>   
>>>>>       
>>>>>           
>>>> Shortest event (s): 106.119999885559
>>>> Longest event (s): 1246.60699987411
>>>> Mean event duration (s): 334.987874176266
>>>> Standard deviation of event duration (s): 290.212811366649
>>>>         
> [...]
>   
>> In the case of the above 18%, I took 60 / 334 ~ 0.18 = 18%
>>     
>
> See http://en.wikipedia.org/wiki/Speedup
>
> Efficiency is speedup divided by number of cores. You cannot infer
> things from the mean duration because it says nothing about the degree
> of parallelism. I.e. you're not interested in how fast individual things
> are going, but how fast all things are going overall. That's why you can
> have a crappy-cpu machine like the BGP be in the top 10. Allow me to
> paint this:
>
> Scenario 1: [-1-][-2-][-1-][-2-]
> Scenario 2: [---1----][---2----]
>
> (how two tasks are scheduled by a scheduler on one CPU).
>
> Efficiency is the same in both cases. In scenario 1 the average task
> duration is 15 characters; in scenario 2 it's 10 characters. In both
> cases the raw task duration is 10 characters.
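
The two scenarios above can be checked with a small sketch. It is only an illustration: the `summarize` helper and the `(task_id, start, end)` segment encoding are assumptions made here, with one character of the diagram counted as one time unit. It is not part of Swift or the coaster code.

```python
from collections import defaultdict

def summarize(segments, n_cpus=1):
    """Return (mean observed duration, mean raw duration, efficiency).

    segments: list of (task_id, start, end) slices on the CPU timeline.
    Observed duration runs from a task's first start to its last end;
    raw duration is the CPU time the task actually received.
    """
    first_start, last_end = {}, {}
    raw = defaultdict(int)
    for task, start, end in segments:
        first_start[task] = min(start, first_start.get(task, start))
        last_end[task] = max(end, last_end.get(task, end))
        raw[task] += end - start
    n_tasks = len(raw)
    mean_observed = sum(last_end[t] - first_start[t] for t in raw) / n_tasks
    mean_raw = sum(raw.values()) / n_tasks
    makespan = max(last_end.values()) - min(first_start.values())
    efficiency = sum(raw.values()) / (makespan * n_cpus)
    return mean_observed, mean_raw, efficiency

# Scenario 1: [-1-][-2-][-1-][-2-]  (interleaved, 5 characters per slice)
scenario_1 = [(1, 0, 5), (2, 5, 10), (1, 10, 15), (2, 15, 20)]
# Scenario 2: [---1----][---2----]  (back to back, 10 characters per slice)
scenario_2 = [(1, 0, 10), (2, 10, 20)]

print(summarize(scenario_1))  # (15.0, 10.0, 1.0)
print(summarize(scenario_2))  # (10.0, 10.0, 1.0)
```

The mean observed duration differs between the two schedules while the efficiency does not, which is why the mean event duration alone says nothing about efficiency.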
>
> [...]
>   
>>>   
>>>       
>> I see the block utilization near 100% all the time,
>>     
>
> Right. That's because it measures the wrapper time, not the wrapped job
> time (some time is spent doing whatever the wrapper does). That just
> says that the job-to-worker dispatch algorithm in the coasters works ok
> with that load.
>
>   
>>  so that doesn't seem to match the other data I saw.
>>     
>
> They measure different things. But if you calculate the efficiency the
> proper way, you'll see that they are closer.
>
>   
>>> 2. Multiply 60s with the number of jobs (65535), divide by the number of
>>> workers (6*1024) and then by the total time since the first job starts
>>> to when the last job finishes (or you could choose the middle of the
>>> ramp-up to the middle of the ramp-down to get some sort of amortized
>>> efficiency). That gives you about 91% end-to-end and 96% amortized. Or
>>> you could divide by the total time, including swift startup, partition
>>> boot time, etc. to get 64%.
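
The calculation described in step 2 can be sketched as below. The function names are made up for this illustration, and the 800 s elapsed time is a hypothetical placeholder, not a number from either run; the real value has to be read from the logs for whichever window (end-to-end, amortized, or total) is being measured.

```python
def ideal_seconds(task_seconds, n_jobs, n_workers):
    """Time the run would take if every worker were busy the whole time."""
    return task_seconds * n_jobs / n_workers

def efficiency(task_seconds, n_jobs, n_workers, elapsed_seconds):
    """Ideal time divided by the measured elapsed time for the chosen window."""
    return ideal_seconds(task_seconds, n_jobs, n_workers) / elapsed_seconds

# 60 s tasks, 65535 jobs, 6*1024 workers, as in the message above.
ideal = ideal_seconds(60, 65535, 6 * 1024)
print(round(ideal))  # 640

# Hypothetical elapsed time, for illustration only.
elapsed = 800.0
print(round(efficiency(60, 65535, 6 * 1024, elapsed), 2))  # 0.8
```

Changing only `elapsed_seconds` (first job start to last job end, mid-ramp-up to mid-ramp-down, or the whole run including startup) is what produces the different percentages quoted above.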
>>>   
>>>       
>> 65535*60/(6*1024) ~ 640 sec. I see the end-to-end time being about
>> 1300 sec, or 1100 sec if we look at just Karajan. The 64% efficiency
>> is in the ballpark, but I don't see where the 91% and 96% are coming
>> from. 
>>     
>
> I think you're mixing the runs. Sorry I didn't make it more clear, but
> dc-4000 is the 4*1024 core run and dc-6000 is the 6*1024 core run. So
> you're dividing the 4*1024 core speedup by 6*1024 which gives you 2/3 of
> the efficiency. Multiply 64% by 3/2 to get the proper number back.
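
Because efficiency is inversely proportional to the worker count used in the division, the 3/2 correction can be written as a short sketch. The helper name is an assumption for this example; the 64%, 4*1024, and 6*1024 figures are the ones quoted in the thread.

```python
def correct_for_worker_count(efficiency, assumed_workers, actual_workers):
    """Rescale an efficiency that was computed with the wrong worker count.

    efficiency = task_time * jobs / (workers * elapsed), so swapping the
    worker count multiplies the result by assumed_workers / actual_workers.
    """
    return efficiency * assumed_workers / actual_workers

# 64% was obtained by dividing by 6*1024, but the run used 4*1024 cores.
fixed = correct_for_worker_count(0.64, 6 * 1024, 4 * 1024)
print(round(fixed, 2))  # 0.96
```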
>
>
>
>   

-- 
=================================================================
Ioan Raicu, Ph.D.
NSF/CRA Computing Innovation Fellow
=================================================================
Center for Ultra-scale Computing and Information Security (CUCIS)
Department of Electrical Engineering and Computer Science
Northwestern University
2145 Sheridan Rd, Tech M384 
Evanston, IL 60208-3118
=================================================================
Cel:   1-847-722-0876
Tel:   1-847-491-8163
Email: iraicu at eecs.northwestern.edu
Web:   http://www.eecs.northwestern.edu/~iraicu/
       https://wiki.cucis.eecs.northwestern.edu/
=================================================================

