[Darshan-users] Computation of CP_ACCESS[1-4]_ACCESS and other histogram counters

Phil Carns carns at mcs.anl.gov
Thu Feb 27 11:17:29 CST 2014


On 02/27/2014 12:04 PM, Latham, Robert J. wrote:
> On Thu, 2014-02-27 at 14:30 +0100, Matthieu Dorier wrote:
>> Hi,
>>
>> Simple questions out of curiosity:
>> I see some counters like, for instance, CP_ACCESS[1-4]_ACCESS (described as "4 most common access sizes"). Does it mean that for each file accessed, Darshan will keep in memory a full histogram of all the access sizes until the end of the program, to be able to get the most frequent ones when writing the log file? If so, isn't it memory-consuming in case of a large number of accesses with different sizes? Besides, why 4? Is it motivated by some analysis that showed 4 to be good enough for most applications?
>>
> Darshan does keep a histogram, as you've seen, but the space for that
> histogram is bounded by the number of buckets -- and the buckets are
> fixed.
>
> The ACCESS counters come from simply a Most Frequently Used list with
> four slots.  Again, bounded by the number of slots no matter how crazy
> the access pattern might be.
>

At runtime each process actually uses a tree data structure to track the
most frequent access and stride sizes; we track those particular values
specifically rather than using a histogram.  The tree for each is capped
at 32 elements.  At finalize time each process then picks the top 4 to
store in the counters.  If the file is shared across all processes, then
a reduction operator continues to do its best to keep the top 4 as the
file is reduced to a single record.
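
To make the mechanism above concrete, here is a minimal Python sketch of a
capped frequency tracker with a top-4 pick and a pairwise merge. It is not
Darshan's code (Darshan is written in C and uses a tree); the class and
function names are hypothetical, a dict stands in for the tree, and the
behavior when the cap is reached (new values are simply dropped) is an
assumption made for illustration.

```python
from collections import Counter

MAX_TRACKED = 32  # cap on distinct values tracked per file (from the text above)
TOP_N = 4         # number of slots stored in the counters


class BoundedFrequencyTracker:
    """Counts occurrences of at most max_tracked distinct values."""

    def __init__(self, max_tracked=MAX_TRACKED):
        self.max_tracked = max_tracked
        self.counts = {}  # a dict stands in for the tree here

    def record(self, value):
        if value in self.counts:
            self.counts[value] += 1
        elif len(self.counts) < self.max_tracked:
            self.counts[value] = 1
        # Assumption: once the cap is reached, unseen values are dropped.

    def top(self, n=TOP_N):
        # Most frequent first; ties broken by value so the result is deterministic.
        return sorted(self.counts.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


def reduce_top(a, b, n=TOP_N):
    """Merge two lists of (value, count) pairs and keep the n most frequent."""
    merged = Counter()
    for value, count in a + b:
        merged[value] += count
    return sorted(merged.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


# Two ranks accessing the same file with partly overlapping access sizes.
rank0 = BoundedFrequencyTracker()
for size in [4096] * 5 + [1024] * 3 + [512]:
    rank0.record(size)

rank1 = BoundedFrequencyTracker()
for size in [4096] * 2 + [65536] * 4:
    rank1.record(size)

shared = reduce_top(rank0.top(), rank1.top())
print(shared)  # [(4096, 7), (65536, 4), (1024, 3), (512, 1)]
```

Memory per file stays bounded by MAX_TRACKED entries no matter how many
accesses are recorded, which is the property described above.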

The good news is that the memory consumption is bounded.  The downside
is that in a degenerate case (an application that uses more than 32
access sizes per file, or a shared file in which each rank uses entirely
different access sizes) those particular counters might be
under-reported.  In practice we haven't really seen that happen, though.
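
The shared-file degenerate case can be illustrated with a small, hedged
Python sketch. It assumes a reduction that merges per-rank top-4 lists
pairwise and truncates back to 4 after each merge; that is a simplification
for illustration, not a description of Darshan's actual reduction operator.
The access sizes and rank count are made up.

```python
from collections import Counter

TOP_N = 4
NRANKS = 8
COMMON = 4096  # hypothetical size that every rank uses, but never as a top-4 value


def local_top(accesses, n=TOP_N):
    """Top-n (value, count) pairs for one rank's accesses."""
    counts = Counter(accesses)
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


def reduce_top(a, b, n=TOP_N):
    """Merge two top-n lists and truncate back to n entries."""
    merged = Counter()
    for value, count in a + b:
        merged[value] += count
    return sorted(merged.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


reduced = []
exact = Counter()  # unbounded ground truth, kept only for comparison
for rank in range(NRANKS):
    # Four rank-specific sizes used 10 times each, plus COMMON used 9 times.
    private = [1_000_000 * (rank + 1) + i for i in range(4)]
    accesses = [s for s in private for _ in range(10)] + [COMMON] * 9
    exact.update(accesses)
    reduced = reduce_top(reduced, local_top(accesses))

true_top = exact.most_common(1)[0]
reported = [value for value, _ in reduced]
print("true most common:", true_top)
print("reported values: ", reported)
```

Here COMMON is the fifth most frequent size on every rank, so it never
enters any per-rank top 4 and is missing from the reduced record, even
though it is the most frequent size across the file as a whole.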

The "most frequently used" counters are recorded independently of the
histogram fields.  The histograms should never be misreported, even in
extreme cases.  Because the number of bins is fixed, we can retain all of
the histogram bins all the way through runtime and reduction.
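
A short Python sketch of why fixed bins stay exact under reduction: merging
two histograms is an element-wise sum, so nothing has to be discarded. The
bin boundaries below, and whether each upper bound is inclusive, are
illustrative assumptions rather than values taken from Darshan's source;
the function names are hypothetical.

```python
from bisect import bisect_left

KIB, MIB, GIB = 1024, 1024**2, 1024**3

# Assumed inclusive upper bound of each bin; one extra open-ended bin follows.
BIN_UPPER = [100, KIB, 10 * KIB, 100 * KIB, MIB,
             4 * MIB, 10 * MIB, 100 * MIB, GIB]
NUM_BINS = len(BIN_UPPER) + 1


def bin_index(size):
    """Index of the first bin whose upper bound is >= size."""
    return bisect_left(BIN_UPPER, size)


def histogram(sizes):
    """Fixed-size histogram: memory use does not depend on len(sizes)."""
    bins = [0] * NUM_BINS
    for size in sizes:
        bins[bin_index(size)] += 1
    return bins


def reduce_histograms(a, b):
    """Element-wise sum; exact because both inputs share the same bins."""
    return [x + y for x, y in zip(a, b)]


rank0_sizes = [64, 100, 4096, 4096, 2 * MIB]
rank1_sizes = [101, 512 * KIB, 2 * GIB]

merged = reduce_histograms(histogram(rank0_sizes), histogram(rank1_sizes))
# The merged histogram equals the histogram of all accesses taken together.
print(merged == histogram(rank0_sizes + rank1_sizes))  # True
```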

I don't think we have any formal rationale for why we chose to report 4;
we just picked it :)

thanks,
-Phil

