[petsc-dev] Hardware counter logging in PETSc (was Re: Where next with PETSc and KNL?)
bsmith at mcs.anl.gov
Wed Sep 28 16:33:14 CDT 2016
Moving to petsc-dev so everyone can see this discussion.
To get more detailed "performance" information on runs we have two (not necessarily orthogonal) choices.
1) use an integrated system that is independent of PETSc. These sometimes require compiling with additional options and then running a post-processor after the run. These systems then display the results in some kind of GUI. Intel has such a thing, as does Apple. Do they allow logging/display of things we care about, such as cache misses? That depends on each system, and some of the systems are improving over time.
2) add additional logging of values into the PETSc logging and then have PetscLogView() process the raw logged values into useful information.
Both approaches have advantages and disadvantages, but we do take on a large development and maintenance burden if we try to incorporate more logging directly into PETSc. So what does incorporating it into PETSc buy us that is worth the extra hassle? That is, can we do something with the "in PETSc" approach that we could not achieve otherwise? (I don't think arguments about it being more portable and not requiring you to buy vtune etc. from Intel are enough reason to do the work internally.)
In other words, if I am interested in finding out why my MatMult() is slower than I think it should be, is it such a terrible thing to have to crank up vtune (or a similar beast) to get details about the computational phase I am interested in?
You should be able to guess that I am leaning towards 1), and I want to know why that is a fatal mistake, if it is.
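For concreteness, here is a minimal sketch of the kind of raw counter collection that choice 2) would have to wrap and maintain. It assumes Linux and uses the perf_event_open(2) system call directly to count cache misses around one computational phase. Every name beginning with Sketch is hypothetical and is not part of PETSc, and the dense multiply is only a stand-in for something like MatMult(). On systems where the counters are unavailable (restrictive perf_event_paranoid setting, some containers and VMs), the open call returns -1 and the other calls fail cleanly.

```c
/* Linux-only sketch: count hardware cache misses around a code region
   using perf_event_open(2) directly. The Sketch* names are hypothetical
   and are not PETSc API. */
#define _GNU_SOURCE
#include <assert.h>
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Open a cache-miss counter for the calling thread.
   Returns a file descriptor, or -1 if the counter is unavailable. */
int SketchCounterOpen(void)
{
  struct perf_event_attr attr;

  memset(&attr, 0, sizeof(attr));
  attr.type           = PERF_TYPE_HARDWARE;
  attr.size           = sizeof(attr);
  attr.config         = PERF_COUNT_HW_CACHE_MISSES;
  attr.disabled       = 1; /* start stopped; enabled explicitly later */
  attr.exclude_kernel = 1; /* user-space events only */
  attr.exclude_hv     = 1;
  /* pid = 0, cpu = -1: this thread on any CPU; no group leader, no flags */
  return (int)syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0UL);
}

/* Zero the counter and start counting. Returns 0 on success, -1 on failure. */
int SketchCounterBegin(int fd)
{
  if (fd < 0) return -1;
  if (ioctl(fd, PERF_EVENT_IOC_RESET, 0) == -1) return -1;
  return ioctl(fd, PERF_EVENT_IOC_ENABLE, 0) == -1 ? -1 : 0;
}

/* Stop counting and read the value. Returns 0 on success, -1 on failure. */
int SketchCounterEnd(int fd, uint64_t *count)
{
  assert(count != NULL);
  if (fd < 0) return -1;
  if (ioctl(fd, PERF_EVENT_IOC_DISABLE, 0) == -1) return -1;
  return read(fd, count, sizeof(*count)) == (ssize_t)sizeof(*count) ? 0 : -1;
}

/* Stand-in for a computational phase: y = A*x for a dense row-major n-by-n A. */
void SketchDenseMult(int n, const double *A, const double *x, double *y)
{
  for (int i = 0; i < n; i++) {
    double sum = 0.0;
    for (int j = 0; j < n; j++) sum += A[i * n + j] * x[j];
    y[i] = sum;
  }
}
```

A caller would open the counter once, bracket the phase with SketchCounterBegin() and SketchCounterEnd(), and close() the descriptor at the end. Even this small sketch handles only one event on one thread of one operating system, which gives some feel for the maintenance burden in question.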
> On Sep 24, 2016, at 12:00 PM, Richard Tran Mills <richard.t.mills at intel.com> wrote:
> Hi Folks,
> I'm breaking up replies to my long email message into smaller chunks to make it easier to keep track of the discussion. I'll just address the perf counter issue here.
> On 9/24/16 6:54 AM, Jed Brown wrote:
>>> 7) I still think we should add some support for collecting hardware
>>> counter information in the PETSc logging framework. I see that the
>>> latest PAPI release adds some KNL support, though I don't know if it
>>> supports the uncore counters. Anyhow, I should start a thread on
>>> petsc-dev about this...
>> There was some PAPI support once upon a time (before my time), but I
>> think Barry stripped it out because it's crappy software. I haven't
>> seriously looked at using the linux performance counter interface
>> directly, but it would be less to install and not streaked with Dongarra
> An alternative that I came across is something written by some Intel folks, with the terribly generic name of "Intel Performance Counter Monitor". The webpage for it is at
> It provides a simple C++ API (I wish there was a C one; we'd need to wrap things to keep from polluting the PETSc code with C++ stuff) that lets you capture essentially any of the PMU events. This looks a lot nicer than PAPI in several ways, but has the downside of being Intel-specific. I also don't see any KNL-specific counter support yet.