[petsc-dev] Why does Kernel_A_gets_inverse_A_5() have a different interface than all the others?
Barry Smith
bsmith at mcs.anl.gov
Thu Sep 22 09:58:06 CDT 2011
On Sep 22, 2011, at 9:49 AM, Jed Brown wrote:
> On Thu, Sep 22, 2011 at 15:24, Barry Smith <bsmith at mcs.anl.gov> wrote:
> You need a better system for mailing this fragment above. It should contain a web link where I can click on it and go straight to the change set in my browser instead of spending 1/2 an hour fucking around with 15256:4a570cebd663 and trying to see what it is. Don't you know that all us folks under 20 can only click on links?
>
> Okay, "hg log -vpr 4a570ce" or http://petsc.cs.iit.edu/petsc/petsc-dev/rev/4a570ce .
Much better
>
>
> >
> > Was your plan to extract the work space for all the variants?
>
> Likely, eventually. My concern was that the work array would have to be set up 1000s of times, once for each function call, and I was trying to get the best possible performance for block size 5. But looking at it now, maybe the better thing to do is to inline all of these functions; then maybe the compiler will be smart enough to keep the same work array for all calls, and we won't have to worry about passing in the work space? Are there other ways to get the best performance?
>
> I think actual inlining is probably bad for all sizes greater than 4. (And it's better to use an explicit formula for the inverse than to do Gaussian elimination for those small sizes.) Do we know that managing the work space is in any way significant? It's not even pre-initialized, so I would expect it to cost literally zero without passing it in (it's stack allocated with constant size, so all that changes are some numeric literal offsets within the code).
No, we do not know. One could try timings with and without on a variety of systems to determine statistically whether the preallocation pays off. Or one can randomly write code like I did :-)
So I have no major objection to you throwing it away, but I would like to see at least one run on one system showing that throwing it away makes no measurable change.
Barry
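
The "with and without" timing suggested above could be sketched roughly as follows. This is not the PETSc kernel: invert5_with_work, invert5_stack_work, fill_block and time_variant are hypothetical names, the inversion is a plain Gauss-Jordan elimination without pivoting, and the test blocks are assumed to be diagonally dominant so that no pivot is zero. It only illustrates comparing a caller-supplied work array against a constant-size stack array for block size 5.

```c
#include <assert.h>
#include <math.h>
#include <string.h>
#include <time.h>

#define BS 5 /* block size under discussion */

/* Hypothetical stand-in kernel: inverts a row-major BS x BS block in place
   by Gauss-Jordan elimination WITHOUT pivoting (assumes nonzero pivots).
   The scratch space is supplied by the caller. */
static void invert5_with_work(double *a, double *work)
{
  int i, j, k;
  memcpy(work, a, BS * BS * sizeof(double)); /* work holds the matrix being reduced */
  for (i = 0; i < BS * BS; i++) a[i] = 0.0;
  for (i = 0; i < BS; i++) a[i * BS + i] = 1.0; /* a starts as the identity */
  for (k = 0; k < BS; k++) {
    double p;
    assert(fabs(work[k * BS + k]) > 0.0); /* no pivoting, so the pivot must be nonzero */
    p = 1.0 / work[k * BS + k];
    for (j = 0; j < BS; j++) { work[k * BS + j] *= p; a[k * BS + j] *= p; }
    for (i = 0; i < BS; i++) {
      double f = work[i * BS + k];
      if (i == k) continue;
      for (j = 0; j < BS; j++) {
        work[i * BS + j] -= f * work[k * BS + j];
        a[i * BS + j]    -= f * a[k * BS + j];
      }
    }
  }
}

/* Same computation, but the scratch space is a constant-size stack array. */
static void invert5_stack_work(double *a)
{
  double work[BS * BS];
  invert5_with_work(a, work);
}

/* Fills a diagonally dominant test block (an assumption of this sketch). */
static void fill_block(double *a, int seed)
{
  int i, j;
  for (i = 0; i < BS; i++)
    for (j = 0; j < BS; j++)
      a[i * BS + j] = (i == j) ? 10.0 + seed % 7 : 1.0 / (1.0 + i + 2 * j);
}

/* Inverts nblocks blocks and returns the elapsed CPU time in seconds.
   The sum of the (0,0) entries is written to *checksum so the compiler
   cannot discard the work and the two variants can be compared. */
static double time_variant(int use_passed_work, int nblocks, double *checksum)
{
  double a[BS * BS], work[BS * BS], sum = 0.0;
  clock_t t0, t1;
  int n;
  t0 = clock();
  for (n = 0; n < nblocks; n++) {
    fill_block(a, n);
    if (use_passed_work) invert5_with_work(a, work);
    else                 invert5_stack_work(a);
    sum += a[0];
  }
  t1 = clock();
  *checksum = sum;
  return (double)(t1 - t0) / CLOCKS_PER_SEC;
}
```

A driver would call time_variant(1, N, &s1) and time_variant(0, N, &s2) for a large N, repeat each several times, and compare the times. An optimizing compiler may well inline both variants into identical code, which would itself support the point that the stack array costs nothing; the result depends on the compiler, flags, and machine.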