[petsc-users] PETSc with modern C++

Matthew Knepley knepley at gmail.com
Sun Apr 2 19:00:53 CDT 2017


On Sun, Apr 2, 2017 at 2:15 PM, Filippo Leonardi <filippo.leon at gmail.com>
wrote:

>
> Hello,
>
> I have a project in mind and seek feedback.
>
> Disclaimer: I hope I am not abusing this mailing list with this idea.
> If so, please ignore.
>
> As a thought experiment, and to have a bit of fun, I am currently writing
> (or at least thinking about writing) a small (modern) C++ wrapper around PETSc.
>
> Premise: PETSc is awesome, I love it and use it in many projects. Sometimes I
> am just not super comfortable writing C. (I know my idea goes against
> PETSc's design philosophy).
>
> I know there are many around, and there is not really a need for this
> (especially since PETSc has its own object-oriented style), but there are a
> few things I would really like to include in this wrapper that I have found
> nowhere else:
> - I am currently only thinking about the Vector/Matrix/KSP/DM part of the
> framework; there are many other cool things that PETSc does, but I do not
> have the brainpower to consider those as well.
> - expression templates (in my opinion this is where C++ shines): this
> would replace all code bloat that a user might need with cool/easy to read
> expressions (this could increase the number of axpy-like routines);
> - those expression templates should use SSE and AVX whenever available;
> - expressions like x += alpha * y should fall back to BLAS axpy (though
> sometimes this is not even faster than a simple loop);
>

The idea behind the above is not clear. Do you want templates generating calls
to BLAS? Or scalar code that operates on raw arrays with SSE/AVX?
There is some advantage in expanding the range of BLAS operations,
which has been done to death by Liz Jessup and collaborators, but not
that much.
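
To make the second reading concrete, here is a minimal sketch of expression
templates that emit scalar code over raw arrays. It does not call BLAS and
uses no explicit SSE/AVX intrinsics; any vectorization is left to the
compiler. All names (Expr, Vector, Sum, Scaled, Stored) are hypothetical, and
Vector is a plain std::vector stand-in for illustration, not a PETSc Vec.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// CRTP base: lets operators accept any expression type without virtual calls.
template <class E>
struct Expr {
    const E& self() const { return static_cast<const E&>(*this); }
};

class Vector;  // hypothetical stand-in for the local array of a wrapped Vec

// Expression nodes are stored by value, vectors by reference (no copies of data).
template <class E> struct Stored { using type = E; };
template <> struct Stored<Vector> { using type = const Vector&; };

// Node for l + r; nothing is computed until an entry is requested.
template <class L, class R>
class Sum : public Expr<Sum<L, R>> {
public:
    Sum(const L& l, const R& r) : l_(l), r_(r) { assert(l.size() == r.size()); }
    std::size_t size() const { return l_.size(); }
    double operator[](std::size_t i) const { return l_[i] + r_[i]; }
private:
    typename Stored<L>::type l_;
    typename Stored<R>::type r_;
};

// Node for a * e.
template <class E>
class Scaled : public Expr<Scaled<E>> {
public:
    Scaled(double a, const E& e) : a_(a), e_(e) {}
    std::size_t size() const { return e_.size(); }
    double operator[](std::size_t i) const { return a_ * e_[i]; }
private:
    double a_;
    typename Stored<E>::type e_;
};

class Vector : public Expr<Vector> {
public:
    explicit Vector(std::size_t n, double v = 0.0) : data_(n, v) {}
    std::size_t size() const { return data_.size(); }
    double operator[](std::size_t i) const { return data_[i]; }
    double& operator[](std::size_t i) { return data_[i]; }

    // Evaluate the whole right-hand side in one pass, with no temporaries.
    template <class E>
    Vector& operator=(const Expr<E>& e) {
        const E& ex = e.self();
        assert(ex.size() == size());
        for (std::size_t i = 0; i < size(); ++i) data_[i] = ex[i];
        return *this;
    }

    // x += expr, also a single fused loop.
    template <class E>
    Vector& operator+=(const Expr<E>& e) {
        const E& ex = e.self();
        assert(ex.size() == size());
        for (std::size_t i = 0; i < size(); ++i) data_[i] += ex[i];
        return *this;
    }
private:
    std::vector<double> data_;
};

template <class L, class R>
Sum<L, R> operator+(const Expr<L>& l, const Expr<R>& r) {
    return Sum<L, R>(l.self(), r.self());
}

template <class E>
Scaled<E> operator*(double a, const Expr<E>& e) {
    return Scaled<E>(a, e.self());
}
```

With this, z = 2.0 * x + y compiles to one loop over the three arrays. The
other reading would instead pattern-match x += alpha * y at compile time and
dispatch to an axpy call; the sketch above does not attempt that.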


> - all calls to PETSc should be less verbose, more C++-like:
>   * for instance a VecGlobalToLocalBegin could return an empty object that
> calls VecGlobalToLocalEnd when it is destroyed.
>   * some cool idea to easily write GPU kernels.
>

If you find a way to make this pay off it would be amazing, since currently
nothing but BLAS3 has a hope of mattering in this context.
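
For the scoped Begin/End item in the quoted list, a minimal sketch of the
guard might look like the following. The names ScopedEnd, begin_scoped and
demo_overlap are hypothetical, and the Begin/End calls are replaced by lambdas
that write to a log so the example runs without PETSc or MPI. In a real
wrapper they would forward to a split-phase pair such as
DMGlobalToLocalBegin/DMGlobalToLocalEnd.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Calls the stored "End" action exactly once, when the guard goes out of scope.
template <class EndFn>
class ScopedEnd {
public:
    explicit ScopedEnd(EndFn end) : end_(std::move(end)), active_(true) {}
    // Moving transfers the obligation to call End; the source is disarmed.
    ScopedEnd(ScopedEnd&& other)
        : end_(std::move(other.end_)), active_(other.active_) {
        other.active_ = false;
    }
    ScopedEnd(const ScopedEnd&) = delete;
    ScopedEnd& operator=(const ScopedEnd&) = delete;
    ~ScopedEnd() {
        if (active_) end_();
    }
private:
    EndFn end_;
    bool active_;
};

// Runs the "Begin" action immediately and returns a guard that owns "End".
template <class BeginFn, class EndFn>
ScopedEnd<EndFn> begin_scoped(BeginFn begin, EndFn end) {
    begin();
    return ScopedEnd<EndFn>(std::move(end));
}

// Demonstration with logging stand-ins for the communication calls.
std::vector<std::string> demo_overlap() {
    std::vector<std::string> log;
    {
        auto guard = begin_scoped([&log] { log.push_back("begin"); },
                                  [&log] { log.push_back("end"); });
        log.push_back("local work");  // overlaps with the pending communication
        assert(log.size() == 2);      // End has not run yet
    }  // guard destroyed here, End runs
    log.push_back("use ghost values");
    return log;
}
```

Two caveats with this design: PETSc routines return a PetscErrorCode, which a
destructor cannot propagate, so the wrapper needs some other error policy; and
if the caller discards the returned guard, End runs immediately and the
overlap is lost.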


> - the idea would be to have safer routines (at compile time), by means of
> RAII etc.
>
> I aim for zero/near-zero/negligible overhead with full optimization; for
> that I will include benchmarks and extensive unit tests.
>
> So my question is:
> - anyone that would be interested (in the product/in developing)?
> - anyone that has suggestions (maybe that what I have in mind is nonsense)?
>

I would suggest making a simple performance model that shows that what you
propose will give at least a 2x speedup. Anything less is not worth your
time, and inevitably you will not get the whole multiplier. I am really
skeptical that this is possible with the above sketch.
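
As one possible shape for such a model, here is a small roofline-style
estimate, written under stated assumptions rather than as a measurement. It
compares two separate vector updates (w = a*x + y, then w += b*v) against a
single fused loop of the kind an expression template would generate. The
Machine numbers passed in are assumed, every entry is 8 bytes, and cache
reuse and write-allocate traffic are ignored. All names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>

// Assumed machine characteristics: peak flop rate and memory bandwidth.
struct Machine {
    double peak_flops;  // flop/s
    double bandwidth;   // bytes/s
};

// Work and memory traffic per vector entry for one kernel.
struct Kernel {
    double flops;
    double bytes;
};

// Roofline estimate: a kernel is limited by the slower of compute and memory.
double time_per_entry(const Kernel& k, const Machine& m) {
    assert(m.peak_flops > 0.0 && m.bandwidth > 0.0);
    return std::max(k.flops / m.peak_flops, k.bytes / m.bandwidth);
}

// Estimated speedup of the fused loop w = a*x + y + b*v over two passes.
double fused_speedup(const Machine& m) {
    const double word = 8.0;               // bytes per double
    const Kernel waxpy{2.0, 3.0 * word};   // w = a*x + y: load x, y; store w
    const Kernel axpy{2.0, 3.0 * word};    // w += b*v:    load w, v; store w
    const Kernel fused{4.0, 4.0 * word};   // load x, y, v; store w
    const double unfused = time_per_entry(waxpy, m) + time_per_entry(axpy, m);
    return unfused / time_per_entry(fused, m);
}
```

For any machine where these kernels are bandwidth-bound, the model gives
48/32 = 1.5x for this particular expression, which is below the 2x threshold.
Longer expressions fuse more passes and do better, so the model at least says
which expressions are worth targeting.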

Second, I would try to convince myself that what you propose would be
simpler in terms of lines of code, number of objects, number of concepts,
etc. Right now, that is not clear to me either.

Barring that, maybe you can argue that new capabilities, such as the type
flexibility described by Michael, are enabled. That would be the most
convincing argument, I think.

  Thanks,

     Matt

> If you have read up to here, thanks.
>



-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener