[petsc-users] PETSc with modern C++
Filippo Leonardi
filippo.leon at gmail.com
Mon Apr 3 11:45:15 CDT 2017
On Monday, 3 April 2017 02:00:53 CEST you wrote:
> On Sun, Apr 2, 2017 at 2:15 PM, Filippo Leonardi <filippo.leon at gmail.com>
>
> wrote:
> > Hello,
> >
> > I have a project in mind and seek feedback.
> >
> > Disclaimer: I hope I am not abusing of this mailing list with this idea.
> > If so, please ignore.
> >
> > As a thought experiment, and to have a bit of fun, I am currently
> > writing/thinking on writing, a small (modern) C++ wrapper around PETSc.
> >
> > Premise: PETSc is awesome, I love it and use in many projects.
Sometimes I
> > am just not super comfortable writing C. (I know my idea goes against
> > PETSc's design philosophy).
> >
> > I know there are many around, and there is not really a need for this
> > (especially since PETSc has his own object-oriented style), but there
are
> > a
> > few things I would like to really include in this wrapper, that I found
> > nowhere):
> > - I am currently only thinking about the Vector/Matrix/KSP/DM part of
the
> > Framework, there are many other cool things that PETSc does that I do
not
> > have the brainpower to consider those as well.
> > - expression templates (in my opinion this is where C++ shines): this
> > would replace all code bloat that a user might need with cool/easy to
read
> > expressions (this could increase the number of axpy-like routines);
> > - those expression templates should use SSE and AVX whenever available;
> > - expressions like x += alpha * y should fall back to BLAS axpy (tough
> > sometimes this is not even faster than a simple loop);
>
> The idea for the above is not clear. Do you want templates generating
calls
> to BLAS? Or scalar code that operates on raw arrays with SSE/AVX?
> There is some advantage here of expanding the range of BLAS operations,
> which has been done to death by Liz Jessup and collaborators, but not
> that much.
Templates should generate scalar code operating on raw arrays using SIMD.
But
I can detect if you want to use axpbycz or gemv, and use the blas
implementation instead. I do not think there is a point in trying to "beat"
BLAS. (Here a interesting point opens: I assume an efficient BLAS
implementation, but I am not so sure about how the different BLAS do things
internally. I work from the assumption that we have a very well tuned BLAS
implementation at our disposal).
>
> > - all calls to PETSc should be less verbose, more C++-like:
> > * for instance a VecGlobalToLocalBegin could return an empty object that
> >
> > calls VecGlobalToLocalEnd when it is destroyed.
> >
> > * some cool idea to easily write GPU kernels.
>
> If you find a way to make this pay off it would be amazing, since
currently
> nothing but BLAS3 has a hope of mattering in this context.
>
> > - the idea would be to have safer routines (at compile time), by means
of
> > RAII etc.
> >
> > I aim for zero/near-zero/negligible overhead with full optimization, for
> > that I include benchmarks and extensive test units.
> >
> > So my question is:
> > - anyone that would be interested (in the product/in developing)?
> > - anyone that has suggestions (maybe that what I have in mind is
> > nonsense)?
>
> I would suggest making a simple performance model that says what you will
> do will have at least
> a 2x speed gain. Because anything less is not worth your time, and
> inevitably you will not get the
> whole multiplier. I am really skeptical that is possible with the above
> sketch.
That I will do as next steps for sure. But I also doubt this much of will
be
achievable in any case.
>
> Second, I would try to convince myself that what you propose would be
> simpler, in terms of lines of code,
> number of objects, number of concepts, etc. Right now, that is not clear
to
> me either.
Number of objects per se may not be smaller. I am more thinking about
reducing
lines of codes (verbosity), concepts and increase safety.
I have two examples I've been burnt with in the past:
- casting to void* to pass custom contexts to PETSc routines
- forgetting to call the corresponding XXXEnd after a call to XXXBegin
(PETSc notices that, ofc., but at runtime, and that might be too late).
Example: I can imagine that I need a Petsc's internal array. In this case I
call VecGetArray. However I will inevitably foget to return the array to
PETSc. I could have my new VecArray returning an object that restores the
array
when it goes out of scope. I can also flag the function with [[nodiscard]]
to
prevent the user to destroy the returned object from the start.
>
> Baring that, maybe you can argue that new capabilities, such as the type
> flexibility described by Michael, are enabled. That
> would be the most convincing I think.
This would be very interesting indeed, but I see only two options:
- recompile PETSc twice
- manually implement all complex routines, which might be to much of a task
>
> Thanks,
>
> Matt
Thanks for the feedback Matt.
>
> If you have read up to here, thanks.
On Mon, 3 Apr 2017 at 02:00 Matthew Knepley <knepley at gmail.com> wrote:
> On Sun, Apr 2, 2017 at 2:15 PM, Filippo Leonardi <filippo.leon at gmail.com>
> wrote:
>
>
> Hello,
>
> I have a project in mind and seek feedback.
>
> Disclaimer: I hope I am not abusing of this mailing list with this idea.
> If so, please ignore.
>
> As a thought experiment, and to have a bit of fun, I am currently
> writing/thinking on writing, a small (modern) C++ wrapper around PETSc.
>
> Premise: PETSc is awesome, I love it and use in many projects. Sometimes I
> am just not super comfortable writing C. (I know my idea goes against
> PETSc's design philosophy).
>
> I know there are many around, and there is not really a need for this
> (especially since PETSc has his own object-oriented style), but there are a
> few things I would like to really include in this wrapper, that I found
> nowhere):
> - I am currently only thinking about the Vector/Matrix/KSP/DM part of the
> Framework, there are many other cool things that PETSc does that I do not
> have the brainpower to consider those as well.
> - expression templates (in my opinion this is where C++ shines): this
> would replace all code bloat that a user might need with cool/easy to read
> expressions (this could increase the number of axpy-like routines);
> - those expression templates should use SSE and AVX whenever available;
> - expressions like x += alpha * y should fall back to BLAS axpy (tough
> sometimes this is not even faster than a simple loop);
>
>
> The idea for the above is not clear. Do you want templates generating
> calls to BLAS? Or scalar code that operates on raw arrays with SSE/AVX?
> There is some advantage here of expanding the range of BLAS operations,
> which has been done to death by Liz Jessup and collaborators, but not
> that much.
>
>
> - all calls to PETSc should be less verbose, more C++-like:
> * for instance a VecGlobalToLocalBegin could return an empty object that
> calls VecGlobalToLocalEnd when it is destroyed.
> * some cool idea to easily write GPU kernels.
>
>
> If you find a way to make this pay off it would be amazing, since
> currently nothing but BLAS3 has a hope of mattering in this context.
>
>
> - the idea would be to have safer routines (at compile time), by means of
> RAII etc.
>
> I aim for zero/near-zero/negligible overhead with full optimization, for
> that I include benchmarks and extensive test units.
>
> So my question is:
> - anyone that would be interested (in the product/in developing)?
> - anyone that has suggestions (maybe that what I have in mind is nonsense)?
>
>
> I would suggest making a simple performance model that says what you will
> do will have at least
> a 2x speed gain. Because anything less is not worth your time, and
> inevitably you will not get the
> whole multiplier. I am really skeptical that is possible with the above
> sketch.
>
> Second, I would try to convince myself that what you propose would be
> simpler, in terms of lines of code,
> number of objects, number of concepts, etc. Right now, that is not clear
> to me either.
>
> Baring that, maybe you can argue that new capabilities, such as the type
> flexibility described by Michael, are enabled. That
> would be the most convincing I think.
>
> Thanks,
>
> Matt
>
> If you have read up to here, thanks.
>
>
>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20170403/94cbede2/attachment-0001.html>
More information about the petsc-users
mailing list