[petsc-users] PETSc with modern C++

Michael Povolotskyi mpovolot at purdue.edu
Mon Apr 3 12:11:53 CDT 2017


Hi Filippo,
Recompiling PETSc twice is easy.
The difficulty is that both libraries will then contain the same symbols
for the double and double-complex functions.
If they were part of a C++ namespace, this would be easier.
Michael.

On 04/03/2017 12:45 PM, Filippo Leonardi wrote:
>
> On Monday, 3 April 2017 02:00:53 CEST you wrote:
>
> > On Sun, Apr 2, 2017 at 2:15 PM, Filippo Leonardi <filippo.leon at gmail.com> wrote:
> >
> > > Hello,
> > >
> > > I have a project in mind and seek feedback.
> > >
> > > Disclaimer: I hope I am not abusing this mailing list with this idea. If so, please ignore.
> > >
> > > As a thought experiment, and to have a bit of fun, I am currently writing (or thinking about writing) a small, modern C++ wrapper around PETSc.
> > >
> > > Premise: PETSc is awesome; I love it and use it in many projects. Sometimes I am just not super comfortable writing C. (I know my idea goes against PETSc's design philosophy.)
> > >
> > > I know there are many wrappers around, and there is not really a need for this (especially since PETSc has its own object-oriented style), but there are a few things I would really like to include in this wrapper that I have found nowhere:
> > > - I am currently only thinking about the Vector/Matrix/KSP/DM parts of the framework; PETSc does many other cool things, but I do not have the brainpower to consider those as well;
> > > - expression templates (in my opinion this is where C++ shines): these would replace the boilerplate code a user might otherwise need with easy-to-read expressions (and could effectively extend the set of axpy-like routines);
> > > - those expression templates should use SSE and AVX whenever available;
> > > - expressions like x += alpha * y should fall back to BLAS axpy (though sometimes this is not even faster than a simple loop);
> >
> > The idea for the above is not clear. Do you want templates generating calls to BLAS, or scalar code that operates on raw arrays with SSE/AVX? There is some advantage here in expanding the range of BLAS operations, which has been done to death by Liz Jessup and collaborators, but not that much.
>
> Templates should generate scalar code operating on raw arrays using SIMD, but I can detect cases where axpbypcz or gemv apply and use the BLAS implementation instead. I do not think there is a point in trying to "beat" BLAS. (An interesting point opens up here: I assume an efficient BLAS implementation, but I am not so sure how the different BLAS libraries do things internally. I work from the assumption that we have a very well tuned BLAS implementation at our disposal.)
>
> > > - all calls to PETSc should be less verbose, more C++-like:
> > >   * for instance, a VecGlobalToLocalBegin could return an empty object that calls VecGlobalToLocalEnd when it is destroyed;
> > >   * some cool idea to easily write GPU kernels.
> >
> > If you find a way to make this pay off it would be amazing, since currently nothing but BLAS3 has a hope of mattering in this context.
> >
> > > - the idea would be to have safer routines (at compile time), by means of RAII etc.
> > >
> > > I aim for zero/near-zero/negligible overhead with full optimization; to that end I include benchmarks and extensive unit tests.
> > >
> > > So my questions are:
> > > - is anyone interested (in the product / in developing it)?
> > > - does anyone have suggestions (maybe what I have in mind is nonsense)?
> >
> > I would suggest making a simple performance model that says what you will do will have at least a 2x speed gain, because anything less is not worth your time, and inevitably you will not get the whole multiplier. I am really skeptical that this is possible with the above sketch.
>
> That I will do as a next step, for sure. But I also doubt this much will be achievable in any case.
>
> > Second, I would try to convince myself that what you propose would be simpler in terms of lines of code, number of objects, number of concepts, etc. Right now, that is not clear to me either.
>
> The number of objects per se may not be smaller. I am thinking more about reducing lines of code (verbosity) and the number of concepts, and about increasing safety.
>
> I have two examples I have been burnt by in the past:
> - casting to void* to pass custom contexts to PETSc routines;
> - forgetting to call the corresponding XXXEnd after a call to XXXBegin (PETSc notices that, of course, but at runtime, and that might be too late).
>
> Example: suppose I need one of PETSc's internal arrays, so I call VecGetArray. I will inevitably forget to return the array to PETSc. Instead, I could have my new VecArray return an object that restores the array when it goes out of scope. I can also flag the function with [[nodiscard]] to prevent the user from discarding the returned object in the first place.
>
> > Barring that, maybe you can argue that new capabilities, such as the type flexibility described by Michael, are enabled. That would be the most convincing, I think.
>
> This would be very interesting indeed, but I see only two options:
> - recompile PETSc twice;
> - manually implement all the complex routines, which might be too much of a task.
>
> >   Thanks,
> >
> >      Matt
>
> Thanks for the feedback, Matt.
>
> > > If you have read up to here, thanks.

-- 
Michael Povolotskyi, PhD
Research Assistant Professor
Network for Computational Nanotechnology
Hall for Discovery and Learning Research, Room 441
207 South Martin Jischke Drive
West Lafayette, IN 47907
Phone: (765) 494-9396


