[petsc-dev] Argonne GPU Virtual Hackathon - Accepted

Barry Smith bsmith at petsc.dev
Fri Mar 12 23:17:08 CST 2021



> On Mar 12, 2021, at 10:49 PM, Jed Brown <jed at jedbrown.org> wrote:
> 
> Barry Smith <bsmith at petsc.dev> writes:
> 
>>> On Mar 12, 2021, at 6:58 PM, Jed Brown <jed at jedbrown.org> wrote:
>>> 
>>> Barry Smith <bsmith at petsc.dev> writes:
>>> 
>>>>     I think we should start porting the PetscFE infrastructure, numerical integrations, vector and matrix assembly to GPUs soon. It is dog slow on CPUs and should be able to deliver higher performance on GPUs. 
>>> 
>>> IMO, this comes via interfaces to libCEED, not rolling yet another way to invoke quadrature routines on GPUs.
>> 
>>   I am not talking about matrix-free stuff, that definitely belongs in libCEED, no reason to rewrite. 
>> 
>>   But does libCEED also support the traditional finite element construction process where the matrices are built explicitly? Or does it provide some of the code, integration points, integration formulas, etc. that could be shared and used as a starting point? If it includes all of these "traditional" things then we should definitely get it all hooked into PetscFE/DMPLEX and go to town. (But yes, not so much need for the GPU hackathon since it is wiring more than GPU code.) The way I have always heard about libCEED was as a matrix-free engine, so I may have misunderstood. It is definitely not my intention to start a project that reproduces functionality that we can just use. 
> 
> MFEM wants this too and it's in a draft libCEED PR right now. My intent is to ensure it's compatible with Stefano's split-phase COO assembly. 

  Cool, would this be something that, in combination with perhaps some libCEED folk, could be incorporated in the Hackathon? Anyone can join our Hackathon group; they don't have to have any financial connection with "PETSc". 

> 
>>   We do need solid support for traditional finite element assembly on GPUs; matrix-free finite elements alone are not enough.
> 
> Agreed, and while libCEED could be further optimized for lowest order, even naive assembly will be faster than what's in DMPlex.
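
For readers unfamiliar with the split-phase COO assembly mentioned above, here is a minimal sketch of the general idea in plain Python. It is not PETSc's or libCEED's code: the class `ToyCooMatrix` and its methods `set_pattern` and `set_values` are hypothetical names made up for illustration. The assumption is that the (row, column) list is supplied once in a symbolic phase, after which each numeric phase only passes a flat array of values in the same order, with duplicates summed.

```python
class ToyCooMatrix:
    """Toy CSR matrix filled in two phases (hypothetical, not a PETSc API)."""

    def __init__(self, nrows):
        self.nrows = nrows

    def set_pattern(self, rows, cols):
        # Phase 1 (symbolic, done once): build the CSR structure and a map
        # from each COO entry to its slot in the CSR value array.
        unique = sorted(set(zip(rows, cols)))
        slot = {ij: k for k, ij in enumerate(unique)}
        self.indptr = [0] * (self.nrows + 1)
        for i, _ in unique:
            self.indptr[i + 1] += 1
        for i in range(self.nrows):
            self.indptr[i + 1] += self.indptr[i]
        self.indices = [j for _, j in unique]
        self.coo_to_csr = [slot[ij] for ij in zip(rows, cols)]
        self.data = [0.0] * len(unique)

    def set_values(self, vals):
        # Phase 2 (numeric, repeated): no searching or allocation, only a
        # scatter-add of the values through the precomputed map.
        self.data = [0.0] * len(self.data)
        for k, v in zip(self.coo_to_csr, vals):
            self.data[k] += v


# Two 1D linear elements on nodes (0, 1) and (1, 2); each contributes
# the element matrix [[1, -1], [-1, 1]], listed entry by entry.
rows = [0, 0, 1, 1, 1, 1, 2, 2]
cols = [0, 1, 0, 1, 1, 2, 1, 2]
vals = [1.0, -1.0, -1.0, 1.0, 1.0, -1.0, -1.0, 1.0]

A = ToyCooMatrix(3)
A.set_pattern(rows, cols)   # once
A.set_values(vals)          # every time the element matrices change
print(A.indptr, A.indices, A.data)
```

The numeric phase is a flat scatter-add over all element entries, which is the part that would map naturally onto a GPU kernel; the toy loop here is serial and only shows the data flow.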


