[petsc-dev] Argonne GPU Virtual Hackathon - Accepted

Sat Mar 13 03:04:34 CST 2021

Another thing perhaps of interest is the stencil-based GPU matrix assembly functionality that Mark introduced.

> On 13.03.2021, at 07:59, Stefano Zampini <stefano.zampini at gmail.com> wrote:
> 
> The COO assembly is entirely based on thrust primitives. I don’t have enough experience to say whether we would get a serious speedup by writing our own kernels, but it is definitely worth a try if we end up adopting COO as the entry point for GPU irregular assembly.
> Jed, you mentioned BDDC deluxe, what do you mean by that? Porting setup/application of deluxe scaling onto GPU?
> 
> The timing is not so bad for me to join the hackathon.
> 
>> On Mar 13, 2021, at 8:17 AM, Barry Smith <bsmith at petsc.dev> wrote:
>> 
>> 
>> 
>>> On Mar 12, 2021, at 10:49 PM, Jed Brown <jed at jedbrown.org> wrote:
>>> 
>>> Barry Smith <bsmith at petsc.dev> writes:
>>> 
>>>>> On Mar 12, 2021, at 6:58 PM, Jed Brown <jed at jedbrown.org> wrote:
>>>>> 
>>>>> Barry Smith <bsmith at petsc.dev> writes:
>>>>> 
>>>>>>    I think we should start porting the PetscFE infrastructure, numerical integrations, vector and matrix assembly to GPUs soon. It is dog slow on CPUs and should be able to deliver higher performance on GPUs. 
>>>>> 
>>>>> IMO, this comes via interfaces to libCEED, not rolling yet another way to invoke quadrature routines on GPUs.
>>>> 
>>>>  I am not talking about matrix-free stuff, that definitely belongs in libCEED, no reason to rewrite. 
>>>> 
>>>>  But does libCEED also support the traditional finite element construction process where the matrices are built explicitly? Or does it provide some of the code, integration points, integration formulas, etc. that could be shared and used as a starting point? If it includes all of these "traditional" things then we should definitely get it all hooked into PetscFE/DMPLEX and go to town. (But yes, not so much need for the GPU hackathon, since it is more wiring than GPU code.) The way I have always heard about libCEED was as a matrix-free engine, so I may have misunderstood. It is definitely not my intention to start a project that reproduces functionality that we can just use.
>>> 
>>> MFEM wants this too and it's in a draft libCEED PR right now. My intent is to ensure it's compatible with Stefano's split-phase COO assembly. 
>> 
>>  Cool, would this be something that, in combination with perhaps some libCEED folk, could be incorporated in the Hackathon? Anyone can join our Hackathon group; they don't have to have any financial connection with "PETSc".
>> 
>>> 
>>>>  We do need solid support for traditional finite element assembly on GPUs; matrix-free finite elements alone are not enough.
>>> 
>>> Agreed, and while libCEED could be further optimized for lowest order, even naive assembly will be faster than what's in DMPlex.
> 
