[petsc-dev] MatShell with PETSc solvers using GPU
Han Tran
hantran at cs.utah.edu
Wed Dec 13 09:51:27 CST 2023
@Jed: thank you for your answer!
@Barry: yes, I am thinking of CUDA Fortran.
Thank you,
-Han
> On Dec 12, 2023, at 6:41 PM, Barry Smith <bsmith at petsc.dev> wrote:
>
>
> Are you thinking CUDA Fortran or some other "Fortran but running on the GPU"?
>
>
>> On Dec 12, 2023, at 8:11 PM, Jed Brown <jed at jedbrown.org> wrote:
>>
>> Han Tran <hantran at cs.utah.edu> writes:
>>
>>> Hi Jed,
>>>
>>> Thank you for your answer. I have not had a chance to work on this since I asked. I have some follow-up questions.
>>>
>>> (1) The PETSc manual page https://petsc.org/release/manualpages/Vec/VecGetArrayAndMemType/ shows that both VecGetArrayAndMemType() and VecGetArrayReadAndMemType() do not have Fortran support. Has PETSc added Fortran support for these functions since then?
>>
>> It doesn't look like it. I think one would mirror the VecGetArrayF90 implementation.
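>>
>> Until then, one possible workaround is a tiny C shim that forwards to the C function, which you could then bind from Fortran (an untested sketch; the shim name is made up, and the Fortran-side interface still has to be written, e.g., with ISO_C_BINDING):
>>
>>   #include <petscvec.h>
>>
>>   /* C shim exposing VecGetArrayReadAndMemType() to Fortran callers */
>>   PETSC_EXTERN PetscErrorCode ShimVecGetArrayReadAndMemType(Vec x, const PetscScalar **a, PetscMemType *mtype)
>>   {
>>     PetscFunctionBeginUser;
>>     PetscCall(VecGetArrayReadAndMemType(x, a, mtype));
>>     PetscFunctionReturn(PETSC_SUCCESS);
>>   }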
>>
>>> (2) My user-defined MatMult(A, u, v) already handles the communication, i.e., the result vector v = A*u is computed with all needed communication. Thus, I do not quite understand your suggestion to “use DMGlobalToLocalBegin/End()…”. Do I need these functions even if the communication is already done by my user-defined MatMult()?
>>
>> If you do your own communication, you don't need to use DMGlobalToLocalBegin/End.
>>
>>>
>>> Thank you,
>>> Han
>>>
>>>> On Nov 4, 2022, at 9:06 AM, Jed Brown <jed at jedbrown.org> wrote:
>>>>
>>>> Yes, this is supported. You can use VecGetArrayAndMemType() to get access to device memory. You'll often use DMGlobalToLocalBegin/End() or VecScatter to communicate, but that will use GPU-aware MPI if your Vec is a device vector.
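>>>>
>>>> For a shell MatMult that works on device data without host transfers, a minimal sketch in C (untested; the kernel launch is a placeholder, and it assumes the shell matrix's vector type was set to a device type, e.g., with MatShellSetVecType(A, VECCUDA), so the solver hands you device vectors):
>>>>
>>>>   #include <petscmat.h>
>>>>
>>>>   static PetscErrorCode MyMatMult(Mat A, Vec u, Vec v)
>>>>   {
>>>>     const PetscScalar *ua;
>>>>     PetscScalar       *va;
>>>>     PetscMemType       umtype, vmtype;
>>>>
>>>>     PetscFunctionBeginUser;
>>>>     /* These return raw device pointers when u and v live on the GPU */
>>>>     PetscCall(VecGetArrayReadAndMemType(u, &ua, &umtype));
>>>>     PetscCall(VecGetArrayWriteAndMemType(v, &va, &vmtype));
>>>>     if (PetscMemTypeDevice(umtype)) {
>>>>       /* launch your device kernel computing va = A*ua; no host/device copies */
>>>>     } else {
>>>>       /* host fallback, e.g., a plain CPU loop */
>>>>     }
>>>>     PetscCall(VecRestoreArrayWriteAndMemType(v, &va));
>>>>     PetscCall(VecRestoreArrayReadAndMemType(u, &ua));
>>>>     PetscFunctionReturn(PETSC_SUCCESS);
>>>>   }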
>>>>
>>>> Han Tran <hantran at cs.utah.edu> writes:
>>>>
>>>>> Hi,
>>>>>
>>>>> I am aware that PETSc recently added support for solvers on GPUs. I wonder whether PETSc supports MatShell with GPU solvers, i.e., I have a user-defined MatMult() function residing on the device, and I want to use MatShell directly with the PETSc GPU solvers without any transfers back and forth between host and device. If this is possible, could you let me know how to do it (an example, if available, would be much appreciated)? Concretely, I have in mind a setup like the sketch below.
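>>>>>
>>>>> A rough sketch of what I intend (MyMatMult is my device-resident routine; the sizes, context, and ksp are placeholders):
>>>>>
>>>>>   Mat A;
>>>>>   PetscCall(MatCreateShell(PETSC_COMM_WORLD, m, n, M, N, (void *)ctx, &A));
>>>>>   PetscCall(MatShellSetOperation(A, MATOP_MULT, (void (*)(void))MyMatMult));
>>>>>   PetscCall(KSPSetOperators(ksp, A, A));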
>>>>>
>>>>> Thank you!
>>>>> Han