[petsc-users] Dense Matrix Factorization/Solve

Barry Smith bsmith at petsc.dev
Wed Jul 24 19:07:12 CDT 2024



> On Jul 24, 2024, at 5:33 PM, Sreeram R Venkat <srvenkat at utexas.edu> wrote:
> 
> Thanks for the suggestions; I will try them out.
> 
> Dense factorization is used as the benchmark for the Top500, right? That's why I thought there would be some state-of-the-art multi-GPU dense linear solvers out there.
> 
> I saw this library called cuSOLVERMp https://docs.nvidia.com/cuda/cusolvermp/ from NVIDIA. It looks somewhat difficult to integrate with other code, though.

   The PETSc ScaLAPACK interface could possibly be adapted to work with cuSOLVERMp, since their APIs are similar.
> 
> I also found this https://github.com/nv-legate/cunumeric from NVIDIA, which shows some good results for multi-GPU Cholesky, but I'm having some trouble getting it set up correctly.
> 
> On Wed, Jul 24, 2024, 12:08 PM Barry Smith <bsmith at petsc.dev> wrote:
>> 
>>    For one MPI rank, it looks like you can use -pc_type cholesky -pc_factor_mat_solver_type cupm, though it is not documented in https://petsc.org/release/overview/linear_solve_table/#direct-solvers
>> 
>>    Or, if you also ./configure --download-kokkos --download-kokkos-kernels, you can use -pc_factor_mat_solver_type kokkos. This may also work for multiple GPUs, but that is not documented in the table either (Junchao). Nor are the sparse Kokkos or CUDA solvers (if they exist) documented in the table.
>> 
>> 
>>    Barry
>> 
>> 
>> 
>>> On Jul 24, 2024, at 2:44 PM, Sreeram R Venkat <srvenkat at utexas.edu> wrote:
>>> 
>>> I have an SPD dense matrix of size NxN, where N can range from 10^4-10^5. Are there any Cholesky factorization/solve routines for it in PETSc (or in any of the external libraries)? If possible, I want to use GPU acceleration with 1 or more GPUs. The matrix type can be MATSEQDENSE/MATMPIDENSE or MATSEQDENSECUDA/MATMPIDENSECUDA accordingly. If it is possible to do the factorization beforehand and store it to do the triangular solves later, that would be great.
>>> 
>>> Thanks,
>>> Sreeram
>> 
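
To make the "factor beforehand, do the triangular solves later" pattern from the original question concrete, here is a minimal pure-Python sketch. It is only an illustration of the idea on a tiny matrix, not PETSc code and not suitable for N of 10^4 to 10^5; the function names cholesky_factor and cholesky_solve are hypothetical and are not part of PETSc or any library mentioned in this thread.

```python
import math


def cholesky_factor(a):
    """Return the lower-triangular factor L with a = L L^T.

    `a` is an SPD matrix stored as a list of rows (illustrative only).
    """
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Diagonal entry: what remains of a[j][j] after removing earlier columns.
        d = a[j][j] - sum(L[j][k] ** 2 for k in range(j))
        if d <= 0.0:
            raise ValueError("matrix is not positive definite")
        L[j][j] = math.sqrt(d)
        # Entries below the diagonal in column j.
        for i in range(j + 1, n):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = (a[i][j] - s) / L[j][j]
    return L


def cholesky_solve(L, b):
    """Solve (L L^T) x = b using the stored factor L."""
    n = len(L)
    # Forward substitution: L y = b.
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    # Backward substitution: L^T x = y.
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


# Factor once ...
A = [[4.0, 2.0, 2.0],
     [2.0, 5.0, 3.0],
     [2.0, 3.0, 6.0]]
L = cholesky_factor(A)

# ... then reuse the stored factor for several right-hand sides.
x1 = cholesky_solve(L, [8.0, 10.0, 11.0])
x2 = cholesky_solve(L, [4.0, 2.0, 2.0])
```

The factorization costs O(N^3) and is done once; each later solve is only two O(N^2) triangular sweeps, which is why storing the factor pays off when there are many right-hand sides.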
