[petsc-users] Dense Matrix Factorization/Solve

Sreeram R Venkat srvenkat at utexas.edu
Wed Jul 24 16:33:49 CDT 2024


Thanks for the suggestions; I will try them out.

Dense factorization is used as the benchmark for the Top500, right? That's why I
thought there would be some state-of-the-art multi GPU dense linear solvers
out there.

I saw this library called cuSOLVERMp
https://docs.nvidia.com/cuda/cusolvermp/ from NVIDIA. It looks somewhat
difficult to integrate with other code, though.

I also found this https://github.com/nv-legate/cunumeric from NVIDIA, which
shows some good results for multi GPU Cholesky, but I'm having some trouble
getting it set up correctly.

On Wed, Jul 24, 2024, 12:08 PM Barry Smith <bsmith at petsc.dev> wrote:

>
>    For one MPI rank, it looks like you can use -pc_type cholesky
> -pc_factor_mat_solver_type cupm though it is not documented in
> https://petsc.org/release/overview/linear_solve_table/#direct-solvers
>
>    Or, if you also ./configure with --download-kokkos --download-kokkos-kernels,
> you can use -pc_factor_mat_solver_type kokkos. This may also work for
> multiple GPUs, but that is not documented in the table either (Junchao).
> Nor are the sparse Kokkos or CUDA solvers (if they exist) documented in
> the table.
>
>
>    Barry
>
>
>
> On Jul 24, 2024, at 2:44 PM, Sreeram R Venkat <srvenkat at utexas.edu> wrote:
>
> I have an SPD dense matrix of size NxN, where N can range from 10^4-10^5.
> Are there any Cholesky factorization/solve routines for it in PETSc (or in
> any of the external libraries)? If possible, I want to use GPU acceleration
> with 1 or more GPUs. The matrix type can be MATSEQDENSE/MATMPIDENSE or
> MATSEQDENSECUDA/MATMPIDENSECUDA accordingly. If it is possible to do the
> factorization beforehand and store it to do the triangular solves later,
> that would be great.
>
> Thanks,
> Sreeram
>
>
>
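
For readers following the thread, the "factor once, reuse for later triangular solves" pattern asked about in the original question can be sketched in a few lines. This is only a small pure-Python illustration of the idea on a 3x3 SPD matrix, not PETSc code and not a GPU implementation; the function names cholesky_factor and cholesky_solve are made up for this example.

```python
import math


def cholesky_factor(a):
    """Return the lower-triangular L with a = L L^T for an SPD matrix a."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Diagonal entry: what is left of a[j][j] after removing earlier columns.
        s = a[j][j] - sum(L[j][k] ** 2 for k in range(j))
        if s <= 0.0:
            raise ValueError("matrix is not positive definite")
        L[j][j] = math.sqrt(s)
        # Entries below the diagonal in column j.
        for i in range(j + 1, n):
            t = a[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = t / L[j][j]
    return L


def cholesky_solve(L, b):
    """Solve (L L^T) x = b using a stored factor L: two triangular solves."""
    n = len(L)
    # Forward substitution: L y = b.
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    # Backward substitution: L^T x = y.
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


# Small SPD example matrix (hypothetical data, chosen for easy checking).
A = [[4.0, 2.0, 2.0],
     [2.0, 5.0, 3.0],
     [2.0, 3.0, 6.0]]

L = cholesky_factor(A)                      # expensive step, done once
x1 = cholesky_solve(L, [8.0, 10.0, 11.0])   # cheap solve, first right-hand side
x2 = cholesky_solve(L, [4.0, 2.0, 2.0])     # cheap solve, reusing the same factor
```

The factorization costs O(N^3) while each later solve costs O(N^2), which is why storing the factor pays off when there are many right-hand sides. In PETSc, the same reuse normally comes from keeping one KSP/PC object alive across solves while the matrix is unchanged.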