[petsc-users] KSP on GPU

Stefano Zampini stefano.zampini at gmail.com
Tue Nov 1 11:11:23 CDT 2022


Are you calling VecRestoreArray when you are done inserting the values?
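
For reference, the pattern should look something like this (a minimal
sketch; the fill loop is a placeholder for your actual gradient
computation):

  PetscScalar *df;
  PetscInt    i, n;
  PetscCall(VecGetLocalSize(dfdx, &n));
  PetscCall(VecGetArray(dfdx, &df));      /* borrow a host pointer */
  for (i = 0; i < n; i++) df[i] = 0.0;    /* placeholder: insert values */
  PetscCall(VecRestoreArray(dfdx, &df));  /* return it; for GPU vector types
                                             this marks the host copy as
                                             current so PETSc copies it to
                                             the device when needed */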

On Tue, Nov 1, 2022, 18:42 Carl-Johan Thore via petsc-users <
petsc-users at mcs.anl.gov> wrote:

> Thanks for the tips!
>
> The suggested settings for GAMG did not yield better results, but hypre
> worked well right away, giving very good convergence!
>
> A follow-up question then (I hope that's ok; it could also be related to
> GAMG not working, I'll check that). Once everything was running I
> discovered that my gradient vector dfdx, which I populate via an array df
> obtained from VecGetArray(dfdx, &df), doesn't get filled properly; it
> always contains only zeros. This is not the case when I run on the CPU,
> and df itself gets filled as it should even on the GPU, which suggests
> that either I'm not using VecGetArray properly, or I shouldn't use it at
> all for GPU computations?
>
> Kind regards,
>
> Carl-Johan
>
> From: Mark Adams <mfadams at lbl.gov>
> Sent: 31 October 2022 13:30
> To: Carl-Johan Thore <carl-johan.thore at liu.se>
> Cc: Matthew Knepley <knepley at gmail.com>; Barry Smith <bsmith at petsc.dev>; petsc-users at mcs.anl.gov
> Subject: Re: [petsc-users] KSP on GPU
>
> * You could try hypre or another preconditioner that you can afford and
> that works, like LU or ASM.
>
> * If this matrix is SPD, you want to use
>     -fieldsplit_0_pc_gamg_esteig_ksp_type cg
>     -fieldsplit_0_pc_gamg_esteig_ksp_max_it 10
>   These will give better eigen estimates, and that is important. The
>   difference between these estimates is not too bad. There is a safety
>   factor (1.05 is the default) that you could increase with
>   -fieldsplit_0_mg_levels_ksp_chebyshev_esteig 0,0.05,0,1.1
>
> * Finally, you could try -fieldsplit_0_pc_gamg_reuse_interpolation 1 if
> GAMG is still not working. (All of these options are collected in the
> sketch below.)
>
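> If you end up trying all of these at once, the run line might look like
> this (a sketch; ./your_app stands in for your executable and its usual
> options):
>
>   ./your_app <usual options> \
>     -fieldsplit_0_pc_gamg_esteig_ksp_type cg \
>     -fieldsplit_0_pc_gamg_esteig_ksp_max_it 10 \
>     -fieldsplit_0_mg_levels_ksp_chebyshev_esteig 0,0.05,0,1.1 \
>     -fieldsplit_0_pc_gamg_reuse_interpolation 1
>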
> Use -fieldsplit_0_ksp_converged_reason and check the iteration count.
> And it is a good idea to check with hypre to make sure something is not
> going badly in terms of performance anyway. AMG is hard and hypre is a
> good solver.
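>
> The programmatic equivalent, if you want to react to a failed solve in
> the code, is something like this (a minimal sketch assuming the solver
> object is called ksp):
>
>   KSPConvergedReason reason;
>   PetscCall(KSPGetConvergedReason(ksp, &reason));
>   if (reason < 0)
>     PetscCall(PetscPrintf(PETSC_COMM_WORLD, "KSP diverged: %s\n",
>                           KSPConvergedReasons[reason]));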
>
> Mark
>
> On Mon, Oct 31, 2022 at 1:56 AM Carl-Johan Thore via petsc-users
> <petsc-users at mcs.anl.gov> wrote:
>
> The GPU supports double precision and I didn't explicitly tell PETSc to
> use float when compiling, so I guess it uses double? What's the easiest
> way to check?
>
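> (One quick check from code, as a minimal sketch: PetscScalar has the
> size of the precision PETSc was built with, so
>
>   PetscPrintf(PETSC_COMM_WORLD, "sizeof(PetscScalar) = %d\n",
>               (int)sizeof(PetscScalar));
>
> prints 8 for a standard double-precision real build and 4 for single;
> alternatively, grep for PETSC_USE_REAL in
> $PETSC_DIR/$PETSC_ARCH/include/petscconf.h.)
>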
> Barry, running -ksp_view shows that the solver options are the same for
> CPU and GPU. The only difference is the coarse grid solver for gamg
> ("the package used to perform factorization:"), which is petsc for CPU
> and cusparse for GPU. I tried forcing the GPU to use petsc via
> -fieldsplit_0_mg_coarse_sub_pc_factor_mat_solver_type, but then ksp
> failed to converge even on the first topology optimization iteration.
>
> -ksp_view also shows differences in the eigenvalues from the Chebyshev
> smoother. For example,
>
> GPU:
>
>    Down solver (pre-smoother) on level 2 -------------------------------
>          KSP Object: (fieldsplit_0_mg_levels_2_) 1 MPI process
>            type: chebyshev
>              eigenvalue targets used: min 0.109245, max 1.2017
>              eigenvalues provided (min 0.889134, max 1.09245) with
>
> CPU:
>
>              eigenvalue targets used: min 0.112623, max 1.23886
>              eigenvalues provided (min 0.879582, max 1.12623)
>
> But I guess such differences are expected?
>
> /Carl-Johan
>
> From: Matthew Knepley <knepley at gmail.com>
> Sent: 30 October 2022 22:00
> To: Barry Smith <bsmith at petsc.dev>
> Cc: Carl-Johan Thore <carl-johan.thore at liu.se>; petsc-users at mcs.anl.gov
> Subject: Re: [petsc-users] KSP on GPU
>
> On Sun, Oct 30, 2022 at 3:52 PM Barry Smith <bsmith at petsc.dev> wrote:
>
> In general you should expect similar but not identical convergence
> behavior.
>
> I suggest running with all the monitoring you can:
> -ksp_monitor_true_residual, -fieldsplit_0_monitor_true_residual,
> -fieldsplit_1_monitor_true_residual, and compare the convergence between
> the CPU and GPU. Also run with -ksp_view and check that the various
> solver options are the same (they should be).
>
> Is the GPU using float or double?
>
>    Matt
>
>   Barry
>
> On Oct 30, 2022, at 11:02 AM, Carl-Johan Thore via petsc-users
> <petsc-users at mcs.anl.gov> wrote:
>
> Hi,
>
> I'm solving a topology optimization problem with Stokes flow discretized
> by a stabilized Q1-Q0 finite element method, using BiCGStab with the
> fieldsplit preconditioner to solve the linear systems. The implementation
> is based on DMStag, runs on Ubuntu via WSL2, and works fine with
> PETSc 3.18.1 on multiple CPU cores and the following options for the
> preconditioner:
>
> -fieldsplit_0_ksp_type preonly \
> -fieldsplit_0_pc_type gamg \
> -fieldsplit_0_pc_gamg_reuse_interpolation 0 \
> -fieldsplit_1_ksp_type preonly \
> -fieldsplit_1_pc_type jacobi
>
> However, when I enable GPU computations by adding two options -
>
> ...
> -dm_vec_type cuda \
> -dm_mat_type aijcusparse \
> -fieldsplit_0_ksp_type preonly \
> -fieldsplit_0_pc_type gamg \
> -fieldsplit_0_pc_gamg_reuse_interpolation 0 \
> -fieldsplit_1_ksp_type preonly \
> -fieldsplit_1_pc_type jacobi
>
> - KSP still works fine for the first couple of topology optimization
> iterations but then stops with "Linear solve did not converge due to
> DIVERGED_DTOL ...".
>
> My question is whether I should expect the GPU versions of the linear
> solvers and preconditioners to function exactly as their CPU counterparts
> (I got this impression from the documentation), in which case I've
> probably made some mistake in my own code, or whether there are
> other/additional settings or modifications I should use to run on the GPU
> (an NVIDIA Quadro T2000)?
>
> Kind regards,
>
> Carl-Johan
>
> --
>
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
>