[petsc-users] Matvecs and KSPSolves with multiple vectors

Sreeram R Venkat srvenkat at utexas.edu
Thu Dec 14 13:02:04 CST 2023


Hello Pierre,

Thank you for your reply. I tried the HPDDM CG as you suggested, and it
seems to be doing the batched solves, but the KSP is not converging because
a NaN or Inf is being generated. I also noticed many host-to-device and
device-to-host copies of the matrices (the non-batched KSP solve did not
have any such copies). I have attached dump.0 again. Could you please take
a look?

Thanks,
Sreeram

On Thu, Dec 14, 2023 at 12:42 AM Pierre Jolivet <pierre at joliv.et> wrote:

> Hello Sreeram,
> KSPCG (the native PETSc implementation of CG) does not handle solves with
> multiple columns at once.
> There is only a single native PETSc KSP implementation that handles
> solves with multiple columns at once: KSPPREONLY.
> If you use --download-hpddm, you can use a CG (or GMRES, or more advanced
> methods) implementation which handles solves with multiple columns at once
> (via -ksp_type hpddm -ksp_hpddm_type cg or KSPSetType(ksp, KSPHPDDM);
> KSPHPDDMSetType(ksp, KSP_HPDDM_TYPE_CG);).
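>
> A minimal sketch of the programmatic setup (assuming a KSP named ksp and
> dense right-hand side/solution matrices B and X have already been created):
>
>   KSPSetType(ksp, KSPHPDDM);
>   KSPHPDDMSetType(ksp, KSP_HPDDM_TYPE_CG);
>   KSPSetFromOptions(ksp);   /* so -ksp_hpddm_* command-line options are still honored */
>   KSPMatSolve(ksp, B, X);   /* solves for all columns of B in one go */
>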
> I’m the main author of HPDDM; there is preliminary support for device
> matrices, but if it’s not working as intended or not faster than column by
> column, I’d be happy to take a deeper look (maybe in private), because most
> (if not all) of my users interested in (pseudo-)block Krylov solvers (i.e.,
> solvers that treat all right-hand sides in a single go) are using plain host
> matrices.
>
> Thanks,
> Pierre
>
> PS: you could have a look at
> https://www.sciencedirect.com/science/article/abs/pii/S0898122121000055 to
> understand the philosophy behind block iterative methods in PETSc (and in
> HPDDM). src/mat/tests/ex237.c, the benchmark I mentioned earlier, was
> developed in the context of this paper to produce Figures 2-3. Note that
> the paper is now slightly outdated: since then, PCHYPRE and PCMG (among
> others) have been made “PCMatApply()-ready”.
>
> On 13 Dec 2023, at 11:05 PM, Sreeram R Venkat <srvenkat at utexas.edu> wrote:
>
> Hello Pierre,
>
> I am trying out the KSPMatSolve with the BoomerAMG preconditioner.
> However, I am noticing that it is still solving column by column (this is
> stated explicitly in the attached info dump). I looked at the code for
> KSPMatSolve_Private() and saw that as long as ksp->ops->matsolve is
> non-NULL, it should do the batched solve, though I'm not sure where that
> gets set.
>
> I am using the options -pc_type hypre -pc_hypre_type boomeramg when
> running the code.
>
> Can you please help me with this?
>
> Thanks,
> Sreeram
>
>
> On Thu, Dec 7, 2023 at 4:04 PM Mark Adams <mfadams at lbl.gov> wrote:
>
>> N.B., the AMGX interface is a bit experimental.
>> Mark
>>
>> On Thu, Dec 7, 2023 at 4:11 PM Sreeram R Venkat <srvenkat at utexas.edu>
>> wrote:
>>
>>> Oh, in that case I will try out BoomerAMG. Getting AMGX to build
>>> correctly was also tricky, so hopefully the HYPRE build will be easier.
>>>
>>> Thanks,
>>> Sreeram
>>>
>>> On Thu, Dec 7, 2023, 3:03 PM Pierre Jolivet <pierre at joliv.et> wrote:
>>>
>>>>
>>>>
>>>> On 7 Dec 2023, at 9:37 PM, Sreeram R Venkat <srvenkat at utexas.edu>
>>>> wrote:
>>>>
>>>> Thank you Barry and Pierre; I will proceed with the first option.
>>>>
>>>> I want to use the AMGX preconditioner for the KSP. I will try it out
>>>> and see how it performs.
>>>>
>>>>
>>>> Just FYI, AMGX does not handle systems with multiple RHS, and thus has
>>>> no PCMatApply() implementation.
>>>> BoomerAMG does, and there is a PCMatApply_HYPRE_BoomerAMG()
>>>> implementation.
>>>> But let us know if you need assistance figuring things out.
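>>>>
>>>> A minimal sketch of that setup (assuming a KSP named ksp and dense
>>>> right-hand side/solution matrices B and X already exist):
>>>>
>>>>   PC pc;
>>>>   KSPGetPC(ksp, &pc);
>>>>   PCSetType(pc, PCHYPRE);
>>>>   PCHYPRESetType(pc, "boomeramg"); /* same as -pc_type hypre -pc_hypre_type boomeramg */
>>>>   KSPMatSolve(ksp, B, X);          /* the block of right-hand sides goes through PCMatApply() */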
>>>>
>>>> Thanks,
>>>> Pierre
>>>>
>>>> Thanks,
>>>> Sreeram
>>>>
>>>> On Thu, Dec 7, 2023 at 2:02 PM Pierre Jolivet <pierre at joliv.et> wrote:
>>>>
>>>>> To expand on Barry’s answer, we have observed repeatedly that
>>>>> MatMatMult with MatAIJ performs better than MatMult with MatMAIJ; you can
>>>>> reproduce this on your own with
>>>>> https://petsc.org/release/src/mat/tests/ex237.c.html.
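>>>>>
>>>>> Concretely, the two code paths being compared are roughly the following
>>>>> (a sketch, assuming A is a MatAIJ of size n x n, V and W are dense
>>>>> matrices with m columns, and v, w are vectors of size n*m laid out as
>>>>> MAIJ expects):
>>>>>
>>>>>   /* block path: one sparse-times-dense product */
>>>>>   MatMatMult(A, V, MAT_INITIAL_MATRIX, PETSC_DEFAULT, &W);
>>>>>   /* MAIJ path: wrap A and do a single MatMult on the interleaved vector */
>>>>>   Mat Am;
>>>>>   MatCreateMAIJ(A, m, &Am);
>>>>>   MatMult(Am, v, w);
>>>>>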
>>>>> Also, I’m guessing you are using some sort of preconditioner within
>>>>> your KSP.
>>>>> Not all are “KSPMatSolve-ready”, i.e., they may treat blocks of
>>>>> right-hand sides column by column, which is very inefficient.
>>>>> You could run your code with -info dump and send us dump.0 to see what
>>>>> needs to be done on our end to make things more efficient, should you not
>>>>> be satisfied with the current performance of the code.
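>>>>> For example (./my_app is just a placeholder for however you launch your
>>>>> code):
>>>>>
>>>>>   ./my_app <your usual options> -info dump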
>>>>>
>>>>> Thanks,
>>>>> Pierre
>>>>>
>>>>> On 7 Dec 2023, at 8:34 PM, Barry Smith <bsmith at petsc.dev> wrote:
>>>>>
>>>>>
>>>>>
>>>>> On Dec 7, 2023, at 1:17 PM, Sreeram R Venkat <srvenkat at utexas.edu>
>>>>> wrote:
>>>>>
>>>>> I have 2 sequential matrices M and R (both MATSEQAIJCUSPARSE of size n
>>>>> x n) and a vector v of size n*m, where v = [v_1, v_2, ..., v_m] and each
>>>>> v_i has size n. The data for v can be stored either in column-major or
>>>>> row-major order. Now, I want to do 2 types of operations:
>>>>>
>>>>> 1. Matvecs of the form M*v_i = w_i, for i = 1..m.
>>>>> 2. KSPSolves of the form R*x_i = v_i, for i = 1..m.
>>>>>
>>>>> From what I have read on the documentation, I can think of 2
>>>>> approaches.
>>>>>
>>>>> 1. Get the pointer to the data in v (column-major) and use it to
>>>>> create a dense matrix V. Then do a MatMatMult with M*V = W, and take the
>>>>> data pointer of W to create the vector w. For KSPSolves, use KSPMatSolve
>>>>> with R and V.
>>>>>
>>>>> 2. Create a MATMAIJ using M/R and use that for matvecs directly with
>>>>> the vector v. I don't know if KSPSolve with the MATMAIJ will know that it
>>>>> is a multiple RHS system and act accordingly.
>>>>>
>>>>> Which would be the more efficient option?
>>>>>
>>>>>
>>>>> Use 1.
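>>>>>
>>>>> A minimal sketch of option 1 (host-side for illustration; ksp is assumed
>>>>> to have been set up with R, error checking is omitted, and the CUDA dense
>>>>> variant, MatCreateDenseCUDA, would be used analogously for device data):
>>>>>
>>>>>   PetscScalar *varr;
>>>>>   Mat          V, W, X;
>>>>>   VecGetArray(v, &varr);                    /* column-major data of v */
>>>>>   MatCreateDense(PETSC_COMM_SELF, n, m, n, m, varr, &V);
>>>>>   MatMatMult(M, V, MAT_INITIAL_MATRIX, PETSC_DEFAULT, &W);  /* w_i = M*v_i */
>>>>>   MatDuplicate(V, MAT_DO_NOT_COPY_VALUES, &X);
>>>>>   KSPMatSolve(ksp, V, X);                   /* x_i such that R*x_i = v_i */
>>>>>   VecRestoreArray(v, &varr);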
>>>>>
>>>>>
>>>>> As a side-note, I am also wondering if there is a way to use row-major
>>>>> storage of the vector v.
>>>>>
>>>>>
>>>>> No
>>>>>
>>>>> The reason is that this could allow for more coalesced memory access
>>>>> when doing matvecs.
>>>>>
>>>>>
>>>>>   PETSc matrix-vector products use BLAS GEMV matrix-vector products
>>>>> for the computation, so in theory they should already be well optimized.
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Sreeram
>>>>>
>>>>>
>>>>>
>>>> <dump.0>
>
>
>