[petsc-users] Matvecs and KSPSolves with multiple vectors

Thu Dec 14 16:45:00 CST 2023

Thanks, I will try to create a minimal reproducible example. This may take
me some time though, as I need to figure out how to extract only the
relevant parts (the full program this solve is used in is getting quite
complex).

I'll also try out some of the BoomerAMG options to see if that helps.

Thanks,
Sreeram

On Thu, Dec 14, 2023, 1:12 PM Pierre Jolivet <pierre at joliv.et> wrote:

>
>
> On 14 Dec 2023, at 8:02 PM, Sreeram R Venkat <srvenkat at utexas.edu> wrote:
>
> Hello Pierre,
>
> Thank you for your reply. I tried out the HPDDM CG as you said, and it
> seems to be doing the batched solves, but the KSP is not converging due to
> a NaN or Inf being generated. I also noticed there are a lot of
> host-to-device and device-to-host copies of the matrices (the non-batched
> KSP solve did not have any memcopies). I have attached dump.0 again. Could
> you please take a look?
>
>
> Yes, but you’d need to send me something I can run with your set of
> options (if you are more comfortable doing this in private, you can remove
> the list from Cc).
> Not all BoomerAMG smoothers handle blocks of right-hand sides, and there
> is not much error checking, so instead of erroring out, the solve may
> silently produce garbage, which could be what you are seeing.
>
> Thanks,
> Pierre
>
> Thanks,
> Sreeram
>
> On Thu, Dec 14, 2023 at 12:42 AM Pierre Jolivet <pierre at joliv.et> wrote:
>
>> Hello Sreeram,
>> KSPCG (PETSc implementation of CG) does not handle solves with multiple
>> columns at once.
>> There is only a single native PETSc KSP implementation which handles
>> solves with multiple columns at once: KSPPREONLY.
>> If you use --download-hpddm, you can use a CG (or GMRES, or more advanced
>> methods) implementation which handles solves with multiple columns at once
>> (via -ksp_type hpddm -ksp_hpddm_type cg or KSPSetType(ksp, KSPHPDDM);
>> KSPHPDDMSetType(ksp, KSP_HPDDM_TYPE_CG);).
>> I’m the main author of HPDDM. There is preliminary support for device
>> matrices, but if it’s not working as intended/not faster than column by
>> column, I’d be happy to have a deeper look (maybe in private), because most
>> (if not all) of my users interested in (pseudo-)block Krylov solvers (i.e.,
>> solvers that treat right-hand sides in a single go) are using plain host
>> matrices.
>>
>> Thanks,
>> Pierre
>>
>> PS: you could have a look at
>> https://www.sciencedirect.com/science/article/abs/pii/S0898122121000055 to
>> understand the philosophy behind block iterative methods in PETSc (and in
>> HPDDM). src/mat/tests/ex237.c, the benchmark I mentioned earlier, was
>> developed in the context of this paper to produce Figures 2-3. Note that
>> this paper is now slightly outdated: since then, PCHYPRE and PCMG (among
>> others) have been made “PCMatApply()-ready”.
>>
>> On 13 Dec 2023, at 11:05 PM, Sreeram R Venkat <srvenkat at utexas.edu>
>> wrote:
>>
>> Hello Pierre,
>>
>> I am trying out the KSPMatSolve with the BoomerAMG preconditioner.
>> However, I am noticing that it is still solving column by column (this is
>> stated explicitly in the info dump attached). I looked at the code for
>> KSPMatSolve_Private() and saw that as long as ksp->ops->matsolve is
>> non-NULL, it should do the batched solve, though I'm not sure where that
>> gets set.
>>
>> I am using the options -pc_type hypre -pc_hypre_type boomeramg when
>> running the code.
>>
>> Can you please help me with this?
>>
>> Thanks,
>> Sreeram
>>
>>
>> On Thu, Dec 7, 2023 at 4:04 PM Mark Adams <mfadams at lbl.gov> wrote:
>>
>>> N.B., AMGX interface is a bit experimental.
>>> Mark
>>>
>>> On Thu, Dec 7, 2023 at 4:11 PM Sreeram R Venkat <srvenkat at utexas.edu>
>>> wrote:
>>>
>>>> Oh, in that case I will try out BoomerAMG. Getting AMGX to build
>>>> correctly was also tricky so hopefully the HYPRE build will be easier.
>>>>
>>>> Thanks,
>>>> Sreeram
>>>>
>>>> On Thu, Dec 7, 2023, 3:03 PM Pierre Jolivet <pierre at joliv.et> wrote:
>>>>
>>>>>
>>>>>
>>>>> On 7 Dec 2023, at 9:37 PM, Sreeram R Venkat <srvenkat at utexas.edu>
>>>>> wrote:
>>>>>
>>>>> Thank you Barry and Pierre; I will proceed with the first option.
>>>>>
>>>>> I want to use the AMGX preconditioner for the KSP. I will try it out
>>>>> and see how it performs.
>>>>>
>>>>>
>>>>> Just FYI, AMGX does not handle systems with multiple RHS, and thus has
>>>>> no PCMatApply() implementation.
>>>>> BoomerAMG does, and there is a PCMatApply_HYPRE_BoomerAMG()
>>>>> implementation.
>>>>> But let us know if you need assistance figuring things out.
>>>>>
>>>>> Thanks,
>>>>> Pierre
>>>>>
>>>>> Thanks,
>>>>> Sreeram
>>>>>
>>>>> On Thu, Dec 7, 2023 at 2:02 PM Pierre Jolivet <pierre at joliv.et> wrote:
>>>>>
>>>>>> To expand on Barry’s answer, we have observed repeatedly that
>>>>>> MatMatMult with MatAIJ performs better than MatMult with MatMAIJ; you can
>>>>>> reproduce this on your own with
>>>>>> https://petsc.org/release/src/mat/tests/ex237.c.html.
>>>>>> Also, I’m guessing you are using some sort of preconditioner within
>>>>>> your KSP.
>>>>>> Not all are “KSPMatSolve-ready”, i.e., they may treat blocks of
>>>>>> right-hand sides column by column, which is very inefficient.
>>>>>> You could run your code with -info dump and send us dump.0 to see
>>>>>> what needs to be done on our end to make things more efficient, should you
>>>>>> not be satisfied with the current performance of the code.
>>>>>>
>>>>>> Thanks,
>>>>>> Pierre
>>>>>>
>>>>>> On 7 Dec 2023, at 8:34 PM, Barry Smith <bsmith at petsc.dev> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Dec 7, 2023, at 1:17 PM, Sreeram R Venkat <srvenkat at utexas.edu>
>>>>>> wrote:
>>>>>>
>>>>>> I have 2 sequential matrices M and R (both MATSEQAIJCUSPARSE of size
>>>>>> n x n) and a vector v of size n*m. v = [v_1, v_2, ..., v_m] where v_i has
>>>>>> size n. The data for v can be stored either in column-major or row-major
>>>>>> order.  Now, I want to do 2 types of operations:
>>>>>>
>>>>>> 1. Matvecs of the form M*v_i = w_i, for i = 1..m.
>>>>>> 2. KSPSolves of the form R*x_i = v_i, for i = 1..m.
>>>>>>
>>>>>> From what I have read in the documentation, I can think of 2
>>>>>> approaches.
>>>>>>
>>>>>> 1. Get the pointer to the data in v (column-major) and use it to
>>>>>> create a dense matrix V. Then do a MatMatMult with M*V = W, and take the
>>>>>> data pointer of W to create the vector w. For KSPSolves, use KSPMatSolve
>>>>>> with R and V.
>>>>>>
>>>>>> 2. Create a MATMAIJ using M/R and use that for matvecs directly with
>>>>>> the vector v. I don't know if KSPSolve with the MATMAIJ will know that it
>>>>>> is a multiple RHS system and act accordingly.
>>>>>>
>>>>>> Which would be the more efficient option?
>>>>>>
>>>>>>
>>>>>> Use 1.
>>>>>>
>>>>>>
>>>>>> As a side-note, I am also wondering if there is a way to use
>>>>>> row-major storage of the vector v.
>>>>>>
>>>>>>
>>>>>> No
>>>>>>
>>>>>> The reason is that this could allow for more coalesced memory access
>>>>>> when doing matvecs.
>>>>>>
>>>>>>
>>>>>>   PETSc matrix-vector products use BLAS GEMV matrix-vector products
>>>>>> for the computation, so in theory they should already be well optimized.
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Sreeram
>>>>>>
>>>>>>
>>>>>>
>>>>> <dump.0>
>>
>>
>> <dump.0>
>
>
>
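The first approach discussed in the thread (view the column-major data of v as an n x m dense matrix V and compute M*V = W in one call) can be illustrated without PETSc. The sketch below is plain C and only shows the indexing: CsrMatrix, csr_matvec and csr_matmat_colmajor are hypothetical names made up for this example, not PETSc types or functions. In PETSc itself, the wrapping would presumably be done with a dense-matrix constructor that accepts a user-provided array (for instance MatCreateSeqDense()), followed by MatMatMult() and KSPMatSolve(); that mapping is an assumption about how one would apply the advice, not something spelled out in the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CSR container for an n x n sparse matrix (a stand-in for M;
   this is not a PETSc type). */
typedef struct {
  size_t        n;
  const size_t *row_ptr; /* length n + 1 */
  const size_t *col_idx; /* length row_ptr[n] */
  const double *val;     /* length row_ptr[n] */
} CsrMatrix;

/* Single matvec: w = A * v, with v and w of length n. */
static void csr_matvec(const CsrMatrix *A, const double *v, double *w)
{
  for (size_t i = 0; i < A->n; ++i) {
    double sum = 0.0;
    for (size_t k = A->row_ptr[i]; k < A->row_ptr[i + 1]; ++k) {
      sum += A->val[k] * v[A->col_idx[k]];
    }
    w[i] = sum;
  }
}

/* Batched product: W = A * V, where V and W are n x m dense blocks stored in
   column-major order, i.e. entry (i, j) lives at index i + j * n. The buffer
   behind V is exactly the long vector v = [v_1, ..., v_m], so no copy is
   needed. Each nonzero of A is loaded once and reused for all m columns. */
static void csr_matmat_colmajor(const CsrMatrix *A, size_t m, const double *V, double *W)
{
  assert(V != W); /* the product is not computed in place */
  for (size_t i = 0; i < A->n; ++i) {
    for (size_t j = 0; j < m; ++j) W[i + j * A->n] = 0.0;
    for (size_t k = A->row_ptr[i]; k < A->row_ptr[i + 1]; ++k) {
      const double a = A->val[k];
      const size_t c = A->col_idx[k];
      for (size_t j = 0; j < m; ++j) W[i + j * A->n] += a * V[c + j * A->n];
    }
  }
}
```

Calling csr_matmat_colmajor(&A, m, v, w) on the length n*m buffers gives the same result as calling csr_matvec(&A, v + j * n, w + j * n) for each column j, but the sparse matrix is traversed once instead of m times, which is the usual argument for the batched form.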