[petsc-users] 2D Partitioning matrix-shell and KSP

Sreeram R Venkat srvenkat at utexas.edu
Wed Sep 20 13:34:58 CDT 2023


Thank you for your help. I will try this solution.

Sreeram

On Wed, Sep 20, 2023 at 9:24 AM Barry Smith <bsmith at petsc.dev> wrote:

>
>   Use VecCreate(), VecSetSizes(), and VecSetType(), and MatCreate(),
> MatSetSizes(), and MatSetType() instead of the convenience functions
> VecCreateMPICUDA() and MatCreateShell().
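>
>   A minimal sketch of that split-phase creation for the M^T M setup
> described in the quoted message below (names are illustrative: nlocal is
> this rank's local size, i.e. the number of columns of its block on ranks
> A..D and 0 elsewhere; MyMatMult is the user's matvec routine and userctx
> its context):
>
>     Vec w, a;        /* input (w,x,y,z) and output (a,b,c,d) vectors */
>     Mat MtM;         /* shell matrix applying M^T M                  */
>
>     VecCreate(PETSC_COMM_WORLD, &w);
>     VecSetSizes(w, nlocal, PETSC_DECIDE);
>     VecSetType(w, VECCUDA);                /* replaces VecCreateMPICUDA() */
>
>     VecCreate(PETSC_COMM_WORLD, &a);
>     VecSetSizes(a, nlocal, PETSC_DECIDE);
>     VecSetType(a, VECCUDA);
>
>     MatCreate(PETSC_COMM_WORLD, &MtM);     /* replaces MatCreateShell()   */
>     MatSetSizes(MtM, nlocal, nlocal, PETSC_DECIDE, PETSC_DECIDE);
>     MatSetType(MtM, MATSHELL);
>     MatShellSetContext(MtM, userctx);
>     MatShellSetOperation(MtM, MATOP_MULT, (void (*)(void))MyMatMult);
>     MatSetUp(MtM);
>
>   With both vectors sharing one layout, KSP's assumption that the left and
> right vector layouts match (noted further below) is satisfied.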
>
>
> On Sep 19, 2023, at 8:44 PM, Sreeram R Venkat <srvenkat at utexas.edu> wrote:
>
> Thank you for your reply.
>
> Let's call this matrix *M*:
> (A B C D)
> (E F G H)
> (I J K L)
>
> Now, instead of doing KSP with just *M*, what if I want *M^TM*? In this
> case, the matvec implementation would be as follows:
>
>
>    - same partitioning of blocks A, B, ..., L among the 12 MPI ranks
>    - matvec looks like:
>
> (a)             (w)
> (b)  =  (M^T M) (x)
> (c)             (y)
> (d)             (z)
>
>    - w, x, y, z stored on ranks A, B, C, D (as before)
>    - a, b, c, d now also stored on ranks A, B, C, D
>
> Based on your message, I believe using a PetscLayout with local sizes
> (number of columns of A, number of columns of B, number of columns of C,
> number of columns of D, 0, 0, 0, 0, 0, 0, 0, 0) for both the (a,b,c,d)
> and (w,x,y,z) vectors should work.
>
>
> I see there are functions "VecSetLayout" and "MatSetLayouts" to set the
> PetscLayouts of the matrix and vectors. When I create the vectors (I need
> VecCreateMPICUDA) or matrix shell (with MatCreateShell), I need to pass the
> local and global sizes. I'm not sure what to do there.
>
>
> Thanks,
> Sreeram
>
> On Tue, Sep 19, 2023, 7:13 PM Barry Smith <bsmith at petsc.dev> wrote:
>
>>
>>    The PetscLayout local sizes for the PETSc (a,b,c) vector are (0, 0, 0,
>> number of rows of D, 0, 0, 0, number of rows of H, 0, 0, 0, number of
>> rows of L).
>>
>>
>>    The PetscLayout local sizes for the PETSc (w,x,y,z) vector are (number
>> of columns of A, number of columns of B, number of columns of C, number
>> of columns of D, 0, 0, 0, 0, 0, 0, 0, 0).
>>
>>    The left and right layouts of the shell matrix need to match the two
>> above.
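>>
>>    A minimal sketch of computing those local sizes (illustrative names:
>> the 12 ranks are assumed to be numbered 0..11 in the block order A..L,
>> ncols[] holds the column counts of A..D, and nrows[] the row counts of
>> the three block rows):
>>
>>      PetscMPIInt rank;
>>      PetscInt    nlocal_left = 0, nlocal_right = 0;
>>
>>      MPI_Comm_rank(PETSC_COMM_WORLD, &rank);
>>      /* (w,x,y,z) vector: nonzero local size only on ranks A..D (0..3) */
>>      if (rank < 4) nlocal_right = ncols[rank];
>>      /* (a,b,c) vector: nonzero local size only on ranks D, H, L (3, 7, 11) */
>>      if (rank % 4 == 3) nlocal_left = nrows[rank / 4];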
>>
>>    There is a huge problem. KSP is written assuming that the left vector
>> layout is the same as the right vector layout. So it can do dot products
>> MPI rank by MPI rank without needing to send individual vector values
>> around.
>>
>>    I don't think it makes sense to use PETSc with such vector
>> decompositions as you would like.
>>
>>   Barry
>>
>>
>>
>> On Sep 19, 2023, at 7:44 PM, Sreeram R Venkat <srvenkat at utexas.edu>
>> wrote:
>>
>> With the example you have given, here is what I would like to do:
>>
>>    - 12 MPI ranks
>>    - Each rank has one block (rank 0 has A, rank 1 has B, ..., rank 11
>>    has L) - to make the rest of this easier I'll refer to the rank containing
>>    block A as "rank A", and so on
>>    - rank A, rank B, rank C, and rank D have w, x, y, z respectively -
>>    the first step of the custom matvec implementation broadcasts w to rank E
>>    and rank I (similarly x is broadcast to rank F and rank J ...)
>>    - at the end of the matvec computation, ranks D, H, and L have a, b,
>>    and c respectively
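>>
>> A minimal sketch of one rank's share of that data flow (illustrative
>> names; the per-block product is left as a comment; the top block-row rank
>> is assumed to be rank 0 of the column communicator and the last
>> block-column rank to be rank 3 of the row communicator):
>>
>>    void matvec_piece(double *xpiece, int nloc_cols, double *ypartial,
>>                      double *yout, int nloc_rows,
>>                      MPI_Comm row_comm, MPI_Comm col_comm)
>>    {
>>      /* broadcast each input piece down its block column:
>>         w from rank A to E and I, x from B to F and J, ... */
>>      MPI_Bcast(xpiece, nloc_cols, MPI_DOUBLE, 0, col_comm);
>>      /* ... ypartial = (this rank's block) * xpiece ... */
>>      /* sum partial products across each block row onto its last rank,
>>         so a, b, c end up on ranks D, H, L */
>>      MPI_Reduce(ypartial, yout, nloc_rows, MPI_DOUBLE, MPI_SUM, 3, row_comm);
>>    }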
>>
>> Thanks,
>> Sreeram
>>
>>
>> On Tue, Sep 19, 2023 at 6:23 PM Barry Smith <bsmith at petsc.dev> wrote:
>>
>>>
>>>  ( a )     ( A  B  C  D ) ( w )
>>>  ( b )  =  ( E  F  G  H ) ( x )
>>>  ( c )     ( I  J  K  L ) ( y )
>>>                           ( z )
>>>
>>> I have no idea what "The input vector is partitioned across each row,
>>> and the output vector is partitioned across each column" means.
>>>
>>> Anyway, the shell matrix needs to live on MPI_COMM_WORLD, as do both the
>>> (a,b,c) and (w,x,y,z) vectors.
>>>
>>> Now, how many MPI ranks do you want to do the computation on? 12?
>>> Do you want one block A .. L on each rank?
>>>
>>> Do you want the (a,b,c) vector spread over all ranks? What about the (w,x,y,z)
>>> vector?
>>>
>>>   Barry
>>>
>>>
>>>
>>> On Sep 19, 2023, at 4:42 PM, Sreeram R Venkat <srvenkat at utexas.edu>
>>> wrote:
>>>
>>> I have a custom implementation of a matrix-vector product that
>>> inherently relies on a 2D processor partitioning of the matrix. That is, if
>>> the matrix looks like:
>>>
>>> A B C D
>>> E F G H
>>> I J K L
>>>
>>> in block form, we use 12 processors, each having one block. The input
>>> vector is partitioned across each row, and the output vector is partitioned
>>> across each column.
>>>
>>> Each processor has 3 communicators: the WORLD_COMM, a ROW_COMM, and a
>>> COL_COMM. The ROW/COL communicators are used to do reductions over
>>> rows/columns of processors.
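>>>
>>> A typical way to build such communicators is MPI_Comm_split; a minimal
>>> sketch (illustrative names, assuming the 12 ranks are numbered row-major
>>> over the 3x4 block grid):
>>>
>>>    int rank;
>>>    MPI_Comm row_comm, col_comm;
>>>
>>>    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>>>    /* same color for ranks in the same block row; key orders them by column */
>>>    MPI_Comm_split(MPI_COMM_WORLD, rank / 4, rank % 4, &row_comm);
>>>    /* same color for ranks in the same block column; key orders them by row */
>>>    MPI_Comm_split(MPI_COMM_WORLD, rank % 4, rank / 4, &col_comm);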
>>>
>>> With this setup, I am a bit confused about how to set up the matrix
>>> shell. The "MatCreateShell" function only accepts one communicator. If I
>>> give the WORLD_COMM, the local/global sizes won't match since PETSc will
>>> try to multiply local_size * total_processors instead of local_size *
>>> processors_per_row (or col). I have gotten around this temporarily by
>>> giving ROW_COMM here instead. What I think happens is a different MatShell
>>> is created on each row, but when computing the matvec, they all work
>>> together.
>>>
>>> However, if I try to use KSP (CG) with this setup (giving ROW_COMM as
>>> the communicator), the process hangs. I believe this is due to the
>>> partitioning of the input/output vectors. The matvec itself is fine, but
>>> the inner products and other steps of CG fail. In fact, if I restrict to
>>> the case where I only have one row of processors, I am able to successfully
>>> use KSP.
>>>
>>> Is there a way to use KSP with this 2D partitioning setup when there are
>>> multiple rows of processors? I'd also prefer to work with one global
>>> MatShell object instead of the one-object-per-row approach I'm using
>>> right now.
>>>
>>> Thanks for your help,
>>> Sreeram
>>>
>>>
>>>
>>
>