[petsc-users] MatMatMul inefficient
Pierre Jolivet
pierre at joliv.et
Wed Feb 15 11:15:47 CST 2023
Thank you for the reproducer.
I didn’t realize your test case was _this_ small.
Still, you are not setting the MatType of Q, and since PETSc defaults to AIJ, Q ends up being treated as sparse as well.
So instead of computing C = A*B with a sparse A and a dense B, it performs a sparse-sparse product, which is much costlier.
If you add call MatSetType(Q,MATDENSE,ierr) before the MatLoad(), you will then get:
Running with 1 processors
AQ time using MatMatMul 1.0620000000471919E-003
AQ time using 6 MatMul 1.4270000001488370E-003
Not an ideal efficiency (still greater than 1 though, so we are in the clear), but things will get better if you increase the size of either A or Q.
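In context, the fix could look like the sketch below. This is a hypothetical fragment, not code from the reproducer: the viewer variable and communicator are assumed, and the essential point is only that MatSetType() must be called before MatLoad() so the file is read into a dense matrix.

```fortran
! Assumed setup: viewer already opened on the matrix file for Q.
! Setting the type to MATDENSE *before* MatLoad makes MatMatMult
! dispatch to the sparse-dense (AIJ x Dense) kernel instead of a
! sparse-sparse product.
call MatCreate(PETSC_COMM_WORLD, Q, ierr)
call MatSetType(Q, MATDENSE, ierr)
call MatLoad(Q, viewer, ierr)
```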
Thanks,
Pierre
> On 15 Feb 2023, at 4:34 PM, Guido Margherita <margherita.guido at epfl.ch> wrote:
>
> Hi,
>
> You can find the reproducer at this link https://github.com/margheguido/Miniapp_MatMatMul , including the matrices I used.
> I have trouble understanding what is different in my case from the one you referenced me to.
>
> Thank you so much,
> Margherita
>
>> On 13 Feb 2023, at 3:51 PM, Pierre Jolivet <pierre at joliv.et> wrote:
>>
>> Could you please share a reproducer?
>> What you are seeing is not typical of the performance of such a kernel, from both a theoretical and a practical (see fig. 2 of https://joliv.et/article.pdf) point of view.
>>
>> Thanks,
>> Pierre
>>
>>> On 13 Feb 2023, at 3:38 PM, Guido Margherita via petsc-users <petsc-users at mcs.anl.gov> wrote:
>>>
>>> A is a sparse MATSEQAIJ, Q is dense.
>>>
>>> Thanks,
>>> Margherita
>>>
>>>> On 13 Feb 2023, at 3:27 PM, knepley at gmail.com wrote:
>>>>
>>>> On Mon, Feb 13, 2023 at 9:21 AM Guido Margherita via petsc-users <petsc-users at mcs.anl.gov> wrote:
>>>> Hi all,
>>>>
>>>> I realised that performing a matrix-matrix multiplication using the function MatMatMult is not at all computationally efficient compared to performing N matrix-vector multiplications with MatMult, where N is the number of columns of the second matrix in the product.
>>>> When I multiply a matrix A (46816 x 46816) by a matrix Q (46816 x 6), the MatMatMult call is indeed about 6 times more expensive than 6 calls to MatMult when run sequentially (0.04056 s vs 0.0062 s). When the same code is run in parallel the gap grows even more, becoming 10 times more expensive.
>>>> Is there an explanation for it?
>>>>
>>>> So we can reproduce this, what kind of matrix is A? I am assuming that Q is dense.
>>>>
>>>> Thanks,
>>>>
>>>> Matt
>>>>
>>>>
>>>> t1 = MPI_Wtime()
>>>> call MatMatMult(A, Q, MAT_INITIAL_MATRIX, PETSC_DEFAULT_REAL, AQ, ierr)
>>>> t2 = MPI_Wtime()
>>>> t_MatMatMul = t2 - t1
>>>>
>>>> t_MatMul = 0.0
>>>> do j = 0, m-1
>>>>    call MatGetColumnVector(Q, q_vec, j, ierr)
>>>>
>>>>    t1 = MPI_Wtime()
>>>>    call MatMult(A, q_vec, aq_vec, ierr)
>>>>    t2 = MPI_Wtime()
>>>>
>>>>    t_MatMul = t_MatMul + t2 - t1
>>>> end do
>>>>
>>>> Thank you,
>>>> Margherita Guido
>>>>
>>>>
>>>>
>>>> --
>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
>>>> -- Norbert Wiener
>>>>
>>>> https://www.cse.buffalo.edu/~knepley/
>>>
>