<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body style="overflow-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;">
Hi, 
<div><br>
</div>
<div>You can find the reproducer at <a href="https://github.com/margheguido/Miniapp_MatMatMul">https://github.com/margheguido/Miniapp_MatMatMul</a>, including the matrices I used.
<div>I have trouble understanding how my case differs from the one you referred me to. </div>
<div><br>
</div>
<div>Thank you so much,</div>
<div>Margherita <br>
<div><br>
<blockquote type="cite">
<div>On 13 Feb 2023, at 3:51 PM, Pierre Jolivet <pierre@joliv.et> wrote:</div>
<div>
<div dir="auto">
<div dir="ltr">
<div dir="ltr"></div>
<div dir="ltr">Could you please share a reproducer?</div>
<div dir="ltr">What you are seeing is not typical of the performance of such a kernel, from either a theoretical or a practical (see fig. 2 of <a href="https://joliv.et/article.pdf">https://joliv.et/article.pdf</a>) point of view.</div>
<div dir="ltr"><br>
</div>
<div dir="ltr">Thanks,</div>
<div dir="ltr">Pierre</div>
<div dir="ltr"><br>
<blockquote type="cite">On 13 Feb 2023, at 3:38 PM, Guido Margherita via petsc-users <petsc-users@mcs.anl.gov> wrote:<br>
<br>
</blockquote>
</div>
<blockquote type="cite">
<div dir="ltr"><span>A is a sparse MATSEQAIJ, Q is dense.</span><br>
<span></span><br>
<span>Thanks,</span><br>
<span>Margherita </span><br>
<span></span><br>
<blockquote type="cite"><span>On 13 Feb 2023, at 3:27 PM, knepley@gmail.com wrote:</span><br>
</blockquote>
<blockquote type="cite"><span></span><br>
</blockquote>
<blockquote type="cite"><span>On Mon, Feb 13, 2023 at 9:21 AM Guido Margherita via petsc-users <petsc-users@mcs.anl.gov> wrote:</span><br>
</blockquote>
<blockquote type="cite"><span>Hi all, </span><br>
</blockquote>
<blockquote type="cite"><span></span><br>
</blockquote>
<blockquote type="cite"><span>I realised that performing a matrix-matrix multiplication with the function MatMatMult is far less computationally efficient than performing N matrix-vector multiplications with MatMult, where N is the number
 of columns of the second matrix in the product. </span><br>
</blockquote>
<blockquote type="cite"><span>When I multiply a 46816 x 46816 matrix A by a 46816 x 6 matrix Q, MatMatMult is about 6 times more expensive than 6 calls to MatMult when run sequentially (0.04056 s vs 0.0062 s). When the same code
 is run in parallel, the gap grows further: MatMatMult is 10 times more expensive.</span><br>
</blockquote>
<blockquote type="cite"><span> Is there an explanation for it?</span><br>
</blockquote>
<blockquote type="cite"><span></span><br>
</blockquote>
<blockquote type="cite"><span>So we can reproduce this, what kind of matrix is A? I am assuming that Q is dense.</span><br>
</blockquote>
<blockquote type="cite"><span></span><br>
</blockquote>
<blockquote type="cite"><span> Thanks,</span><br>
</blockquote>
<blockquote type="cite"><span></span><br>
</blockquote>
<blockquote type="cite"><span>    Matt</span><br>
</blockquote>
<blockquote type="cite"><span></span><br>
</blockquote>
<blockquote type="cite"><span></span><br>
</blockquote>
<blockquote type="cite"><span>t1 = MPI_Wtime()</span><br>
</blockquote>
<blockquote type="cite"><span>call MatMatMult(A,Q,MAT_INITIAL_MATRIX, PETSC_DEFAULT_REAL, AQ, ierr )</span><br>
</blockquote>
<blockquote type="cite"><span>t2 = MPI_Wtime() </span><br>
</blockquote>
<blockquote type="cite"><span>t_MatMatMul = t2-t1</span><br>
</blockquote>
<blockquote type="cite"><span></span><br>
</blockquote>
<blockquote type="cite"><span>t_MatMul=0.0</span><br>
</blockquote>
<blockquote type="cite"><span>do j = 0, m-1</span><br>
</blockquote>
<blockquote type="cite"><span>       call MatGetColumnVector(Q, q_vec, j,ierr)</span><br>
</blockquote>
<blockquote type="cite"><span></span><br>
</blockquote>
<blockquote type="cite"><span>       t1 = MPI_Wtime()</span><br>
</blockquote>
<blockquote type="cite"><span>       call MatMult(A, q_vec, aq_vec, ierr) </span>
<br>
</blockquote>
<blockquote type="cite"><span>       t2 = MPI_Wtime()</span><br>
</blockquote>
<blockquote type="cite"><span></span><br>
</blockquote>
<blockquote type="cite"><span>       t_MatMul = t_MatMul + t2-t1</span><br>
</blockquote>
<blockquote type="cite"><span>end do</span><br>
</blockquote>
<blockquote type="cite"><span></span><br>
</blockquote>
<blockquote type="cite"><span>Thank you, </span><br>
</blockquote>
<blockquote type="cite"><span>Margherita Guido</span><br>
</blockquote>
<blockquote type="cite"><span></span><br>
</blockquote>
<blockquote type="cite"><span></span><br>
</blockquote>
<blockquote type="cite"><span></span><br>
</blockquote>
<blockquote type="cite"><span>-- </span><br>
</blockquote>
<blockquote type="cite"><span>What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.</span><br>
</blockquote>
<blockquote type="cite"><span>-- Norbert Wiener</span><br>
</blockquote>
<blockquote type="cite"><span></span><br>
</blockquote>
<blockquote type="cite"><span>https://www.cse.buffalo.edu/~knepley/</span><br>
</blockquote>
<span></span><br>
</div>
</blockquote>
</div>
</div>
</div>
</blockquote>
</div>
<br>
</div>
</div>
</body>
</html>