<html><head><meta http-equiv="Content-Type" content="text/html; charset=us-ascii"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class=""><div class=""><br class=""></div> I think you can again use MatDenseGetArray() and do the copy directly, applying the shift you want. Each process handles only its <div class="">local rows, so you need not worry about parallelism. </div><div class=""><br class=""></div><div class=""> I think it may be as simple as getting the array pointer for A, shifting it by (number of local rows) * (number of columns to skip, i.e. your offset a0), and then calling PetscArraycpy() to copy the B values into the shifted location in A.</div><div class=""><br class=""></div><div class="">Barry</div><div class=""> <br class=""><div><br class=""><blockquote type="cite" class=""><div class="">On Dec 14, 2020, at 5:18 PM, Roland Richter &lt;<a href="mailto:roland.richter@ntnu.no" class="">roland.richter@ntnu.no</a>&gt; wrote:</div><br class="Apple-interchange-newline"><div class="">
<meta http-equiv="content-type" content="text/html; charset=UTF-8" class="">
<div class=""><p class="">Dear all,</p><p class="">I am currently porting an algorithm implemented with Armadillo
to PETSc. It is a forward/backward
transformation that boils down to the following steps (for the
forward transformation):</p><p class="">Assume I have matrices A and B, defined as <br class="">
</p><p class="">A = |aa ab ac ad|<br class="">
|ae af ag ah|<br class="">
|ai aj ak al|</p><p class="">B = |ba bb bc|<br class="">
|be bf bg|<br class="">
|bi bj bk|</p><p class="">The number of rows in A and B is always equal, but the number of
columns in B is always less than or equal to half the number of columns
in A. (The example here is for demonstration only; I am aware that 3
is not less than or equal to 2.)</p><p class="">Moreover, I have vectors x and y, with x defined as</p><p class="">x = |xa xb xc xd|</p><p class="">and y defined as <br class="">
</p><p class="">y = |ya yb yc|</p><p class="">The number of elements in x equals the number of columns
in A, and the number of elements in y equals the
number of columns in B.</p><p class="">The transformation can then be described as</p>
<ul class="">
<li class="">Set all values in A to zero</li>
<li class="">Copy B into A with an offset of a0:</li>
<ul class="">
<li class="">A(a0 = 1) = |0 ba bb bc|<br class="">
|0 be bf bg|<br class="">
|0 bi bj bk|</li>
</ul>
<li class="">Multiply every row of A elementwise by y, with y shifted by the same offset,
resulting in</li>
<ul class="">
<li class="">A(a0 = 1) = |0 ba*ya bb*yb bc*yc|<br class="">
|0 be*ya bf*yb bg*yc|<br class="">
|0 bi*ya bj*yb bk*yc|</li>
</ul>
<li class="">Apply a 1d-FFT over each row of A, resulting in A'<br class="">
</li>
<li class="">Multiply every row of A' elementwise by x, resulting in <br class="">
</li>
<ul class="">
<li class="">A'(a0 = 1) = |aa'*xa (ba*ya)'*xb (bb*yb)'*xc (bc*yc)'*xd|<br class="">
|ae'*xa (be*ya)'*xb (bf*yb)'*xc
(bg*yc)'*xd|<br class="">
|ai'*xa (bi*ya)'*xb (bj*yb)'*xc
(bk*yc)'*xd|</li>
</ul>
</ul><p class="">Based on earlier questions, I already know how to apply a vector
to each row of a matrix (by using .diag()) and how to apply an FFT
over each row of a distributed matrix by using FFTW. However, I am
not aware of a method for copying B into A with an offset, so
I would have to iterate over each row to do the copy,
which might be slow. Is there a
way to make this step more efficient using PETSc's built-in
functions? Unfortunately, I am not yet familiar with all
of them.</p><p class="">Thanks!</p><p class="">Roland<br class="">
</p>
</div>
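<p class="">For illustration, the first two steps (zero A, then copy B into A at column offset a0) can be sketched with a pointer shift and a single block copy. This is only a sketch in plain C under stated assumptions, not PETSc code: the raw buffers stand in for the local arrays that MatDenseGetArray() would return, memcpy() stands in for PetscArraycpy(), and double stands in for PetscScalar. It also assumes column-major storage with a leading dimension equal to the number of local rows m, which I believe is the default for PETSc dense matrices and can be checked with MatDenseGetLDA(). The function name copy_with_column_offset is hypothetical.</p>
<pre class="">
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: zero the local m-by-na block A, then copy the local
 * m-by-nb block B into A starting at column a0.
 * Assumption: both buffers are column-major with leading dimension m, so
 * column j of A starts at A + j * m. */
void copy_with_column_offset(double *A, size_t na,
                             const double *B, size_t nb,
                             size_t m, size_t a0)
{
    /* The shifted block must fit inside A. */
    assert(a0 + nb <= na);

    /* Step 1: set all values in A to zero (all-zero bytes represent 0.0
     * for IEEE-754 doubles). */
    memset(A, 0, m * na * sizeof *A);

    /* Step 2: shift the pointer by (local rows) * (columns to skip) and copy
     * all of B in one call; the columns of B are contiguous in memory. */
    memcpy(A + a0 * m, B, m * nb * sizeof *B);
}
```
</pre>
<p class="">Because whole columns are contiguous in this layout, no per-row loop is needed. If the leading dimension differs from m, the copy would have to be done column by column instead.</p>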
</div></blockquote></div><br class=""></div></body></html>