[petsc-users] PetscArraycpy only copies half of the entries of the matrix rows

Matthew Knepley knepley at gmail.com
Thu Dec 17 10:06:49 CST 2020


On Thu, Dec 17, 2020 at 7:00 AM Roland Richter <roland.richter at ntnu.no>
wrote:

> Dear all,
>
> I wanted to use PetscArraycpy for copying a part of one complex matrix A
> with a row length of a_len and an offset of a_off into another matrix B
> with a row length of b_len (smaller than a_len - a_off), using the
> following code snippet:
>
>         PetscScalar *A_ptr, *B_ptr;
>         MatDenseGetArray(A, &A_ptr);
>         MatDenseGetArray(B, &B_ptr);
>         MatView(A, PETSC_VIEWER_STDOUT_WORLD);
>         for(size_t i = 0; i < num_local_rows; ++i) {
>             PetscArraycpy(B_ptr + i * b_len, (2 * a_off + A_ptr) + i * a_len, b_len);
>         }
>
>         MatAssemblyBegin(B, MAT_FINAL_ASSEMBLY);
>         MatAssemblyEnd(B, MAT_FINAL_ASSEMBLY);
>         MatView(B, PETSC_VIEWER_STDOUT_WORLD);
>
> When printing the first row of matrix A (with a_len = 128, a_off = 76 and
> b_len = 26), I get
>
> *0.0000000000000000e+00 + 0.0000000000000000e+00i 0.0000000000000000e+00 +
> 0.0000000000000000e+00i 0.0000000000000000e+00 + 0.0000000000000000e+00i
> -0.0000000000000000e+00 + 0.0000000000000000e+00i 0.0000000000000000e+00 +
> 0.0000000000000000e+00i 0.0000000000000000e+00 + 0.0000000000000000e+00i
> -0.0000000000000000e+00 + 0.0000000000000000e+00i 0.0000000000000000e+00 +
> 0.0000000000000000e+00i 0.0000000000000000e+00 + 0.0000000000000000e+00i
> 0.0000000000000000e+00 + 0.0000000000000000e+00i 0.0000000000000000e+00 +
> 0.0000000000000000e+00i -0.0000000000000000e+00 + 0.0000000000000000e+00i
> 0.0000000000000000e+00 + 0.0000000000000000e+00i 0.0000000000000000e+00 +
> 0.0000000000000000e+00i 0.0000000000000000e+00 + 0.0000000000000000e+00i
> 0.0000000000000000e+00 + 0.0000000000000000e+00i 0.0000000000000000e+00 +
> 0.0000000000000000e+00i 0.0000000000000000e+00 + 0.0000000000000000e+00i
> -0.0000000000000000e+00 + 0.0000000000000000e+00i 0.0000000000000000e+00 +
> 0.0000000000000000e+00i 0.0000000000000000e+00 + 0.0000000000000000e+00i
> -0.0000000000000000e+00 + 0.0000000000000000e+00i 0.0000000000000000e+00 +
> 0.0000000000000000e+00i 0.0000000000000000e+00 + 0.0000000000000000e+00i
> 0.0000000000000000e+00 + 0.0000000000000000e+00i 0.0000000000000000e+00 +
> 0.0000000000000000e+00i -0.0000000000000000e+00 + 0.0000000000000000e+00i
> 0.0000000000000000e+00 + 0.0000000000000000e+00i 0.0000000000000000e+00 +
> 0.0000000000000000e+00i 0.0000000000000000e+00 + 0.0000000000000000e+00i
> 0.0000000000000000e+00 + 0.0000000000000000e+00i 0.0000000000000000e+00 +
> 0.0000000000000000e+00i -0.0000000000000000e+00 + 0.0000000000000000e+00i
> 0.0000000000000000e+00 + 0.0000000000000000e+00i 0.0000000000000000e+00 +
> 0.0000000000000000e+00i -0.0000000000000000e+00 + 0.0000000000000000e+00i
> 0.0000000000000000e+00 + 0.0000000000000000e+00i 0.0000000000000000e+00 +
> 0.0000000000000000e+00i 0.0000000000000000e+00 + 0.0000000000000000e+00i
> 0.0000000000000000e+00 + 0.0000000000000000e+00i 0.0000000000000000e+00 +
> 0.0000000000000000e+00i -0.0000000000000000e+00 + 0.0000000000000000e+00i
> 0.0000000000000000e+00 + 0.0000000000000000e+00i 0.0000000000000000e+00 +
> 0.0000000000000000e+00i 0.0000000000000000e+00 + 0.0000000000000000e+00i
> 0.0000000000000000e+00 + 0.0000000000000000e+00i -0.0000000000000000e+00 +
> 0.0000000000000000e+00i 0.0000000000000000e+00 + 0.0000000000000000e+00i
> 0.0000000000000000e+00 + 0.0000000000000000e+00i -0.0000000000000000e+00 +
> 0.0000000000000000e+00i 0.0000000000000000e+00 + 0.0000000000000000e+00i
> 0.0000000000000000e+00 + 0.0000000000000000e+00i -0.0000000000000000e+00 +
> 0.0000000000000000e+00i 0.0000000000000000e+00 + 0.0000000000000000e+00i
> 0.0000000000000000e+00 + 0.0000000000000000e+00i -0.0000000000000000e+00 +
> 0.0000000000000000e+00i 0.0000000000000000e+00 + 0.0000000000000000e+00i
> 0.0000000000000000e+00 + 0.0000000000000000e+00i 0.0000000000000000e+00 +
> 0.0000000000000000e+00i 0.0000000000000000e+00 + 0.0000000000000000e+00i
> 0.0000000000000000e+00 + 0.0000000000000000e+00i 0.0000000000000000e+00 +
> 0.0000000000000000e+00i 0.0000000000000000e+00 + 0.0000000000000000e+00i
> -0.0000000000000000e+00 + 0.0000000000000000e+00i 0.0000000000000000e+00 +
> 0.0000000000000000e+00i 0.0000000000000000e+00 + 0.0000000000000000e+00i
> -0.0000000000000000e+00 + 0.0000000000000000e+00i 0.0000000000000000e+00 +
> 0.0000000000000000e+00i 0.0000000000000000e+00 + 0.0000000000000000e+00i
> 0.0000000000000000e+00 + 0.0000000000000000e+00i 0.0000000000000000e+00 +
> 0.0000000000000000e+00i -0.0000000000000000e+00 + 0.0000000000000000e+00i
> 0.0000000000000000e+00 + 0.0000000000000000e+00i 0.0000000000000000e+00 +
> 0.0000000000000000e+00i 0.0000000000000000e+00 + 0.0000000000000000e+00i
> 0.0000000000000000e+00 + 0.0000000000000000e+00i -0.0000000000000000e+00 +
> 0.0000000000000000e+00i -7.5118186821231378e-02 + -1.2515848507502547e-01i
> 5.7593629917958706e+00 + 1.6535197175842331e+00i -6.3062119866941906e+01 +
> 3.2118985283369987e+01i 1.6228535942636518e+02 + -4.4588492144691378e+02i
> 8.7350162264986420e+02 + 2.0568440963147814e+03i -7.4258479521622921e+03 +
> -3.3031631388498363e+03i 2.2699374989663269e+04 + -7.8289291098031481e+03i
> -2.7846379282467926e+04 + 5.2456793075148809e+04i -3.2554674832896777e+04 +
> -1.2108819252524960e+05i 1.9430868047197413e+05 + 1.2114559011378702e+05i
> -3.5831799834334152e+05 + 7.0086227392363056e+04i 3.0028983479603863e+05 +
> -4.1447894788669585e+05i 7.8224949502036819e+04 + 6.2926756374162808e+05i
> -5.3474873053744854e+05 + -4.4718914789259754e+05i 6.8111038372267899e+05 +
> -3.6947593166740131e+04i -4.1287212326113920e+05 + 4.2925417635846150e+05i
> 7.1098224367113344e+03 + -4.6490743916366581e+05i 2.1807010096419850e+05 +
> 2.4106178223450572e+05i -2.0304162108015743e+05 + -1.7254769976182859e+04i
> 9.0164628688356752e+04 + -7.0830186001321214e+04i -8.9050769071193699e+03 +
> 5.7267241933255813e+04i -1.4789632694550470e+04 + -2.1786332309775924e+04i
> 1.0491004489879153e+04 + 2.3873712516742830e+03i -3.4183915782335853e+03 +
> 1.9886861075931499e+03i 3.7807432692260045e+02 + -1.2521184263406540e+03i
> 0.0000000000000000e+00 + 0.0000000000000000e+00i 0.0000000000000000e+00 +
> 0.0000000000000000e+00i 0.0000000000000000e+00 + 0.0000000000000000e+00i
> -0.0000000000000000e+00 + 0.0000000000000000e+00i 0.0000000000000000e+00 +
> 0.0000000000000000e+00i 0.0000000000000000e+00 + 0.0000000000000000e+00i
> -0.0000000000000000e+00 + 0.0000000000000000e+00i 0.0000000000000000e+00 +
> 0.0000000000000000e+00i 0.0000000000000000e+00 + 0.0000000000000000e+00i
> 0.0000000000000000e+00 + 0.0000000000000000e+00i 0.0000000000000000e+00 +
> 0.0000000000000000e+00i -0.0000000000000000e+00 + 0.0000000000000000e+00i
> 0.0000000000000000e+00 + 0.0000000000000000e+00i 0.0000000000000000e+00 +
> 0.0000000000000000e+00i -0.0000000000000000e+00 + 0.0000000000000000e+00i
> 0.0000000000000000e+00 + 0.0000000000000000e+00i 0.0000000000000000e+00 +
> 0.0000000000000000e+00i -0.0000000000000000e+00 + 0.0000000000000000e+00i
> 0.0000000000000000e+00 + 0.0000000000000000e+00i 0.0000000000000000e+00 +
> 0.0000000000000000e+00i 0.0000000000000000e+00 + 0.0000000000000000e+00i
> 0.0000000000000000e+00 + 0.0000000000000000e+00i 0.0000000000000000e+00 +
> 0.0000000000000000e+00i 0.0000000000000000e+00 + 0.0000000000000000e+00i
> 0.0000000000000000e+00 + 0.0000000000000000e+00i 0.0000000000000000e+00 +
> 0.0000000000000000e+00i*
>
> and for the first row of B I get
>
> *-0.0000000000000000e+00 + 0.0000000000000000e+00i -7.5118186821231378e-02
> + -1.2515848507502547e-01i 5.7593629917958706e+00 + 1.6535197175842331e+00i
> -6.3062119866941906e+01 + 3.2118985283369987e+01i 1.6228535942636518e+02 +
> -4.4588492144691378e+02i 8.7350162264986420e+02 + 2.0568440963147814e+03i
> -7.4258479521622921e+03 + -3.3031631388498363e+03i 2.2699374989663269e+04 +
> -7.8289291098031481e+03i -2.7846379282467926e+04 + 5.2456793075148809e+04i
> -3.2554674832896777e+04 + -1.2108819252524960e+05i 1.9430868047197413e+05 +
> 1.2114559011378702e+05i -3.5831799834334152e+05 + 7.0086227392363056e+04i
> 3.0028983479603863e+05 + -4.1447894788669585e+05i 0.0000000000000000e+00 +
> 0.0000000000000000e+00i 0.0000000000000000e+00 + 0.0000000000000000e+00i
> 0.0000000000000000e+00 + 0.0000000000000000e+00i 0.0000000000000000e+00 +
> 0.0000000000000000e+00i 0.0000000000000000e+00 + 0.0000000000000000e+00i
> 0.0000000000000000e+00 + 0.0000000000000000e+00i 6.9004764357806349e-310 +
> 6.9004764357806349e-310i 6.9004764365446580e-310 + 6.9004764300000669e-310i
> 0.0000000000000000e+00 + 0.0000000000000000e+00i 0.0000000000000000e+00 +
> 0.0000000000000000e+00i 0.0000000000000000e+00 + 0.0000000000000000e+00i
> 0.0000000000000000e+00 + 0.0000000000000000e+00i 0.0000000000000000e+00 +
> 0.0000000000000000e+00i*
>
> Apparently, only 13 complex values have been copied, and not 26. Moreover,
> if my source pointer is chosen to be (a_off + A_ptr) instead, I will
> just copy 0-values. When increasing the number of values I would like to
> copy, nothing changes (except getting a segfault for sufficiently large
> values).
>
> Why does that happen? And how can I copy all values into the second
> matrix, and not only half of them?
>
Can you make a minimal example? It looks like it should work. If I can run
the code, I can make it work.
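
One thing worth checking while building the minimal example, offered as an
assumption rather than a diagnosis: the array returned by MatDenseGetArray is
stored column-major (with a leading dimension you can query via
MatDenseGetLDA), whereas indexing with i * a_len treats it as row-major. Also,
PetscArraycpy counts elements, not bytes. The sketch below uses plain C with
memcpy and double complex as stand-ins for PetscArraycpy and PetscScalar, and
the function name copy_column_window is hypothetical. It copies a window of
columns between two column-major arrays:

```c
#include <complex.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for PetscScalar in a complex build (assumption for this sketch). */
typedef double complex scalar_t;

/*
 * Copy columns [a_off, a_off + b_cols) of the column-major array A
 * (leading dimension lda) into the column-major array B (leading
 * dimension ldb). Element (i, j) of a column-major array X is X[i + j * ldx].
 * Hypothetical helper, not part of PETSc.
 */
void copy_column_window(scalar_t *B, size_t ldb,
                        const scalar_t *A, size_t lda,
                        size_t nrows, size_t a_off, size_t b_cols)
{
    for (size_t j = 0; j < b_cols; ++j) {
        /* Each column is contiguous, so one copy of nrows elements suffices.
           The size is given in elements times sizeof, never a raw count. */
        memcpy(B + j * ldb, A + (a_off + j) * lda, nrows * sizeof(scalar_t));
    }
}
```

With a_off = 76 and b_cols = 26 as in your example, this would take columns
76 through 101 of A for every local row, without any factor of 2 in the
offset.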

> Another question: Is there a parallel version of that function, to copy
> all local rows in parallel, or do I have to write it myself, for example by
> using OpenMP?
>
The copy should be vectorized by the compiler. If you have idle cores
waiting for something, you could possibly use OpenMP. However, as Jed
points out, the time to fork and join is likely to exceed your speedup.
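
If you do want to try it, a minimal sketch of a threaded strided copy could
look like the following. It is plain C with memcpy standing in for
PetscArraycpy, the function name copy_chunks_maybe_parallel is hypothetical,
and the pragma only takes effect when the code is compiled with OpenMP enabled
(for example -fopenmp with GCC). Otherwise it runs as an ordinary serial loop:

```c
#include <complex.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for PetscScalar in a complex build (assumption for this sketch). */
typedef double complex scalar_t;

/*
 * Copy nchunks contiguous chunks of chunk_len elements each.
 * Chunk c is read from src + c * src_stride and written to
 * dst + c * dst_stride. The chunks must not overlap.
 * Hypothetical helper, not part of PETSc.
 */
void copy_chunks_maybe_parallel(scalar_t *dst, size_t dst_stride,
                                const scalar_t *src, size_t src_stride,
                                size_t chunk_len, ptrdiff_t nchunks)
{
    /* Ignored by the compiler unless OpenMP is enabled. */
    #pragma omp parallel for schedule(static)
    for (ptrdiff_t c = 0; c < nchunks; ++c) {
        memcpy(dst + (size_t)c * dst_stride,
               src + (size_t)c * src_stride,
               chunk_len * sizeof(scalar_t));
    }
}
```

Timing this against the serial loop on your actual sizes is the only way to
know whether the threads pay for themselves. For chunks of a few dozen
scalars I would not expect them to.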

  Thanks,

     Matt

> Thanks!
>
> Roland
>
>
>
>

-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/