<div dir="ltr">Dear Jose and Petsc users, I implemented the parallel matrix-vector product and it works meaning that it produces a result but it is different from the result produced with a single task.<div>Obviously, I could be wrong in my implementation but what puzzles me is that the <b>input </b>vector (x) to the product is different running with one and two tasks and this is from the very first iteration (so it can not be due to a previous error in the product).</div><div>I checked that X is different with one and two tasks with the following (naive) code<br></div><div>PetscErrorCode MatMult_TM(Mat A,Vec x,Vec y) {</div>  void              *ctx;<br>  PetscInt          nx /* ,lo,i,j*/;<br>  const PetscScalar *px;<br>  PetscScalar       *py;<br>  MPI_Comm          comm;<br>  PetscFunctionBeginUser;<br>  PetscCall(MatShellGetContext(A,&ctx));<br>  PetscCall(VecGetLocalSize(x,&nx));<br>  PetscCall(PetscObjectGetComm((PetscObject)A,&comm));<br>  <br>  //  nx = *(int*)ctx;<br>  PetscCall(VecGetArrayRead(x,&px));<br>  PetscCall(VecGetArray(y,&py));<br> <br>  for(int i=0;i<nx;i++){ printf("task %d, px[%d]=%f, w[%d]=%f\n",myrank,i+offset,px[i],i+offset,w[i+offset]); }<br>  PetscCall(MPI_Barrier(comm));<br>  exit(0);<br> ......<div>} <br></div><div><br></div><div>Then I reordered the output obtained with one and two tasks. The first part of the x vector is very similar (but not exactly the same) using one and two tasks but the second part (belonging to the second task) is pretty different</div><div>(here "offset" is   offset=(n/size)*myrank;)</div><div>I create the matrix shell with</div><div>  PetscCall(MatCreateShell(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,N,N,&n,&A));<br></div><div>I am sure I am doing something wrong but I don't know what I need to look at.</div><div>Thanks in advance!</div><div>Mario</div><div><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">Il giorno ven 17 giu 2022 alle ore 17:56 Mario Rossi <<a href="mailto:dfatiac@gmail.com">dfatiac@gmail.com</a>> ha scritto:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">Thanks a lot, Jose!<div>I looked at the eps folder (where I found the test8.c that has been my starting point) but I did not look at the nep folder (my fault!)</div><div>Thanks again,</div><div>Mario</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">Il giorno ven 17 giu 2022 alle ore 17:34 Jose E. Roman <<a href="mailto:jroman@dsic.upv.es" target="_blank">jroman@dsic.upv.es</a>> ha scritto:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">You can use VecGetOwnershipRange() to determine the range of global indices corresponding to the local portion of a vector, and VecGetArray() to access the values. In SLEPc, you can assume that X and Y will have the same parallel distribution.<br>

<br>

For an example of a shell matrix that implements the matrix-vector product in parallel, have a look at this: <a href="https://slepc.upv.es/documentation/current/src/nep/tutorials/ex21.c.html" rel="noreferrer" target="_blank">https://slepc.upv.es/documentation/current/src/nep/tutorials/ex21.c.html</a><br>

It is a simple tridiagonal example, where neighborwise communication is done with two calls to MPI_Sendrecv().<br>

<br>

Jose<br>

<br>

<br>

> El 17 jun 2022, a las 17:21, Mario Rossi <<a href="mailto:dfatiac@gmail.com" target="_blank">dfatiac@gmail.com</a>> escribió:<br>

> <br>

> I need to find the largest eigenvalues (say the first three) of a very large matrix and I am using<br>

> a combination of PetSc and SLEPc. In particular, I am using a shell matrix. I wrote a "custom"<br>

> matrix-vector product and everything works fine in serial (one task) mode for a "small" case.<br>

> For the real case, I need multiple (at least 128) tasks for memory reasons so I need a parallel variant of the custom matrix-vector product. I know exactly how to write the parallel variant <br>

> (in plain MPI) but I am, somehow, blocked because it is not clear to me what each task receives<br>

> and what is expected to provide in the parallel matrix-vector product.<br>

> More in detail, with a single task, the function receives the full X vector and is expected to provide the full Y vector resulting from Y=A*X. <br>

> What does it happen with multiple tasks? If I understand correctly<br>

> in the matrix shell definition, I can choose to split the matrix into blocks of rows so that the matrix-vector function should compute a block of elements of the vector Y but does it receive only the corresponding subset of the X (input vector)? (this is what I guess happens) and in output, does<br>

> each task return its subset of elements of Y as if it were the whole array and then PetSc manages all the subsets? Is there anyone who has a working example of a parallel matrix-vector product for matrix shell?<br>

> Thanks in advance for any help you can provide!<br>

> Mario<br>

> i<br>

> <br>

<br>

</blockquote></div>

</blockquote></div>