[petsc-users] Parallel matrix-vector product with Matrix Shell

Jose E. Roman jroman at dsic.upv.es
Sat Jun 18 01:27:25 CDT 2022


The initial vector of the Krylov method is by default a random vector, which differs when you change the number of processes. To avoid this, you can run with the undocumented option -bv_reproducible_random, which generates the same random initial vector irrespective of the number of processes.
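For instance (the executable name and the other options here are just placeholders):

  mpiexec -n 2 ./ex1 -eps_nev 3 -bv_reproducible_random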

Alternatively, set an initial vector in your code with EPSSetInitialSpace(), see e.g. https://slepc.upv.es/documentation/current/src/eps/tutorials/ex5.c.html
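A minimal sketch of the latter (assuming eps is your EPS object and A your shell matrix):

  Vec v0;
  PetscCall(MatCreateVecs(A,&v0,NULL));    /* vector compatible with A's parallel layout */
  PetscCall(VecSet(v0,1.0));               /* any deterministic vector will do */
  PetscCall(EPSSetInitialSpace(eps,1,&v0));
  PetscCall(VecDestroy(&v0));              /* EPS keeps its own reference */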

Jose


> On 18 Jun 2022, at 08:13, Mario Rossi <dfatiac at gmail.com> wrote:
> 
> Dear Jose and PETSc users, I implemented the parallel matrix-vector product, and it "works" in the sense that it produces a result, but the result is different from the one obtained with a single task.
> Obviously, I could be wrong in my implementation, but what puzzles me is that the input vector (x) to the product is different when running with one task versus two, and this happens from the very first iteration (so it cannot be due to a previous error in the product).
> I checked that x is different with one and two tasks with the following (naive) code:
> PetscErrorCode MatMult_TM(Mat A,Vec x,Vec y) {
>   void              *ctx;
>   PetscInt          nx /* ,lo,i,j*/;
>   const PetscScalar *px;
>   PetscScalar       *py;
>   MPI_Comm          comm;
>   PetscFunctionBeginUser;
>   PetscCall(MatShellGetContext(A,&ctx));
>   PetscCall(VecGetLocalSize(x,&nx));
>   PetscCall(PetscObjectGetComm((PetscObject)A,&comm));
>   
>   //  nx = *(int*)ctx;
>   PetscCall(VecGetArrayRead(x,&px));
>   PetscCall(VecGetArray(y,&py));
>  
>   /* myrank, offset and w are file-scope variables set elsewhere */
>   for (int i=0; i<nx; i++) {
>     printf("task %d, px[%d]=%f, w[%d]=%f\n",myrank,i+offset,(double)px[i],i+offset,w[i+offset]);
>   }
>   PetscCallMPI(MPI_Barrier(comm)); /* MPI calls should be wrapped in PetscCallMPI rather than PetscCall */
>   exit(0);
>  ......
> } 
> 
> Then I reordered the output obtained with one and two tasks. The first part of the x vector is very similar (but not exactly the same) with one and two tasks, but the second part (the one belonging to the second task) is quite different.
> (Here "offset" is defined as offset=(n/size)*myrank;.)
> I create the matrix shell with
>   PetscCall(MatCreateShell(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,N,N,&n,&A));
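> and register the custom product with the standard call (as in test8.c):
>   PetscCall(MatShellSetOperation(A,MATOP_MULT,(void(*)(void))MatMult_TM));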
> I am sure I am doing something wrong, but I don't know where to look.
> Thanks in advance!
> Mario
> 
> 
> On Fri, 17 Jun 2022 at 17:56, Mario Rossi <dfatiac at gmail.com> wrote:
> Thanks a lot, Jose!
> I looked at the eps folder (where I found test8.c, which was my starting point), but I did not look at the nep folder (my fault!).
> Thanks again,
> Mario
> 
> On Fri, 17 Jun 2022 at 17:34, Jose E. Roman <jroman at dsic.upv.es> wrote:
> You can use VecGetOwnershipRange() to determine the range of global indices corresponding to the local portion of a vector, and VecGetArray() to access the values. In SLEPc, you can assume that X and Y will have the same parallel distribution.
> 
> For an example of a shell matrix that implements the matrix-vector product in parallel, have a look at this: https://slepc.upv.es/documentation/current/src/nep/tutorials/ex21.c.html
> It is a simple tridiagonal example, where neighborwise communication is done with two calls to MPI_Sendrecv().
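> A bare-bones sketch of that pattern (simplified, with zero halo values at the ends of the global domain; the names are mine, not the exact code of the example):
>
> PetscErrorCode MatMult_Tridiag(Mat A,Vec x,Vec y)
> {
>   PetscInt          i,lo,hi,n;
>   PetscMPIInt       rank,size;
>   PetscScalar       left=0.0,right=0.0;   /* halo values from neighbor tasks */
>   const PetscScalar *px;
>   PetscScalar       *py;
>   MPI_Comm          comm;
>
>   PetscFunctionBeginUser;
>   PetscCall(PetscObjectGetComm((PetscObject)A,&comm));
>   PetscCallMPI(MPI_Comm_rank(comm,&rank));
>   PetscCallMPI(MPI_Comm_size(comm,&size));
>   PetscCall(VecGetOwnershipRange(x,&lo,&hi));  /* global indices [lo,hi) owned locally */
>   n = hi-lo;
>   PetscCall(VecGetArrayRead(x,&px));
>   PetscCall(VecGetArray(y,&py));
>   /* exchange subdomain boundary values with the two neighbor tasks */
>   if (rank>0)      PetscCallMPI(MPI_Sendrecv(&px[0],1,MPIU_SCALAR,rank-1,0,&left,1,MPIU_SCALAR,rank-1,0,comm,MPI_STATUS_IGNORE));
>   if (rank<size-1) PetscCallMPI(MPI_Sendrecv(&px[n-1],1,MPIU_SCALAR,rank+1,0,&right,1,MPIU_SCALAR,rank+1,0,comm,MPI_STATUS_IGNORE));
>   /* y_i = -x_{i-1} + 2*x_i - x_{i+1}, using the halos at the subdomain ends */
>   for (i=0;i<n;i++) {
>     PetscScalar xm = (i>0)   ? px[i-1] : left;
>     PetscScalar xp = (i<n-1) ? px[i+1] : right;
>     py[i] = -xm + 2.0*px[i] - xp;
>   }
>   PetscCall(VecRestoreArrayRead(x,&px));
>   PetscCall(VecRestoreArray(y,&py));
>   PetscFunctionReturn(0);
> }
>
> The actual example has more detail, but the communication pattern is the same.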
> 
> Jose
> 
> 
> > On 17 Jun 2022, at 17:21, Mario Rossi <dfatiac at gmail.com> wrote:
> > 
> > I need to find the largest eigenvalues (say the first three) of a very large matrix, and I am using
> > a combination of PETSc and SLEPc. In particular, I am using a shell matrix. I wrote a "custom"
> > matrix-vector product, and everything works fine in serial (one-task) mode for a "small" case.
> > For the real case, I need multiple tasks (at least 128) for memory reasons, so I need a parallel variant of the custom matrix-vector product. I know exactly how to write the parallel variant
> > (in plain MPI), but I am somewhat blocked because it is not clear to me what each task receives
> > and what it is expected to provide in the parallel matrix-vector product.
> > In more detail: with a single task, the function receives the full X vector and is expected to provide the full Y vector resulting from Y=A*X.
> > What happens with multiple tasks? If I understand correctly, in the matrix shell definition I can choose to split the matrix into blocks of rows, so that the matrix-vector function should compute a block of elements of the vector Y. But does it receive only the corresponding subset of X (the input vector)? (This is what I guess happens.) And on output, does
> > each task return its own subset of elements of Y as if it were the whole array, with PETSc then managing all the subsets? Does anyone have a working example of a parallel matrix-vector product for a shell matrix?
> > Thanks in advance for any help you can provide!
> > Mario
> > 
> 


