[petsc-users] Filling non-zero values of a Petsc matrix using numpy arrays with non-zero indices and values (petsc4py)

Barry Smith bsmith at petsc.dev
Sat Aug 19 13:28:10 CDT 2023


   The cost of the initial matrix creation to get the row starts and ends is trivial compared to the later computations, so is fine to retain.

> On Aug 19, 2023, at 10:51 AM, Erik Kneller via petsc-users <petsc-users at mcs.anl.gov> wrote:
> 
> Hi All,
> 
> Thank you for the recommendations. The first option provided by Matthew works in petsc4py but required some additional computations to re-organize information. I developed an approach that was easier to build within my current system using the option provided by Hong along with some other posts I found on the user forum. See below for an example of how I integrated Numba and petcs4py so the filling of matrix elements on each processor is done using the optimized machine code produced by Numba (you can eliminate the cleaning step if the number of non-zero elements for each row is well defined). Scaling and overall performance is now satisfactory. 
> 
> I do have another question. In order to make this work I create the Petsc matrix 'A' twice, first to get local ownership and second to define the local elements for a particular processor:
> 
> Algorithm
> ------------
> (1) Create a Petsc matrix 'A' and set size and type
> (2) Get the local row start and end for matrix 'A'
> (3) Define the local non-zero coefficients for the rows owned by processor using a Numba JIT-compiled loop and store result in a csr matrix defined using Scipy. 
> (4) Redefine Petsc matix 'A' using the the local csr elements define in step 3.
> (5) Begin and end assembly
> (6) Define RHS and initial solution vectors
> (7) Solve system 
>  (see code example below for details)
> 
> Steps (1) and (4) appear redundant and potentially sub-optimal from a performance perspective (perhaps due to my limited experience with petscy4py). Is there a better approach in terms of elegance and performance? Also, if this is the optimal syntax could someone elaborate on what exactly is occurring behind the seen in terms of memory allocation?
> 
> Thank you all again for your guidance and insights.
> 
> Modified Version of Dalcin's 2D Poisson Example with Numba
> ----------------------------------------------------------------------------------
> import sys
> from typing import Tuple
> import numpy.typing as npt
> import numpy as np
> from numba import njit
> import petsc4py
> petsc4py.init(sys.argv)
> from petsc4py import PETSc
> import scipy.sparse as sps
> 
> 
> def main() -> None:
>     xnodes = 8000
>     nnz_max = 10
>     
>     ynodes = xnodes
>     N = xnodes*ynodes
>     dx = 1.0/(xnodes + 1)
>     
>     A = PETSc.Mat()
>     A.create(comm=PETSc.COMM_WORLD)
>     A.setSizes((xnodes*ynodes, xnodes*ynodes))
>     A.setType(PETSc.Mat.Type.AIJ)
>     
>     rstart, rend = A.getOwnershipRange()
>     Ascipy = build_csr_matrix(
>         rstart, rend, nnz_max, dx, xnodes, ynodes)
>     csr=(
>         Ascipy.indptr[rstart:rend+1] - Ascipy.indptr[rstart],
>         Ascipy.indices[Ascipy.indptr[rstart]:Ascipy.indptr[rend]],
>         Ascipy.data[Ascipy.indptr[rstart]:Ascipy.indptr[rend]]
>         )
>     A = PETSc.Mat().createAIJ(size=(N,N), csr=csr)
>     
>     A.assemblyBegin()
>     A.assemblyEnd()
>     
>     ksp = PETSc.KSP()
>     ksp.create(comm=A.getComm())
>     ksp.setType(PETSc.KSP.Type.CG)
>     ksp.getPC().setType(PETSc.PC.Type.GAMG)
>     
>     ksp.setOperators(A)
>     ksp.setFromOptions()
>     
>     x, b = A.createVecs()
>     b.set(1.0)
>     
>     ksp.solve(b, x)
>     
>     
> def build_csr_matrix(
>         rstart:int, 
>         rend:int, 
>         nnz_max:int, 
>         dx:float, 
>         xnodes:int, 
>         ynodes:int
> ):
>     Anz, Arow, Acol = build_nonzero_arrays(
>         rstart, rend, nnz_max, dx, xnodes, ynodes
>         )
>     N = xnodes*ynodes
>     Ls = sps.csr_matrix((Anz, (Arow,Acol)), shape=(N,N), dtype=np.float64)
>     return Ls
>   
>   
> def build_nonzero_arrays(
>         rstart:int, 
>         rend:int, 
>         nnz_max:int, 
>         dx:float, 
>         xnodes:int, 
>         ynodes:int
> ) -> Tuple[
>     npt.NDArray[np.float64], 
>     npt.NDArray[np.float64], 
>     npt.NDArray[np.float64]
>     ]:
>     nrows_local = (rend - rstart) + 1
>     Anz_ini = np.zeros((nnz_max*nrows_local), dtype=np.float64)
>     Arow_ini = np.zeros((nnz_max*nrows_local), dtype=np.int32)
>     Acol_ini = np.zeros((nnz_max*nrows_local), dtype=np.int32)
>     icount_nz = define_nonzero_values(
>         rstart, rend, dx, xnodes, ynodes, Anz_ini, Arow_ini, Acol_ini
>         )
>     (
>         Anz, Arow, Acol
>     ) = clean_nonzero_arrays(icount_nz, Anz_ini, Arow_ini, Acol_ini)
>     return Anz, Arow, Acol
> 
> 
> @njit
> def define_nonzero_values(
>         rstart:int, 
>         rend:int,
>         dx:float,
>         xnodes:int,
>         ynodes:int,
>         Anz:npt.NDArray[np.float64],
>         Arow:npt.NDArray[np.int64],
>         Acol:npt.NDArray[np.int64]
> ) -> int:
>     """ Fill matrix A
>     """
>     icount_nz = 0
>     for row in range(rstart, rend):
>         i, j = index_to_grid(row, xnodes)
>         #A[row, row] = 4.0/dx**2
>         Anz[icount_nz] = 4.0/dx**2
>         Arow[icount_nz] = row
>         Acol[icount_nz] = row
>         icount_nz += 1
>         if i > 0:
>             column = row - xnodes
>             #A[row, column] = -1.0/dx**2
>             Anz[icount_nz] = -1.0/dx**2
>             Arow[icount_nz] = row
>             Acol[icount_nz] = column
>             icount_nz += 1
>         if i < xnodes - 1:
>             column = row + xnodes
>             #A[row, column] = -1.0/dx**2
>             Anz[icount_nz] = -1.0/dx**2
>             Arow[icount_nz] = row
>             Acol[icount_nz] = column
>             icount_nz += 1
>         if j > 0:
>             column = row - 1
>             #A[row, column] = -1.0/dx**2
>             Anz[icount_nz] = -1.0/dx**2
>             Arow[icount_nz] = row
>             Acol[icount_nz] = column
>             icount_nz += 1
>         if j < xnodes - 1:
>             column = row + 1
>             #A[row, column] = -1.0/dx**2
>             Anz[icount_nz] = -1.0/dx**2
>             Arow[icount_nz] = row
>             Acol[icount_nz] = column
>             icount_nz += 1
>     return icount_nz
> 
> 
> @njit
> def clean_nonzero_arrays(
>         icount_nz:int, 
>         Anz_ini:npt.NDArray[np.float64], 
>         Arow_ini:npt.NDArray[np.float64], 
>         Acol_ini:npt.NDArray[np.float64]
> ) -> Tuple[
>     npt.NDArray[np.float64], 
>     npt.NDArray[np.float64], 
>     npt.NDArray[np.float64]
>     ]:
>     Anz = np.zeros((icount_nz), dtype=np.float64)
>     Arow = np.zeros((icount_nz), dtype=np.int32)
>     Acol = np.zeros((icount_nz), dtype=np.int32)
>     for i in range(icount_nz):
>         Anz[i] = Anz_ini[i]
>         Arow[i] = Arow_ini[i]
>         Acol[i] = Acol_ini[i]
>     return Anz, Arow, Acol
> 
> 
> @njit
> def index_to_grid(r:int, n:int) -> Tuple[int,int]:
>     """Convert a row number into a grid point."""
>     return (r//n, r%n)
> 
> 
> if __name__ == "__main__":
>     main()
> 
> 
> On Friday, August 18, 2023 at 11:24:36 AM CDT, Zhang, Hong <hongzhang at anl.gov> wrote:
> 
> 
> You can use this to build a PETSc matrix with the index arrays ai,aj and the value array aa:
>     PETSc.Mat().createAIJ(size=(nrows,ncols), csr=(ai,aj,aa))
> 
> Hong (Mr.)
> 
> > On Aug 17, 2023, at 7:37 AM, Erik Kneller via petsc-users <petsc-users at mcs.anl.gov <mailto:petsc-users at mcs.anl.gov>> wrote:
> > 
> > Hi All,
> > 
> > I need to fill non-zero values of a Petsc matrix via petsc4py for the domain defined by A.getOwnershipRange() using three Numpy arrays: (1) array containing row indices of non-zero value, (2) array containing column indices of non-zero values and (3) array containing the non-zero matrix values. How can one perform this type of filling operation in petsc4py? The method A.setValues does not appear to allow this since it only works on an individual matrix element or a block of matrix elements.
> > 
> > I am using Numpy arrays since they can be computed in loops optimized using Numba on each processor. I also cannot pass the Petsc matrix to a Numba compiled function since type information cannot be inferred. I absolutely need to avoid looping in standard Python to define Petsc matrix elements due to performance issues. I also need to use a standard petscy4py method and avoid writing new C or Fortran wrappers to minimize language complexity.
> > 
> > Example Algorithm Building on Lisandro Dalcin's 2D Poisson Example:
> > ----------------------------------------------------------------------------------------------
> > comm = PETSc.COMM_WORLD
> > rank = comm.getRank()
> > 
> > dx = 1.0/(xnodes + 1) # xnodes is the number of nodes in the x and y-directions of the grid
> > nnz_max = 5 # max number of non-zero values per row
> > 
> > A = PETSc.Mat()
> > A.create(comm=PETSc.COMM_WORLD)
> > A.setSizes((xnodes*ynodes, xnodes*ynodes))
> > A.setType(PETSc.Mat.Type.AIJ)
> > A.setPreallocationNNZ(nnz_max)
> > 
> > rstart, rend = A.getOwnershipRange()
> > 
> > # Here Anz, Arow and Acol are vectors with size equal to the number of non-zero values 
> > Anz, Arow, Acol = build_nonzero_numpy_arrays_using_numba(rstart, rend, nnz_max, dx, xnodes, ynodes)
> > 
> > A.setValues(Arow, Acol, Anz) # <--- This does not work.
> > 
> > A.assemblyBegin()
> > A.assemblyEnd()
> > ksp = PETSc.KSP()
> > ksp.create(comm=A.getComm())
> > ksp.setType(PETSc.KSP.Type.CG)
> > ksp.getPC().setType(PETSc.PC.Type.GAMG)
> > ksp.setOperators(A)
> > ksp.setFromOptions()
> > x, b = A.createVecs()
> > b.set(1.0)
> > 
> > ksp.solve(b, x)
> > 
> > Regards,
> > Erik Kneller, Ph.D.
> > 
> 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20230819/1d6b39c8/attachment.html>


More information about the petsc-users mailing list