[petsc-users] Filling non-zero values of a Petsc matrix using numpy arrays with non-zero indices and values (petsc4py)

Erik Kneller erik.kneller at yahoo.com
Sat Aug 19 09:51:56 CDT 2023


 Hi All,
Thank you for the recommendations. The first option provided by Matthew works in petsc4py but required some additional computations to re-organize information. I developed an approach that was easier to build within my current system using the option provided by Hong along with some other posts I found on the user forum. See below for an example of how I integrated Numba and petcs4py so the filling of matrix elements on each processor is done using the optimized machine code produced by Numba (you can eliminate the cleaning step if the number of non-zero elements for each row is well defined). Scaling and overall performance is now satisfactory. 

I do have another question. In order to make this work I create the Petsc matrix 'A' twice, first to get local ownership and second to define the local elements for a particular processor:

Algorithm------------
(1) Create a Petsc matrix 'A' and set size and type(2) Get the local row start and end for matrix 'A'(3) Define the local non-zero coefficients for the rows owned by processor using a Numba JIT-compiled loop and store result in a csr matrix defined using Scipy. 
(4) Redefine Petsc matix 'A' using the the local csr elements define in step 3.(5) Begin and end assembly(6) Define RHS and initial solution vectors(7) Solve system 
 (see code example below for details)
Steps (1) and (4) appear redundant and potentially sub-optimal from a performance perspective (perhaps due to my limited experience with petscy4py). Is there a better approach in terms of elegance and performance? Also, if this is the optimal syntax could someone elaborate on what exactly is occurring behind the seen in terms of memory allocation?

Thank you all again for your guidance and insights.

Modified Version of Dalcin's 2D Poisson Example with Numba
----------------------------------------------------------------------------------
import sys
from typing import Tuple
import numpy.typing as npt
import numpy as np
from numba import njit
import petsc4py
petsc4py.init(sys.argv)
from petsc4py import PETSc
import scipy.sparse as sps


def main() -> None:
    xnodes = 8000
    nnz_max = 10
    
    ynodes = xnodes
    N = xnodes*ynodes
    dx = 1.0/(xnodes + 1)
    
    A = PETSc.Mat()
    A.create(comm=PETSc.COMM_WORLD)
    A.setSizes((xnodes*ynodes, xnodes*ynodes))
    A.setType(PETSc.Mat.Type.AIJ)
    
    rstart, rend = A.getOwnershipRange()
     Ascipy = build_csr_matrix(
        rstart, rend, nnz_max, dx, xnodes, ynodes)
    csr=(
        Ascipy.indptr[rstart:rend+1] - Ascipy.indptr[rstart],
        Ascipy.indices[Ascipy.indptr[rstart]:Ascipy.indptr[rend]],
        Ascipy.data[Ascipy.indptr[rstart]:Ascipy.indptr[rend]]
        )
    A = PETSc.Mat().createAIJ(size=(N,N), csr=csr)    
    A.assemblyBegin()
    A.assemblyEnd()
    
    ksp = PETSc.KSP()
    ksp.create(comm=A.getComm())
    ksp.setType(PETSc.KSP.Type.CG)
    ksp.getPC().setType(PETSc.PC.Type.GAMG)
    
    ksp.setOperators(A)
    ksp.setFromOptions()
    
    x, b = A.createVecs()
    b.set(1.0)
    
    ksp.solve(b, x)
    
    
def build_csr_matrix(
        rstart:int, 
        rend:int, 
        nnz_max:int, 
        dx:float, 
        xnodes:int, 
        ynodes:int
):
    Anz, Arow, Acol = build_nonzero_arrays(
        rstart, rend, nnz_max, dx, xnodes, ynodes
        )
    N = xnodes*ynodes
    Ls = sps.csr_matrix((Anz, (Arow,Acol)), shape=(N,N), dtype=np.float64)
    return Ls
  
  
def build_nonzero_arrays(
        rstart:int, 
        rend:int, 
        nnz_max:int, 
        dx:float, 
        xnodes:int, 
        ynodes:int
) -> Tuple[
    npt.NDArray[np.float64], 
    npt.NDArray[np.float64], 
    npt.NDArray[np.float64]
    ]:
    nrows_local = (rend - rstart) + 1
    Anz_ini = np.zeros((nnz_max*nrows_local), dtype=np.float64)
    Arow_ini = np.zeros((nnz_max*nrows_local), dtype=np.int32)
    Acol_ini = np.zeros((nnz_max*nrows_local), dtype=np.int32)
    icount_nz = define_nonzero_values(
        rstart, rend, dx, xnodes, ynodes, Anz_ini, Arow_ini, Acol_ini
        )
    (
        Anz, Arow, Acol
    ) = clean_nonzero_arrays(icount_nz, Anz_ini, Arow_ini, Acol_ini)
    return Anz, Arow, Acol


@njit
def define_nonzero_values(
        rstart:int, 
        rend:int,
        dx:float,
        xnodes:int,
        ynodes:int,
        Anz:npt.NDArray[np.float64],
        Arow:npt.NDArray[np.int64],
        Acol:npt.NDArray[np.int64]
) -> int:
    """ Fill matrix A
    """
    icount_nz = 0
    for row in range(rstart, rend):
        i, j = index_to_grid(row, xnodes)
        #A[row, row] = 4.0/dx**2
        Anz[icount_nz] = 4.0/dx**2
        Arow[icount_nz] = row
        Acol[icount_nz] = row
        icount_nz += 1
        if i > 0:
            column = row - xnodes
            #A[row, column] = -1.0/dx**2
            Anz[icount_nz] = -1.0/dx**2
            Arow[icount_nz] = row
            Acol[icount_nz] = column
            icount_nz += 1
        if i < xnodes - 1:
            column = row + xnodes
            #A[row, column] = -1.0/dx**2
            Anz[icount_nz] = -1.0/dx**2
            Arow[icount_nz] = row
            Acol[icount_nz] = column
            icount_nz += 1
        if j > 0:
            column = row - 1
            #A[row, column] = -1.0/dx**2
            Anz[icount_nz] = -1.0/dx**2
            Arow[icount_nz] = row
            Acol[icount_nz] = column
            icount_nz += 1
        if j < xnodes - 1:
            column = row + 1
            #A[row, column] = -1.0/dx**2
            Anz[icount_nz] = -1.0/dx**2
            Arow[icount_nz] = row
            Acol[icount_nz] = column
            icount_nz += 1
    return icount_nz


@njit
def clean_nonzero_arrays(
        icount_nz:int, 
        Anz_ini:npt.NDArray[np.float64], 
        Arow_ini:npt.NDArray[np.float64], 
        Acol_ini:npt.NDArray[np.float64]
) -> Tuple[
    npt.NDArray[np.float64], 
    npt.NDArray[np.float64], 
    npt.NDArray[np.float64]
    ]:
    Anz = np.zeros((icount_nz), dtype=np.float64)
    Arow = np.zeros((icount_nz), dtype=np.int32)
    Acol = np.zeros((icount_nz), dtype=np.int32)
    for i in range(icount_nz):
        Anz[i] = Anz_ini[i]
        Arow[i] = Arow_ini[i]
        Acol[i] = Acol_ini[i]
    return Anz, Arow, Acol


@njit
def index_to_grid(r:int, n:int) -> Tuple[int,int]:
    """Convert a row number into a grid point."""
    return (r//n, r%n)


if __name__ == "__main__":
    main()

    On Friday, August 18, 2023 at 11:24:36 AM CDT, Zhang, Hong <hongzhang at anl.gov> wrote:  
 
 You can use this to build a PETSc matrix with the index arrays ai,aj and the value array aa:
    PETSc.Mat().createAIJ(size=(nrows,ncols), csr=(ai,aj,aa))

Hong (Mr.)

> On Aug 17, 2023, at 7:37 AM, Erik Kneller via petsc-users <petsc-users at mcs.anl.gov> wrote:
> 
> Hi All,
> 
> I need to fill non-zero values of a Petsc matrix via petsc4py for the domain defined by A.getOwnershipRange() using three Numpy arrays: (1) array containing row indices of non-zero value, (2) array containing column indices of non-zero values and (3) array containing the non-zero matrix values. How can one perform this type of filling operation in petsc4py? The method A.setValues does not appear to allow this since it only works on an individual matrix element or a block of matrix elements.
> 
> I am using Numpy arrays since they can be computed in loops optimized using Numba on each processor. I also cannot pass the Petsc matrix to a Numba compiled function since type information cannot be inferred. I absolutely need to avoid looping in standard Python to define Petsc matrix elements due to performance issues. I also need to use a standard petscy4py method and avoid writing new C or Fortran wrappers to minimize language complexity.
> 
> Example Algorithm Building on Lisandro Dalcin's 2D Poisson Example:
> ----------------------------------------------------------------------------------------------
> comm = PETSc.COMM_WORLD
> rank = comm.getRank()
> 
> dx = 1.0/(xnodes + 1) # xnodes is the number of nodes in the x and y-directions of the grid
> nnz_max = 5 # max number of non-zero values per row
> 
> A = PETSc.Mat()
> A.create(comm=PETSc.COMM_WORLD)
> A.setSizes((xnodes*ynodes, xnodes*ynodes))
> A.setType(PETSc.Mat.Type.AIJ)
> A.setPreallocationNNZ(nnz_max)
> 
> rstart, rend = A.getOwnershipRange()
> 
> # Here Anz, Arow and Acol are vectors with size equal to the number of non-zero values 
> Anz, Arow, Acol = build_nonzero_numpy_arrays_using_numba(rstart, rend, nnz_max, dx, xnodes, ynodes)
> 
> A.setValues(Arow, Acol, Anz) # <--- This does not work.
> 
> A.assemblyBegin()
> A.assemblyEnd()
> ksp = PETSc.KSP()
> ksp.create(comm=A.getComm())
> ksp.setType(PETSc.KSP.Type.CG)
> ksp.getPC().setType(PETSc.PC.Type.GAMG)
> ksp.setOperators(A)
> ksp.setFromOptions()
> x, b = A.createVecs()
> b.set(1.0)
> 
> ksp.solve(b, x)
> 
> Regards,
> Erik Kneller, Ph.D.
> 

  
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20230819/b169564d/attachment-0001.html>


More information about the petsc-users mailing list