[petsc-users] HashMap Error when populating AIJCUSPARSE matrix

Matthew Knepley knepley at gmail.com
Thu Jan 18 08:33:24 CST 2024


On Thu, Jan 18, 2024 at 9:04 AM Yesypenko, Anna <anna at oden.utexas.edu>
wrote:

> Dear PETSc users/developers,
>
> I'm experiencing a bug when using petsc4py with GPU support. It may be my
> mistake in how I set up an AIJCUSPARSE matrix.
> For larger matrices, I sometimes encounter an error when assigning matrix
> values; the error is thrown in PetscHMapIJVQuerySet().
> Here is a minimal snippet that populates a sparse tridiagonal matrix.
>
> ```
> from petsc4py import PETSc
> from scipy.sparse import diags
> import numpy as np
>
> n = int(5e5)
>
> nnz = 3 * np.ones(n, dtype=np.int32)
> nnz[0] = nnz[-1] = 2
> A = PETSc.Mat(comm=PETSc.COMM_WORLD)
> A.createAIJ(size=[n,n],comm=PETSc.COMM_WORLD,nnz=nnz)
> A.setType('aijcusparse')
> tmp = diags([-1,2,-1],[-1,0,+1],shape=(n,n)).tocsr()
> A.setValuesCSR(tmp.indptr,tmp.indices,tmp.data)
> # The setValuesCSR call above is the line where the error is thrown.
> A.assemble()
> ```
>

I don't have scipy installed. Since the matrix is so small, can you
print tmp.indptr, tmp.indices, and tmp.data when you run? It looks like
either the values in those arrays are bad, or something is going wrong
when those pointers are passed.
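
For anyone reproducing this without scipy, here is a rough sketch of the kind
of check meant above. It is not part of PETSc or petsc4py: the helper names
build_tridiag_csr and check_csr are made up for illustration, and the code
only tests generic CSR invariants (row-pointer length and monotonicity,
column indices in range), which is an assumption about what "bad values"
might look like here.

```python
def build_tridiag_csr(n):
    """Hand-built CSR arrays for the [-1, 2, -1] tridiagonal matrix."""
    indptr, indices, data = [0], [], []
    for i in range(n):
        for j, v in ((i - 1, -1.0), (i, 2.0), (i + 1, -1.0)):
            if 0 <= j < n:  # skip entries that fall outside the matrix
                indices.append(j)
                data.append(v)
        indptr.append(len(indices))
    return indptr, indices, data


def check_csr(indptr, indices, data, nrows, ncols):
    """Return a list of problems found; an empty list means the arrays look sane."""
    problems = []
    if len(indptr) != nrows + 1:
        problems.append(f"indptr has length {len(indptr)}, expected {nrows + 1}")
        return problems  # the remaining checks index into indptr
    if indptr[0] != 0:
        problems.append("indptr[0] is not 0")
    if len(indices) != len(data):
        problems.append("indices and data have different lengths")
    if indptr[-1] != len(indices):
        problems.append("indptr[-1] does not match the number of stored entries")
    for i in range(nrows):
        if indptr[i + 1] < indptr[i]:
            problems.append(f"indptr decreases at row {i}")
            break
    for k, j in enumerate(indices):
        if j < 0 or j >= ncols:
            problems.append(f"indices[{k}] = {j} is outside [0, {ncols})")
            break
    return problems


if __name__ == "__main__":
    indptr, indices, data = build_tridiag_csr(6)
    print(indptr, indices, data)
    print(check_csr(indptr, indices, data, 6, 6))
```

Both helpers accept plain lists or NumPy arrays, so with the script from the
original post one could call check_csr(tmp.indptr, tmp.indices, tmp.data, n, n)
right before A.setValuesCSR(...) and print the result.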

  Thanks,

     Matt


> The error trace is below:
> ```
>   File "petsc4py/PETSc/Mat.pyx", line 2603, in petsc4py.PETSc.Mat.setValuesCSR
>   File "petsc4py/PETSc/petscmat.pxi", line 1039, in petsc4py.PETSc.matsetvalues_csr
>   File "petsc4py/PETSc/petscmat.pxi", line 1032, in petsc4py.PETSc.matsetvalues_ijv
> petsc4py.PETSc.Error: error code 76
> [0] MatSetValues() at /work/06368/annayesy/ls6/petsc/src/mat/interface/matrix.c:1497
> [0] MatSetValues_Seq_Hash() at /work/06368/annayesy/ls6/petsc/include/../src/mat/impls/aij/seq/seqhashmatsetvalues.h:52
> [0] PetscHMapIJVQuerySet() at /work/06368/annayesy/ls6/petsc/include/petsc/private/hashmapijv.h:10
> [0] Error in external library
> [0] [khash] Assertion: `ret >= 0' failed.
> ```
>
> If I rerun the same script a handful of times, it eventually runs without
> errors.
> Does anyone have insight into why it behaves this way? I'm running on a
> node with 3x NVIDIA A100 PCIE 40GB.
>
> Thank you!
> Anna
>


-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/