[petsc-users] Question on usage of PetscMalloc(Re)SetCUDAHost

Sajid Ali sajidsyed2021 at u.northwestern.edu
Tue Aug 25 18:46:31 CDT 2020

Hi PETSc-developers,

Is it valid to allocate matrix values on the host for later use on a GPU by
embedding all allocation logic (i.e., the code block that calls PetscMalloc1
for values and indices and sets them using MatSetValues) within a section
marked by PetscMalloc(Re)SetCUDAHost?

My understanding was that PetscMallocSetCUDAHost would make subsequent
mallocs happen on the host, but I’m getting the error shown below. For some
strange reason it is triggered at the 5th column of the 0th row (if that
helps), both when setting one value at a time and when setting the whole
0th row together:

[sajid at xrmlite cuda]$ mpirun -np 1 ~/packages/pirt/src/pirt -inputfile
PIRT -- Parallel Iterative Reconstruction Tomography
Reading in real data from shepplogan.h5
After loading data, nTau:100, nTheta:50
After detector geometry context initialization
Initialized PIRT
[0]PETSC ERROR: --------------------- Error Message
[0]PETSC ERROR: Error in external library
[0]PETSC ERROR: cuda error 1 (cudaErrorInvalidValue) : invalid argument
https://www.mcs.anl.gov/petsc/documentation/faq.html for trouble
[0]PETSC ERROR: Petsc Development GIT revision:
v3.13.2-947-gc2372adeb2  GIT Date: 2020-08-25 21:07:25 +0000
[0]PETSC ERROR: /home/sajid/packages/pirt/src/pirt on a
arch-linux-c-debug named xrmlite by sajid Tue Aug 25 18:30:55 2020
[0]PETSC ERROR: Configure options --with-hdf5=1 --with-cuda=1
[0]PETSC ERROR: #1 PetscCUDAHostFree() line 14 in
[0]PETSC ERROR: #2 PetscFreeA() line 475 in
[0]PETSC ERROR: #3 MatSeqXAIJFreeAIJ() line 135 in
[0]PETSC ERROR: #4 MatSetValues_SeqAIJ() line 498 in
[0]PETSC ERROR: #5 MatSetValues() line 1392 in
[0]PETSC ERROR: #6 setMatrixElements() line 248 in
[0]PETSC ERROR: #7 construct_matrix() line 91 in
[0]PETSC ERROR: #8 main() line 20 in /home/sajid/packages/pirt/src/pirt.cxx
[0]PETSC ERROR: PETSc Option Table entries:
[0]PETSC ERROR: -inputfile shepplogan.h5
[0]PETSC ERROR: ----------------End of Error Message -------send
entire error message to petsc-maint at mcs.anl.gov----------
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_SELF
with errorcode 20076.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
[sajid at xrmlite cuda]$

PetscCUDAHostFree is called inside the PetscMalloc(Re)SetCUDAHost block
described earlier, where every allocation should’ve produced valid memory on
the host.
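
One possible reading of the trace, offered as an assumption and not a
confirmed diagnosis: frames #3 and #4 show MatSetValues_SeqAIJ freeing
existing matrix storage, so if the Set call swaps a global malloc/free pair,
storage allocated before the block could end up being released by the CUDA
host free. The toy model below only illustrates that mismatch in plain C. All
names in it (plain_malloc, pinned_free, set_pinned, ...) are hypothetical
stand-ins, not PETSc or CUDA functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Each block is tagged with the allocator that produced it. */
typedef enum { PLAIN = 1, PINNED = 2 } alloc_kind;
typedef union { alloc_kind kind; max_align_t pad; } header;

static void *tagged_malloc(size_t n, alloc_kind kind) {
  header *h = malloc(sizeof(header) + n);
  if (h == NULL) return NULL;
  h->kind = kind;
  return h + 1; /* payload starts right after the tag */
}

/* Returns 0 on success, 1 if the block came from the other allocator
   (standing in for the "invalid argument" error in the trace). */
static int tagged_free(void *p, alloc_kind expected) {
  assert(p != NULL);
  header *h = (header *)p - 1;
  if (h->kind != expected) return 1; /* mismatch: block is left untouched */
  free(h);
  return 0;
}

static void *plain_malloc(size_t n)  { return tagged_malloc(n, PLAIN); }
static int   plain_free(void *p)     { return tagged_free(p, PLAIN); }
static void *pinned_malloc(size_t n) { return tagged_malloc(n, PINNED); }
static int   pinned_free(void *p)    { return tagged_free(p, PINNED); }

/* Global allocator pair that a Set/Reset call swaps (hypothetical model). */
static void *(*cur_malloc)(size_t) = plain_malloc;
static int (*cur_free)(void *) = plain_free;

static void set_pinned(void)   { cur_malloc = pinned_malloc; cur_free = pinned_free; }
static void reset_pinned(void) { cur_malloc = plain_malloc;  cur_free = plain_free; }

/* Storage allocated before the block, freed inside it. */
int demo_free_across_swap(void) {
  void *old_vals = cur_malloc(4 * sizeof(double));
  if (old_vals == NULL) return -1;
  set_pinned();
  int err = cur_free(old_vals); /* pinned free applied to a plain block */
  reset_pinned();
  if (err != 0) (void)cur_free(old_vals); /* release with the matching free */
  return err;
}

/* Storage allocated and freed inside the same block. */
int demo_free_inside_block(void) {
  set_pinned();
  void *vals = cur_malloc(4 * sizeof(double));
  int err = (vals != NULL) ? cur_free(vals) : -1;
  reset_pinned();
  return err;
}
```

In this model demo_free_across_swap() reports the mismatch while
demo_free_inside_block() succeeds, which is why it may matter whether the
matrix was created or preallocated before the Set call.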

Could someone explain whether this is the correct approach to take and what
the above error means?

(PS: I’ve run ksp tutorial-ex2 with -vec_type cuda -mat_type aijcusparse
to test the installation and everything works as expected.)

Thank You,
Sajid Ali | PhD Candidate
Applied Physics
Northwestern University