[petsc-users] issue of MatCreateDense in the CUDA codes

Junchao Zhang junchao.zhang at gmail.com
Wed Oct 2 17:12:04 CDT 2024


Hi, Langtian,
   Thanks for the configure.log and I now see what's wrong. Since you
compiled your code with nvcc,  we mistakenly thought petsc was configured
with cuda.
   It is fixed in https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/7909__;!!G_uCfscf7eWS!c2Xrs8q_dNGecGM-HpdhjKUScE6v8qSOSZa0ApqRhZJb2JUEhBq23ydXnFivVh45LPetYtrx-9xUkF_Z7VKbetWpWzW-$ ,
which will be in petsc/release and main.

   Thanks.
--Junchao Zhang


On Wed, Oct 2, 2024 at 3:11 PM 刘浪天 <langtian.liu at icloud.com> wrote:

> Hi Junchao,
>
> I check it, I haven't use cuda when install pure cpu version of petsc.
> The configure.log has been attached. Thank you for your reply.
>
> Best wishes,
> -------------------- Langtian Liu Institute for Theorectical Physics,
> Justus-Liebig-University Giessen Heinrich-Buff-Ring 16, 35392 Giessen
> Germany email: langtian.liu at icloud.com Tel: (+49)641 99 33342
>
> On Oct 2, 2024, at 5:05 PM, Junchao Zhang <junchao.zhang at gmail.com> wrote:
>
>
>
>
> On Wed, Oct 2, 2024 at 3:57 AM 刘浪天 via petsc-users <
> petsc-users at mcs.anl.gov> wrote:
>
>> Hi all,
>>
>> I am using the PETSc and SLEPc to solve the Faddeev equation of baryons.
>> I encounter a problem of function MatCreateDense when changing from CPU to
>> CPU-GPU computations.
>> At first, I write the codes in purely CPU computation in the following
>> way and it works.
>> ```
>>
>> Eigen::MatrixXcd H_KER
>> *;*Eigen::MatrixXcd G0*;*
>>
>> printf("\nCompute the propagator matrix.\n")
>> *;*prop_matrix_nucleon_sc_av(Mn*, *pp_nodes*, *cos1_nodes)
>> *;*printf("\nCompute the propagator matrix done.\n")
>> *;*printf("\nCompute the kernel matrix.\n")
>> *;*bse_kernel_nucleon_sc_av(Mn*, *pp_nodes*, *pp_weights*, *cos1_nodes*, *cos1_weights)
>> *;*printf("\nCompute the kernel matrix done.\n")
>> *;*printf("\nCompute the full kernel matrix by multiplying kernel and propagator matrix.\n")
>> *;*MatrixXcd kernel_temp = H_KER * G0
>> *;*printf("\nCompute the full kernel matrix done.\n")
>>
>> *;*// Solve the eigen system with SLEPc
>> printf("\nSolve the eigen system in the rest frame.\n")
>> *;*// Get the size of the Eigen matrix
>> int nRows = (int) kernel_temp.rows()
>> *;*int nCols = (int) kernel_temp.cols()*;*
>>
>> // Create PETSc matrix and share the data of kernel_temp
>> Mat kernel
>> *;*PetscCall(MatCreateDense(PETSC_COMM_WORLD*, *PETSC_DECIDE*, *PETSC_DECIDE*, *nRows*, *nCols*, *kernel_temp.data()*, *&kernel))
>> *;*PetscCall(MatAssemblyBegin(kernel*, *MAT_FINAL_ASSEMBLY))
>> *;*PetscCall(MatAssemblyEnd(kernel*, *MAT_FINAL_ASSEMBLY))*;*
>>
>> ```
>> Now I change to compute the propagator and kernel matrices in GPU and
>> then compute the largest eigenvalues in CPU using SLEPc in the ways below.
>> ```
>>
>> cuDoubleComplex *h_propmat
>> *;*cuDoubleComplex *h_kernelmat*;*
>>
>> int dim = EIGHT * NP * NZ
>> *;*printf("\nCompute the propagator matrix.\n")
>> *;*prop_matrix_nucleon_sc_av_cuda(Mn*, *pp_nodes.data()*, *cos1_nodes.data())
>> *;*printf("\nCompute the propagator matrix done.\n")
>> *;*printf("\nCompute the kernel matrix.\n")
>> *;*kernel_matrix_nucleon_sc_av_cuda(Mn*, *pp_nodes.data()*, *pp_weights.data()*, *cos1_nodes.data()*, *cos1_weights.data())
>> *;*printf("\nCompute the kernel matrix done.\n")
>> *;*printf("\nCompute the full kernel matrix by multiplying kernel and propagator matrix.\n")
>> *;*// Map the raw arrays to Eigen matrices (column-major order)
>> auto *h_kernel_temp = new cuDoubleComplex [dim*dim]
>> *;*matmul_cublas_cuDoubleComplex(h_kernelmat*,*h_propmat*,*h_kernel_temp*,*dim*,*dim*,*dim)
>> *;*printf("\nCompute the full kernel matrix done.\n")
>>
>> *;*// Solve the eigen system with SLEPc
>> printf("\nSolve the eigen system in the rest frame.\n")
>> *;*int nRows = dim
>> *;*int nCols = dim*;*
>>
>> // Create PETSc matrix and share the data of kernel_temp
>> Mat kernel
>> *;*auto* h_kernel = (std::complex<double>*)(h_kernel_temp)
>> *;*PetscCall(MatCreateDense(PETSC_COMM_WORLD*, *PETSC_DECIDE*, *PETSC_DECIDE*, *nRows*, *nCols*, *h_kernel_temp*, *&kernel))
>> *;*PetscCall(MatAssemblyBegin(kernel*, **MAT_FINAL_ASSEMBLY*))
>> *;*PetscCall(MatAssemblyEnd(kernel*, **MAT_FINAL_ASSEMBLY*))*;*
>>
>> But in this case, the compiler told me that the MatCreateDense function
>> uses the data pointer as type of "thrust::complex<double>" instead of
>> "std::complex<double>".
>> I am sure I only configured and install PETSc in purely CPU without GPU
>> and this codes are written in the host function.
>>
> Please double check that your PETSc was purely CPU configured.  You can
> find it at the end of your configure.log to see if petsc is configured with
> CUDA.
> Since *thrust::complex<double> *is a result of a petsc/cuda
> configuration,  I have this doubt.
>
>
>
>>
>> Why the function changes its behavior? Did you also meet this problem
>> when writing the cuda codes and how to solve this problem.
>> I tried to copy the data to a new thrust::complex<double> matrix but this
>> is very time consuming since my matrix is very big. Is there a way to
>> create the Mat from the original data without changing the data type to
>> thrust::complex<double> in the cuda applications? Any response will be
>> appreciated. Thank you!
>>
>> Best wishes,
>> Langtian Liu
>>
>> ------
>> Institute for Theorectical Physics, Justus-Liebig-University Giessen
>> Heinrich-Buff-Ring 16, 35392 Giessen Germany
>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20241002/bd0e8d2c/attachment-0001.html>


More information about the petsc-users mailing list