[petsc-users] Error while running with -pc_type gamg on NVIDIA GPU

Junchao Zhang junchao.zhang at gmail.com
Wed Sep 13 13:42:08 CDT 2023


Hi, Maruthi,
   I could not reproduce it. I used the attached, slightly modified code
(note the added PetscFunctionBeginUser; and
PetscFunctionReturn(PETSC_SUCCESS); in bc_const_temp_both_sides).
   Could you try it?
--Junchao Zhang


On Wed, Sep 13, 2023 at 1:07 AM Maruthi NH <maruthinh at gmail.com> wrote:

> Hi Junchao Zhang,
>
> I also built PETSc with --download-hypre, and I too have version 2.29.0.
> I am not sure why I am facing this issue. This is the configuration file I
> created for building PETSc.
>
> #!/usr/bin/env python3
>
> import os
> petsc_hash_pkgs=os.path.join(os.getenv('HOME'),'petsc-hash-pkgs')
>
> if __name__ == '__main__':
>   import sys
>   import os
>   sys.path.insert(0, os.path.abspath('config'))
>   import configure
>   configure_options = [
>     '--package-prefix-hash='+petsc_hash_pkgs,
>     '--with-debugging=0',
>     '--with-shared-libraries=0',
>     '--with-cc=mpiicc',
>     '--with-cxx=mpiicpc',
>     '--with-fc=mpiifort',
>     '--with-mpiexec=mpiexec.hydra',
>     '--with-cudac=nvcc', # nvcc-12.2 via ccache
>     '--with-cuda-dir=/usr/local/cuda-12.2',
>     # Intel compilers enable GCC/clang's equivalent of -ffast-math *by default*.
>     # This is bananas, so we make sure they use the same model as everyone else
>     'COPTFLAGS=-O3 -fPIE -fp-model=precise',
>     'FOPTFLAGS=-O3 -fPIE -fp-model=precise',
>     'CXXOPTFLAGS=-O3 -fPIE -fp-model=precise',
>     'CUDAOPTFLAGS=-O3',
>     '--with-precision=double',
>     '--with-blaslapack-dir='+os.environ['MKLROOT'],
>     '--with-mkl_pardiso-dir='+os.environ['MKLROOT'],
>     '--with-mkl_cpardiso-dir='+os.environ['MKLROOT'],
>     '--download-hypre',
>   ]
>   configure.petsc_configure(configure_options)
>
> Regards,
> Maruthi
>
> On Tue, Sep 12, 2023 at 10:20 PM Junchao Zhang <junchao.zhang at gmail.com>
> wrote:
>
>> Which version of hypre do you use?  I used petsc's --download-hypre,
>> which automatically installed hypre-2.29.0. I did not see the error.
>>
>> $ ./heat_diff_cu -ksp_type gmres -pc_type hypre -pc_hypre_type boomeramg
>> -use_gpu_aware_mpi 0 -mat_type aijcusparse -vec_type cuda
>> The start and end indices of mat for each rank: 0       100
>> Total time taken for KSP solve: rank: 0 0.0112464
>>
>> --Junchao Zhang
>>
>>
>> On Tue, Sep 12, 2023 at 10:15 AM Maruthi NH <maruthinh at gmail.com> wrote:
>>
>>> Hi Junchao Zhang,
>>>
>>> Thanks for the help. Updating PETSc fixed the problem. However, if I use
>>> boomeramg from hypre as follows, I get a similar error.
>>>
>>> ./heat_diff_cu --ksp_type gmres -pc_type hypre -pc_hypre_type boomeramg
>>> -use_gpu_aware_mpi 0 -mat_type aijcusparse -vec_type cuda
>>>
>>> ** On entry to cusparseCreateCsr() parameter number 5 (csrRowOffsets)
>>> had an illegal value: NULL pointer
>>>
>>>     CUSPARSE ERROR (code = 3, invalid value) at csr_matrix_cuda_utils.c:57
>>>
>>> Regards,
>>> Maruthi
>>>
>>>
>>>
>>>
>>>
>>> On Tue, Sep 12, 2023 at 1:09 AM Junchao Zhang <junchao.zhang at gmail.com>
>>> wrote:
>>>
>>>> Hi, Maruthi,
>>>>   I could run your example on my machine.  BTW,  I added these at the
>>>> end of main() to free petsc objects.
>>>> VecDestroy(&vout);
>>>> VecDestroy(&x);
>>>> VecDestroy(&b);
>>>> VecDestroy(&u);
>>>> MatDestroy(&A);
>>>> VecScatterDestroy(&ctx);
>>>> KSPDestroy(&ksp);
>>>>
>>>> If you use cuda-12.2, the problem may already be fixed by MR
>>>> https://gitlab.com/petsc/petsc/-/merge_requests/6828
>>>> You can try the petsc/main branch. Note that your PETSc version is
>>>> dated 2023-08-13.
>>>>
>>>> Thanks.
>>>> --Junchao Zhang
>>>>
>>>>
>>>> On Mon, Sep 11, 2023 at 12:10 PM Maruthi NH <maruthinh at gmail.com>
>>>> wrote:
>>>>
>>>>> Hi Barry Smith,
>>>>>
>>>>> Thanks for the quick response.
>>>>>
>>>>> Here is the code I used to test PETSc on the GPU.
>>>>> This is the command I used to run it:
>>>>> mpiexec.hydra -n 1 ./heat_diff_cu -Nx 10000000 -ksp_type gmres
>>>>> -mat_type aijcusparse -vec_type cuda -use_gpu_aware_mpi 0 -pc_type gamg
>>>>> -ksp_converged_reason
>>>>>
>>>>> Regards,
>>>>> Maruthi
>>>>>
>>>>>
>>>>>
>>>>> On Sun, Sep 10, 2023 at 11:37 PM Barry Smith <bsmith at petsc.dev> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On Sep 10, 2023, at 5:54 AM, Maruthi NH <maruthinh at gmail.com> wrote:
>>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I am trying to accelerate the linear solver with the PETSc GPU backend.
>>>>>> For testing, I have a simple 1D heat diffusion solver. Here are some
>>>>>> observations.
>>>>>> 1. If I use -pc_type gamg, it throws the following error:
>>>>>>  ** On entry to cusparseCreateCsr() parameter number 5
>>>>>> (csrRowOffsets) had an illegal value: NULL pointer
>>>>>>
>>>>>> [0]PETSC ERROR: --------------------- Error Message
>>>>>> --------------------------------------------------------------
>>>>>> [0]PETSC ERROR: GPU error
>>>>>> [0]PETSC ERROR: cuSPARSE errorcode 3 (CUSPARSE_STATUS_INVALID_VALUE)
>>>>>> : invalid value
>>>>>> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble
>>>>>> shooting.
>>>>>> [0]PETSC ERROR: Petsc Development GIT revision:
>>>>>> v3.19.4-959-g92f1e92e88  GIT Date: 2023-08-13 19:43:04 +0000
>>>>>>
>>>>>>    Can you share the code that triggers this?
>>>>>>
>>>>>> 2. The default PC (ILU) takes about 1.2 seconds on a single CPU but
>>>>>> about 105.9 seconds on a GPU. I see similar behavior with -pc_type asm.
>>>>>> I have an NVIDIA RTX A2000 8GB Laptop GPU.
>>>>>>
>>>>>>
>>>>>>   This is expected. The triangular solves run sequentially on the GPU,
>>>>>> so they are naturally extremely slow; they cannot take advantage of
>>>>>> the GPU's massive parallelism.
>>>>>>
>>>>>>
>>>>>> 3. What could I be missing? Also, are there any general guidelines
>>>>>> for better GPU performance using PETSc?
>>>>>>
>>>>>> Regards,
>>>>>> Maruthi
>>>>>>
>>>>>>
>>>>>>
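
To illustrate the point quoted above about ILU's triangular solves being slow on a GPU, here is a minimal Python sketch. It is not PETSc's implementation: the dense list-of-lists storage, the helper names (forward_substitution, dependency_levels, bidiagonal_lower) and the matrix values are all assumptions made for illustration. It shows that forward substitution on the bidiagonal lower factor you get from a 1D three-point stencil forms one long dependency chain, so no two rows can be solved at the same time.

```python
def forward_substitution(L, b):
    """Solve L x = b for a dense lower-triangular L (list of rows)."""
    n = len(b)
    x = [0.0] * n
    for i in range(n):
        # x[i] cannot be computed until every x[j] (j < i) with
        # L[i][j] != 0 is already known: this is the serial dependency.
        s = sum(L[i][j] * x[j] for j in range(i))
        x[i] = (b[i] - s) / L[i][i]
    return x


def dependency_levels(L):
    """Count the sequential stages needed when rows that do not depend
    on each other are solved together (the idea behind level scheduling)."""
    n = len(L)
    level = [0] * n
    for i in range(n):
        deps = [level[j] for j in range(i) if L[i][j] != 0.0]
        level[i] = 1 + max(deps, default=0)
    return max(level, default=0)


def bidiagonal_lower(n, diag=2.0, sub=-1.0):
    """Lower bidiagonal matrix; the values are illustrative only."""
    return [[diag if j == i else sub if j == i - 1 else 0.0
             for j in range(n)]
            for i in range(n)]


if __name__ == "__main__":
    n = 8
    L = bidiagonal_lower(n)
    print("stages for bidiagonal factor:", dependency_levels(L), "of", n, "rows")
    D = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    print("stages for diagonal matrix:  ", dependency_levels(D))
```

For the bidiagonal factor, the number of stages equals the number of rows, so each stage has a single row of work. A diagonal matrix needs only one stage, which is why preconditioners built from independent per-row work (Jacobi, for example) map much better onto a GPU.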
-------------- next part --------------
A non-text attachment was scrubbed...
Name: heat_diff_cu.cu
Type: application/octet-stream
Size: 6990 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20230913/05ab241e/attachment.obj>

