[petsc-users] hypre on gpus

Stefano Zampini stefano.zampini at gmail.com
Mon Nov 22 08:45:56 CST 2021


You don't need to specify the HYPRE commit. Remove
--download-hypre-commit=origin/hypre_petsc
from the configuration options

Il giorno lun 22 nov 2021 alle ore 17:29 Matthew Knepley <knepley at gmail.com>
ha scritto:

> On Mon, Nov 22, 2021 at 8:50 AM Karthikeyan Chockalingam - STFC UKRI <
> karthikeyan.chockalingam at stfc.ac.uk> wrote:
>
>> Hi Matt,
>>
>>
>>
>> Below is the entire error message:
>>
>
> I cannot reproduce this:
>
> main $:/PETSc3/petsc/petsc-dev/src/ksp/ksp/tutorials$ ./ex4 -ksp_view
> -ksp_type cg -mat_type hypre -pc_type hypre
> KSP Object: 1 MPI processes
>   type: cg
>   maximum iterations=10000, initial guess is zero
>   tolerances:  relative=0.000138889, absolute=1e-50, divergence=10000.
>   left preconditioning
>   using PRECONDITIONED norm type for convergence test
> PC Object: 1 MPI processes
>   type: hypre
>     HYPRE BoomerAMG preconditioning
>       Cycle type V
>       Maximum number of levels 25
>       Maximum number of iterations PER hypre call 1
>       Convergence tolerance PER hypre call 0.
>       Threshold for strong coupling 0.25
>       Interpolation truncation factor 0.
>       Interpolation: max elements per row 0
>       Number of levels of aggressive coarsening 0
>       Number of paths for aggressive coarsening 1
>       Maximum row sums 0.9
>       Sweeps down         1
>       Sweeps up           1
>       Sweeps on coarse    1
>       Relax down          symmetric-SOR/Jacobi
>       Relax up            symmetric-SOR/Jacobi
>       Relax on coarse     Gaussian-elimination
>       Relax weight  (all)      1.
>       Outer relax weight (all) 1.
>       Using CF-relaxation
>       Not using more complex smoothers.
>       Measure type        local
>       Coarsen type        Falgout
>       Interpolation type  classical
>   linear system matrix = precond matrix:
>   Mat Object: 1 MPI processes
>     type: hypre
>     rows=56, cols=56
> Norm of error 8.69801e-05 iterations 2
>
> This is on the 'main' branch.  So either there is some bug in release, or
> something is strange on your end. Since we run Hypre tests for the CI,
> I am leaning toward the latter. Can you try the 'main' branch? We will
> have to use this anyway if we want any fixes.
>
>   Thanks,
>
>     Matt
>
>
>>  *[0]PETSC ERROR: --------------------- Error Message
>> --------------------------------------------------------------*
>>
>> [0]PETSC ERROR: Object is in wrong state
>>
>> [0]PETSC ERROR: Must call MatXXXSetPreallocation(), MatSetUp() or the
>> matrix has not yet been factored on argument 1 "mat" before
>> MatGetOwnershipRange()
>>
>> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting.
>>
>> [0]PETSC ERROR: Petsc Release Version 3.16.0, Sep 29, 2021
>>
>> [0]PETSC ERROR: ./ex4 on a  named sqg2b4.bullx by kxc07-lxm25 Mon Nov 22
>> 11:33:41 2021
>>
>> [0]PETSC ERROR: Configure options
>> --prefix=/lustre/scafellpike/local/apps/gcc7/petsc/3.16.0-cuda11.2
>> --with-debugging=yes
>> --with-blaslapack-dir=/lustre/scafellpike/local/apps/intel/intel_cs/2018.0.128/mkl
>> --with-cuda=1 --with-cuda-arch=70 --download-hypre=yes
>> --download-hypre-configure-arguments=HYPRE_CUDA_SM=70
>> --download-hypre-commit=origin/hypre_petsc --with-shared-libraries=1
>> --known-mpi-shared-libraries=1 --with-cc=mpicc --with-cxx=mpicxx
>> -with-fc=mpif90
>>
>> [0]PETSC ERROR: #1 MatGetOwnershipRange() at
>> /netfs/smain01/scafellpike/local/package_build/build/rja87-build/petsc-cuda-3.16.0/src/mat/interface/matrix.c:6784
>>
>> [0]PETSC ERROR: #2 main() at ex4.c:40
>>
>> [0]PETSC ERROR: PETSc Option Table entries:
>>
>> [0]PETSC ERROR: -ksp_type cg
>>
>> [0]PETSC ERROR: -ksp_view
>>
>> [0]PETSC ERROR: -mat_type hypre
>>
>> [0]PETSC ERROR: -pc_type hypre
>>
>> [0]PETSC ERROR: -use_gpu_aware_mpi 0
>>
>> *[0]PETSC ERROR: ----------------End of Error Message -------send entire
>> error message to petsc-maint at mcs.anl.gov----------*
>>
>> --------------------------------------------------------------------------
>>
>>
>>
>> I have also attached the make.log. Thank you for having a look.
>>
>>
>>
>> Best,
>>
>> Karthik.
>>
>>
>>
>> *From: *Matthew Knepley <knepley at gmail.com>
>> *Date: *Monday, 22 November 2021 at 13:41
>> *To: *"Chockalingam, Karthikeyan (STFC,DL,HC)" <
>> karthikeyan.chockalingam at stfc.ac.uk>
>> *Cc: *Mark Adams <mfadams at lbl.gov>, "petsc-users at mcs.anl.gov" <
>> petsc-users at mcs.anl.gov>
>> *Subject: *Re: [petsc-users] hypre on gpus
>>
>>
>>
>> On Mon, Nov 22, 2021 at 6:47 AM Karthikeyan Chockalingam - STFC UKRI <
>> karthikeyan.chockalingam at stfc.ac.uk> wrote:
>>
>> Thank you for your response. I tried to run the same example
>>
>>
>>
>> petsc/src/ksp/ksp/tutorials$  *./ex4 -ksp_type cg -mat_type hypre
>> -ksp_view -pc_type hypre*
>>
>>
>>
>> but it crashed with the below error
>>
>>
>>
>> *[0]PETSC ERROR: --------------------- Error Message
>> --------------------------------------------------------------*
>>
>> [0]PETSC ERROR: Object is in wrong state
>>
>> [0]PETSC ERROR: Must call MatXXXSetPreallocation(), MatSetUp() or the
>> matrix has not yet been factored on argument 1 "mat" before
>> MatGetOwnershipRange()
>>
>>
>>
>> Hi Karthik,
>>
>>
>>
>> Please do not clip the error message. We need the entire output. This
>> seems strange that you would get a logic error,
>>
>> since that should be the same on any architecture. So could you also send
>> the make.log?
>>
>>
>>
>>   Thanks,
>>
>>
>>
>>     Matt
>>
>>
>>
>> Below are the options used to configure hypre with cuda support. Do you
>> spot any mistakes?
>>
>>
>>
>> --with-blaslapack-dir=/lustre/scafellpike/local/apps/intel/intel_cs/2018.0.128/mkl
>>
>>
>> --with-cuda=1
>>
>> --with-cuda-arch=70
>>
>> --download-hypre=yes
>>
>> --download-hypre-configure-arguments=HYPRE_CUDA_SM=70
>>
>> --download-hypre-commit=origin/hypre_petsc
>>
>> --with-shared-libraries=1
>>
>> --known-mpi-shared-libraries=1
>>
>> --with-cc=mpicc
>>
>> --with-cxx=mpicxx
>>
>> -with-fc=mpif90
>>
>>
>>
>> Best,
>>
>> Karthik.
>>
>>
>>
>>
>>
>> *From: *Mark Adams <mfadams at lbl.gov>
>> *Date: *Friday, 19 November 2021 at 16:31
>> *To: *"Chockalingam, Karthikeyan (STFC,DL,HC)" <
>> karthikeyan.chockalingam at stfc.ac.uk>
>> *Cc: *"petsc-users at mcs.anl.gov" <petsc-users at mcs.anl.gov>
>> *Subject: *Re: [petsc-users] hypre on gpus
>>
>>
>>
>> You should run with -options_left to check that your options are being
>> used. It may be -mat_type hypre.
>>
>>
>>
>> I have tested this:
>>
>>
>>
>> petsc/src/ksp/ksp/tutorials$ *srun -n2 ./ex4 -ksp_type cg -mat_type
>> hypre -ksp_view -pc_type hypre*
>>
>>
>>
>> You can add -log_view and that will print performance data for each
>> method like KSPSolve.
>>
>>
>>
>> If you are configured with a GPU there will be some extra columns that
>> give the percent of flops on the GPU.
>>
>>
>>
>> In the past hypre has not registered flops with us, but I do get flops
>> from hypre now, and -ksp_view showed that it did indeed use hypre.
>>
>> I saw that the flops were 100% GPU in hypre.
>>
>>
>>
>>
>>
>> On Fri, Nov 19, 2021 at 10:47 AM Karthikeyan Chockalingam - STFC UKRI <
>> karthikeyan.chockalingam at stfc.ac.uk> wrote:
>>
>> Hello,
>>
>>
>>
>> I tried to solve a 3D Poisson (ksp/tutorial/ex45) using -pc_type hypre on
>> gpus
>>
>>
>>
>> ./ex45 -da_grid_x 128 -da_grid_y 128 -da_grid_z 128 -dm_mat_type hypre
>> -dm_vec_type cuda -ksp_type cg -pc_type hypre -pc_hypre_type  boomeramg
>> -ksp_monitor -log_view
>>
>>
>>
>> I profiled the run using NSYS -  attached you find all the relevant
>> files.
>>
>> Looking at the profile I doubt if hypre is running on gpus. The cuda
>> kernels are barely active.
>>
>> I don’t see any cuda kernel relevant to solve.
>>
>> Is my assessment correct? How can I verify if hypre indeed is running on
>> gpus?
>>
>>
>>
>> Best,
>>
>> Karthik.
>>
>>
>>
>> *From: *Mark Adams <mfadams at lbl.gov>
>> *Date: *Friday, 8 October 2021 at 18:47
>> *To: *"Chockalingam, Karthikeyan (STFC,DL,HC)" <
>> karthikeyan.chockalingam at stfc.ac.uk>
>> *Cc: *"petsc-users at mcs.anl.gov" <petsc-users at mcs.anl.gov>
>> *Subject: *Re: [petsc-users] hypre on gpus
>>
>>
>>
>> I think you would want to use 'cuda' vec_type, but I .
>>
>> You might ask Hypre how one verifies that the GPU is used.
>>
>> Mark
>>
>>
>>
>> On Fri, Oct 8, 2021 at 1:35 PM Karthikeyan Chockalingam - STFC UKRI <
>> karthikeyan.chockalingam at stfc.ac.uk> wrote:
>>
>> Yes, I used it for both cpu and gpu. Is that not okay?
>>
>>
>>
>> For gpu: -dm_mat_type hypre -dm_vec_type mpicuda
>>
>>
>>
>> For cpu:  -dm_mat_type hypre -dm_vec_type mpi
>>
>>
>>
>> *From: *Mark Adams <mfadams at lbl.gov>
>> *Date: *Friday, 8 October 2021 at 18:28
>> *To: *"Chockalingam, Karthikeyan (STFC,DL,HC)" <
>> karthikeyan.chockalingam at stfc.ac.uk>
>> *Cc: *"petsc-users at mcs.anl.gov" <petsc-users at mcs.anl.gov>
>> *Subject: *Re: [petsc-users] hypre on gpus
>>
>>
>>
>> Did you use -dm_mat_type hypre on the GPU case ?
>>
>>
>>
>> On Fri, Oct 8, 2021 at 12:19 PM Karthikeyan Chockalingam - STFC UKRI <
>> karthikeyan.chockalingam at stfc.ac.uk> wrote:
>>
>> I tried a different exercise ran the same problem on two cpu cores and on
>> two gpu:
>>
>>
>>
>> On gpu
>>
>>
>>
>> PCApply                6 1.0 6.0335e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
>> 6.0e+00 15  0  0  0  1  15  0  0  0  1     0       0      0 0.00e+00    5
>> 9.65e+01  0
>>
>>
>>
>> and on cpu
>>
>>
>>
>> PCApply                6 1.0 5.6348e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
>> 0.0e+00 16  0  0  0  0  16  0  0  0  0     0
>>
>>
>>
>> timings again are close but gpu version did a reduction 6.0e+00 but the
>> cpu version did not 0.0e+00.
>>
>> I am not sure if that is any indication if hypre ran on gpus?
>>
>>
>>
>> Thanks,
>>
>> Karthik.
>>
>>
>>
>>
>>
>> *From: *Mark Adams <mfadams at lbl.gov>
>> *Date: *Friday, 8 October 2021 at 16:36
>> *To: *"Chockalingam, Karthikeyan (STFC,DL,HC)" <
>> karthikeyan.chockalingam at stfc.ac.uk>
>> *Cc: *"petsc-users at mcs.anl.gov" <petsc-users at mcs.anl.gov>
>> *Subject: *Re: [petsc-users] hypre on gpus
>>
>>
>>
>>
>>
>>
>>
>> On Fri, Oct 8, 2021 at 10:29 AM Karthikeyan Chockalingam - STFC UKRI <
>> karthikeyan.chockalingam at stfc.ac.uk> wrote:
>>
>> The PCApply timing on
>>
>>
>>
>> gpu
>>
>>
>>
>> PCApply                6 1.0 1.0235e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
>> 0.0e+00 39  0  0  0  0  39  0  0  0  0     0       0      0 0.00e+00    0
>> 0.00e+00  0
>>
>>
>>
>> and cpu
>>
>>
>>
>> PCApply                6 1.0 1.0242e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
>> 0.0e+00 41  0  0  0  0  41  0  0  0  0     0
>>
>>
>>
>>
>>
>> You don't have GPUs. probably.
>>
>> Use -dm_mat_type hypre.
>>
>>
>>
>> are close. It is hard for me tell if hypre on gpu is on or not.
>>
>>
>>
>> Best,
>>
>> Karthik.
>>
>>
>>
>>
>>
>> *From: *"Chockalingam, Karthikeyan (STFC,DL,HC)" <
>> karthikeyan.chockalingam at stfc.ac.uk>
>> *Date: *Friday, 8 October 2021 at 14:55
>> *To: *Mark Adams <mfadams at lbl.gov>
>> *Cc: *"petsc-users at mcs.anl.gov" <petsc-users at mcs.anl.gov>
>> *Subject: *Re: [petsc-users] hypre on gpus
>>
>>
>>
>> Thanks Mark, I will try your recommendations.
>>
>> Should I also change -dm_vec_type to hypre currently I have it as
>> mpicuda?
>>
>>
>>
>> Karthik.
>>
>>
>>
>>
>>
>> *From: *Mark Adams <mfadams at lbl.gov>
>> *Date: *Friday, 8 October 2021 at 14:33
>> *To: *"Chockalingam, Karthikeyan (STFC,DL,HC)" <
>> karthikeyan.chockalingam at stfc.ac.uk>
>> *Cc: *"petsc-users at mcs.anl.gov" <petsc-users at mcs.anl.gov>
>> *Subject: *Re: [petsc-users] hypre on gpus
>>
>>
>>
>> Hypre does not record its flops with PETSc's timers.
>>
>> Configure with and without CUDA and see if the timings change in PCApply.
>>
>> Hypre does not dynamically switch between CUDA and CPU solves at
>> this time, but you want to use -dm_mat_type hypre.
>>
>> Mark
>>
>>
>>
>> On Fri, Oct 8, 2021 at 6:59 AM Karthikeyan Chockalingam - STFC UKRI <
>> karthikeyan.chockalingam at stfc.ac.uk> wrote:
>>
>> Hello,
>>
>>
>>
>> I am trying to run ex45 (in KSP tutorial) using hypre on gpus. I have
>> attached the python configuration file and -log_view output from running
>> the below command options
>>
>>
>>
>> mpirun -n 2 ./ex45 -log_view -da_grid_x 169 -da_grid_y 169 -da_grid_z
>> 169  -dm_mat_type mpiaijcusparse -dm_vec_type mpicuda -ksp_type gmres
>> -pc_type hypre -pc_hypre_type  boomeramg -ksp_gmres_restart 31
>> -pc_hypre_boomeramg_strong_threshold 0.7  -ksp_monitor
>>
>>
>>
>> The problem was solved and converged but from the output file I suspect
>> hypre is not running on gpus as PCApply and DMCreate does *not* record
>> any gpu Mflop/s. However, some events such KSPSolve, MatMult etc are
>> running on gpus.
>>
>>
>>
>> Can you please let me know if I need to add any extra flag to the
>> attached arch-ci-linux-cuda11-double-xx.py script file to get hypre working
>> on gpus?
>>
>>
>>
>> Thanks,
>>
>> Karthik.
>>
>>
>>
>>
>>
>> This email and any attachments are intended solely for the use of the
>> named recipients. If you are not the intended recipient you must not use,
>> disclose, copy or distribute this email or any of its attachments and
>> should notify the sender immediately and delete this email from your
>> system. UK Research and Innovation (UKRI) has taken every reasonable
>> precaution to minimise risk of this email or any attachments containing
>> viruses or malware but the recipient should carry out its own virus and
>> malware checks before opening the attachments. UKRI does not accept any
>> liability for any losses or damages which the recipient may sustain due to
>> presence of any viruses.
>>
>>
>>
>>
>> --
>>
>> What most experimenters take for granted before they begin their
>> experiments is infinitely more interesting than any results to which their
>> experiments lead.
>> -- Norbert Wiener
>>
>>
>>
>> https://www.cse.buffalo.edu/~knepley/
>> <http://www.cse.buffalo.edu/~knepley/>
>>
>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
> <http://www.cse.buffalo.edu/~knepley/>
>


-- 
Stefano
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20211122/0594969f/attachment-0001.html>


More information about the petsc-users mailing list