[petsc-users] hypre on gpus

Matthew Knepley knepley at gmail.com
Mon Nov 22 09:25:39 CST 2021


On Mon, Nov 22, 2021 at 10:20 AM Karthikeyan Chockalingam - STFC UKRI <
karthikeyan.chockalingam at stfc.ac.uk> wrote:

> Thanks Stefano,
>

I just did a clean build with no problems. Can you try a clean build of
main?
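
In case it helps, a rough sketch of trying main (not verified on your system;
adjust the configure options to your machine - the CUDA/hypre flags below are
just the ones from your own configure line):

  git clone -b main https://gitlab.com/petsc/petsc.git petsc-main
  cd petsc-main
  ./configure --with-cuda=1 --with-cuda-arch=70 --download-hypre=1 \
      --download-hypre-configure-arguments=HYPRE_CUDA_SM=70 --with-debugging=no
  # then run the exact "make ... all" command that configure prints at the end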

  Thanks,

     Matt


>
>
> I have another build without the --download-hypre-commit=origin/hypre_petsc
> but that gives a different error.
>
>
>
> [kchockalingam at glados tutorials]$ ./ex4 -ksp_view -ksp_type cg -mat_type
> hypre -pc_type hypre
>
> *[0]PETSC ERROR: --------------------- Error Message
> --------------------------------------------------------------*
>
> [0]PETSC ERROR: Error in external library
>
> [0]PETSC ERROR: Error in HYPRE_IJMatrixAssemble(): error code 12
>
> [0]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html
> for trouble shooting.
>
> [0]PETSC ERROR: Petsc Release Version 3.15.3, Aug 06, 2021
>
> [0]PETSC ERROR: ./ex4 on a  named glados.dl.ac.uk by kchockalingam Mon
> Nov 22 15:07:46 2021
>
> [0]PETSC ERROR: Configure options
> --package-prefix-hash=/home/kchockalingam/petsc-hash-pkgs
> --with-make-test-np=2 COPTFLAGS="-g -O3 -fno-omit-frame-pointer"
> FOPTFLAGS="-g -O3 -fno-omit-frame-pointer" CXXOPTFLAGS="-g -O3
> -fno-omit-frame-pointer" --with-cuda=1 --with-cuda-arch=70
> --with-blaslapack=1 --with-cuda-dir=/apps/packages/cuda/10.1/
> --with-mpi-dir=/apps/packages/gcc/7.3.0/openmpi/3.1.2 --download-hypre=1
> --download-hypre-configure-arguments=HYPRE_CUDA_SM=70 --with-debugging=no
> PETSC_ARCH=arch-ci-linux-cuda11-double
>
> [0]PETSC ERROR: #1 MatAssemblyEnd_HYPRE() at
> /home/kchockalingam/tools/petsc-3.15.3/src/mat/impls/hypre/mhypre.c:1212
>
> [0]PETSC ERROR: #2 MatAssemblyEnd() at
> /home/kchockalingam/tools/petsc-3.15.3/src/mat/interface/matrix.c:5652
>
> [0]PETSC ERROR: #3 main() at ex4.c:84
>
> [0]PETSC ERROR: PETSc Option Table entries:
>
> [0]PETSC ERROR: -ksp_type cg
>
> [0]PETSC ERROR: -ksp_view
>
> [0]PETSC ERROR: -mat_type hypre
>
> [0]PETSC ERROR: -pc_type hypre
>
> *[0]PETSC ERROR: ----------------End of Error Message -------send entire
> error message to petsc-maint at mcs.anl.gov----------*
>
> --------------------------------------------------------------------------
>
>
>
> Best,
>
> Karthik.
>
>
>
> *From: *Stefano Zampini <stefano.zampini at gmail.com>
> *Date: *Monday, 22 November 2021 at 14:46
> *To: *Matthew Knepley <knepley at gmail.com>
> *Cc: *"Chockalingam, Karthikeyan (STFC,DL,HC)" <
> karthikeyan.chockalingam at stfc.ac.uk>, "petsc-users at mcs.anl.gov" <
> petsc-users at mcs.anl.gov>
> *Subject: *Re: [petsc-users] hypre on gpus
>
>
>
> You don't need to specify the HYPRE commit. Remove --download-hypre-commit=origin/hypre_petsc
> from the configuration options.
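>
> With your setup that would mean keeping (roughly) these hypre/CUDA-related
> options and dropping only the commit one, everything else unchanged:
>
>   --with-cuda=1 --with-cuda-arch=70
>   --download-hypre=yes
>   --download-hypre-configure-arguments=HYPRE_CUDA_SM=70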
>
>
>
> Il giorno lun 22 nov 2021 alle ore 17:29 Matthew Knepley <
> knepley at gmail.com> ha scritto:
>
> On Mon, Nov 22, 2021 at 8:50 AM Karthikeyan Chockalingam - STFC UKRI <
> karthikeyan.chockalingam at stfc.ac.uk> wrote:
>
> Hi Matt,
>
>
>
> Below is the entire error message:
>
>
>
> I cannot reproduce this:
>
>
>
> main $:/PETSc3/petsc/petsc-dev/src/ksp/ksp/tutorials$ ./ex4 -ksp_view
> -ksp_type cg -mat_type hypre -pc_type hypre
> KSP Object: 1 MPI processes
>   type: cg
>   maximum iterations=10000, initial guess is zero
>   tolerances:  relative=0.000138889, absolute=1e-50, divergence=10000.
>   left preconditioning
>   using PRECONDITIONED norm type for convergence test
> PC Object: 1 MPI processes
>   type: hypre
>     HYPRE BoomerAMG preconditioning
>       Cycle type V
>       Maximum number of levels 25
>       Maximum number of iterations PER hypre call 1
>       Convergence tolerance PER hypre call 0.
>       Threshold for strong coupling 0.25
>       Interpolation truncation factor 0.
>       Interpolation: max elements per row 0
>       Number of levels of aggressive coarsening 0
>       Number of paths for aggressive coarsening 1
>       Maximum row sums 0.9
>       Sweeps down         1
>       Sweeps up           1
>       Sweeps on coarse    1
>       Relax down          symmetric-SOR/Jacobi
>       Relax up            symmetric-SOR/Jacobi
>       Relax on coarse     Gaussian-elimination
>       Relax weight  (all)      1.
>       Outer relax weight (all) 1.
>       Using CF-relaxation
>       Not using more complex smoothers.
>       Measure type        local
>       Coarsen type        Falgout
>       Interpolation type  classical
>   linear system matrix = precond matrix:
>   Mat Object: 1 MPI processes
>     type: hypre
>     rows=56, cols=56
> Norm of error 8.69801e-05 iterations 2
>
> This is on the 'main' branch. So either there is some bug in the release, or
> something is strange on your end. Since we run hypre tests in the CI, I am
> leaning toward the latter. Can you try the 'main' branch? We will have to
> use it anyway if we want any fixes.
>
>
>
>   Thanks,
>
>
>
>     Matt
>
>
>
>  *[0]PETSC ERROR: --------------------- Error Message
> --------------------------------------------------------------*
>
> [0]PETSC ERROR: Object is in wrong state
>
> [0]PETSC ERROR: Must call MatXXXSetPreallocation(), MatSetUp() or the
> matrix has not yet been factored on argument 1 "mat" before
> MatGetOwnershipRange()
>
> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting.
>
> [0]PETSC ERROR: Petsc Release Version 3.16.0, Sep 29, 2021
>
> [0]PETSC ERROR: ./ex4 on a  named sqg2b4.bullx by kxc07-lxm25 Mon Nov 22
> 11:33:41 2021
>
> [0]PETSC ERROR: Configure options
> --prefix=/lustre/scafellpike/local/apps/gcc7/petsc/3.16.0-cuda11.2
> --with-debugging=yes
> --with-blaslapack-dir=/lustre/scafellpike/local/apps/intel/intel_cs/2018.0.128/mkl
> --with-cuda=1 --with-cuda-arch=70 --download-hypre=yes
> --download-hypre-configure-arguments=HYPRE_CUDA_SM=70
> --download-hypre-commit=origin/hypre_petsc --with-shared-libraries=1
> --known-mpi-shared-libraries=1 --with-cc=mpicc --with-cxx=mpicxx
> -with-fc=mpif90
>
> [0]PETSC ERROR: #1 MatGetOwnershipRange() at
> /netfs/smain01/scafellpike/local/package_build/build/rja87-build/petsc-cuda-3.16.0/src/mat/interface/matrix.c:6784
>
> [0]PETSC ERROR: #2 main() at ex4.c:40
>
> [0]PETSC ERROR: PETSc Option Table entries:
>
> [0]PETSC ERROR: -ksp_type cg
>
> [0]PETSC ERROR: -ksp_view
>
> [0]PETSC ERROR: -mat_type hypre
>
> [0]PETSC ERROR: -pc_type hypre
>
> [0]PETSC ERROR: -use_gpu_aware_mpi 0
>
> *[0]PETSC ERROR: ----------------End of Error Message -------send entire
> error message to petsc-maint at mcs.anl.gov----------*
>
> --------------------------------------------------------------------------
>
>
>
> I have also attached the make.log. Thank you for having a look.
>
>
>
> Best,
>
> Karthik.
>
>
>
> *From: *Matthew Knepley <knepley at gmail.com>
> *Date: *Monday, 22 November 2021 at 13:41
> *To: *"Chockalingam, Karthikeyan (STFC,DL,HC)" <
> karthikeyan.chockalingam at stfc.ac.uk>
> *Cc: *Mark Adams <mfadams at lbl.gov>, "petsc-users at mcs.anl.gov" <
> petsc-users at mcs.anl.gov>
> *Subject: *Re: [petsc-users] hypre on gpus
>
>
>
> On Mon, Nov 22, 2021 at 6:47 AM Karthikeyan Chockalingam - STFC UKRI <
> karthikeyan.chockalingam at stfc.ac.uk> wrote:
>
> Thank you for your response. I tried to run the same example
>
>
>
> petsc/src/ksp/ksp/tutorials$  *./ex4 -ksp_type cg -mat_type hypre
> -ksp_view -pc_type hypre*
>
>
>
> but it crashed with the below error
>
>
>
> *[0]PETSC ERROR: --------------------- Error Message
> --------------------------------------------------------------*
>
> [0]PETSC ERROR: Object is in wrong state
>
> [0]PETSC ERROR: Must call MatXXXSetPreallocation(), MatSetUp() or the
> matrix has not yet been factored on argument 1 "mat" before
> MatGetOwnershipRange()
>
>
>
> Hi Karthik,
>
>
>
> Please do not clip the error message. We need the entire output. It seems
> strange that you would get a logic error, since that should be the same on
> any architecture. So could you also send the make.log?
>
>
>
>   Thanks,
>
>
>
>     Matt
>
>
>
> Below are the options used to configure hypre with cuda support. Do you
> spot any mistakes?
>
>
>
> --with-blaslapack-dir=/lustre/scafellpike/local/apps/intel/intel_cs/2018.0.128/mkl
>
>
> --with-cuda=1
>
> --with-cuda-arch=70
>
> --download-hypre=yes
>
> --download-hypre-configure-arguments=HYPRE_CUDA_SM=70
>
> --download-hypre-commit=origin/hypre_petsc
>
> --with-shared-libraries=1
>
> --known-mpi-shared-libraries=1
>
> --with-cc=mpicc
>
> --with-cxx=mpicxx
>
> -with-fc=mpif90
>
>
>
> Best,
>
> Karthik.
>
>
>
>
>
> *From: *Mark Adams <mfadams at lbl.gov>
> *Date: *Friday, 19 November 2021 at 16:31
> *To: *"Chockalingam, Karthikeyan (STFC,DL,HC)" <
> karthikeyan.chockalingam at stfc.ac.uk>
> *Cc: *"petsc-users at mcs.anl.gov" <petsc-users at mcs.anl.gov>
> *Subject: *Re: [petsc-users] hypre on gpus
>
>
>
> You should run with -options_left to check that your options are actually
> being used. It may be that -mat_type hypre is not being used.
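>
> For example (a sketch; the run below with just the extra flag added):
>
>   srun -n2 ./ex4 -ksp_type cg -mat_type hypre -ksp_view -pc_type hypre -options_left
>
> Any options that were set but never used are reported at the end of the run.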
>
>
>
> I have tested this:
>
>
>
> petsc/src/ksp/ksp/tutorials$ *srun -n2 ./ex4 -ksp_type cg -mat_type hypre
> -ksp_view -pc_type hypre*
>
>
>
> You can add -log_view, which will print performance data for each event,
> such as KSPSolve.
>
>
>
> If you are configured with a GPU there will be some extra columns that
> give the percent of flops on the GPU.
>
>
>
> In the past hypre did not register flops with us, but I do get flops from
> hypre now, and -ksp_view showed that it did indeed use hypre. I saw that
> the hypre flops were 100% on the GPU.
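>
> A rough sketch of that check (the same ex4 run as above, with logging turned on):
>
>   srun -n2 ./ex4 -ksp_type cg -mat_type hypre -ksp_view -pc_type hypre -log_view
>
> and then look at the GPU columns of the -log_view event table (e.g. for
> KSPSolve and PCApply) for the percent of flops done on the GPU.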
>
>
>
>
>
> On Fri, Nov 19, 2021 at 10:47 AM Karthikeyan Chockalingam - STFC UKRI <
> karthikeyan.chockalingam at stfc.ac.uk> wrote:
>
> Hello,
>
>
>
> I tried to solve a 3D Poisson (ksp/tutorial/ex45) using -pc_type hypre on
> gpus
>
>
>
> ./ex45 -da_grid_x 128 -da_grid_y 128 -da_grid_z 128 -dm_mat_type hypre
> -dm_vec_type cuda -ksp_type cg -pc_type hypre -pc_hypre_type  boomeramg
> -ksp_monitor -log_view
>
>
>
> I profiled the run using NSYS - you will find all the relevant files attached.
>
> Looking at the profile, I doubt that hypre is running on the GPUs. The CUDA
> kernels are barely active.
>
> I do not see any CUDA kernel relevant to the solve.
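>
> (For reference, a profile like this can be collected with an nsys invocation
> roughly along these lines; the exact flags used here may have differed, and
> ex45_hypre is just an illustrative report name:
>
>   nsys profile -o ex45_hypre ./ex45 -da_grid_x 128 -da_grid_y 128 -da_grid_z 128 -dm_mat_type hypre -dm_vec_type cuda -ksp_type cg -pc_type hypre -pc_hypre_type boomeramg -ksp_monitor
> )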
>
> Is my assessment correct? How can I verify whether hypre is indeed running
> on the GPUs?
>
>
>
> Best,
>
> Karthik.
>
>
>
> *From: *Mark Adams <mfadams at lbl.gov>
> *Date: *Friday, 8 October 2021 at 18:47
> *To: *"Chockalingam, Karthikeyan (STFC,DL,HC)" <
> karthikeyan.chockalingam at stfc.ac.uk>
> *Cc: *"petsc-users at mcs.anl.gov" <petsc-users at mcs.anl.gov>
> *Subject: *Re: [petsc-users] hypre on gpus
>
>
>
> I think you would want to use 'cuda' vec_type, but I .
>
> You might ask Hypre how one verifies that the GPU is used.
>
> Mark
>
>
>
> On Fri, Oct 8, 2021 at 1:35 PM Karthikeyan Chockalingam - STFC UKRI <
> karthikeyan.chockalingam at stfc.ac.uk> wrote:
>
> Yes, I used it for both cpu and gpu. Is that not okay?
>
>
>
> For gpu: -dm_mat_type hypre -dm_vec_type mpicuda
>
>
>
> For cpu:  -dm_mat_type hypre -dm_vec_type mpi
>
>
>
> *From: *Mark Adams <mfadams at lbl.gov>
> *Date: *Friday, 8 October 2021 at 18:28
> *To: *"Chockalingam, Karthikeyan (STFC,DL,HC)" <
> karthikeyan.chockalingam at stfc.ac.uk>
> *Cc: *"petsc-users at mcs.anl.gov" <petsc-users at mcs.anl.gov>
> *Subject: *Re: [petsc-users] hypre on gpus
>
>
>
> Did you use -dm_mat_type hypre on the GPU case ?
>
>
>
> On Fri, Oct 8, 2021 at 12:19 PM Karthikeyan Chockalingam - STFC UKRI <
> karthikeyan.chockalingam at stfc.ac.uk> wrote:
>
> I tried a different exercise: I ran the same problem on two CPU cores and
> on two GPUs:
>
>
>
> On gpu
>
>
>
> PCApply                6 1.0 6.0335e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 6.0e+00 15  0  0  0  1  15  0  0  0  1     0       0      0 0.00e+00    5
> 9.65e+01  0
>
>
>
> and on cpu
>
>
>
> PCApply                6 1.0 5.6348e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00 16  0  0  0  0  16  0  0  0  0     0
>
>
>
> The timings again are close, but the GPU version did a reduction (6.0e+00)
> while the CPU version did not (0.0e+00).
>
> I am not sure whether that is any indication that hypre ran on the GPUs?
>
>
>
> Thanks,
>
> Karthik.
>
>
>
>
>
> *From: *Mark Adams <mfadams at lbl.gov>
> *Date: *Friday, 8 October 2021 at 16:36
> *To: *"Chockalingam, Karthikeyan (STFC,DL,HC)" <
> karthikeyan.chockalingam at stfc.ac.uk>
> *Cc: *"petsc-users at mcs.anl.gov" <petsc-users at mcs.anl.gov>
> *Subject: *Re: [petsc-users] hypre on gpus
>
>
>
>
>
>
>
> On Fri, Oct 8, 2021 at 10:29 AM Karthikeyan Chockalingam - STFC UKRI <
> karthikeyan.chockalingam at stfc.ac.uk> wrote:
>
> The PCApply timing on
>
>
>
> gpu
>
>
>
> PCApply                6 1.0 1.0235e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00 39  0  0  0  0  39  0  0  0  0     0       0      0 0.00e+00    0
> 0.00e+00  0
>
>
>
> and cpu
>
>
>
> PCApply                6 1.0 1.0242e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00 41  0  0  0  0  41  0  0  0  0     0
>
>
>
>
>
> You are probably not running on the GPUs.
>
> Use -dm_mat_type hypre.
>
>
>
> are close. It is hard for me to tell whether hypre is running on the GPU or not.
>
>
>
> Best,
>
> Karthik.
>
>
>
>
>
> *From: *"Chockalingam, Karthikeyan (STFC,DL,HC)" <
> karthikeyan.chockalingam at stfc.ac.uk>
> *Date: *Friday, 8 October 2021 at 14:55
> *To: *Mark Adams <mfadams at lbl.gov>
> *Cc: *"petsc-users at mcs.anl.gov" <petsc-users at mcs.anl.gov>
> *Subject: *Re: [petsc-users] hypre on gpus
>
>
>
> Thanks Mark, I will try your recommendations.
>
> Should I also change -dm_vec_type to hypre? Currently I have it as mpicuda.
>
>
>
> Karthik.
>
>
>
>
>
> *From: *Mark Adams <mfadams at lbl.gov>
> *Date: *Friday, 8 October 2021 at 14:33
> *To: *"Chockalingam, Karthikeyan (STFC,DL,HC)" <
> karthikeyan.chockalingam at stfc.ac.uk>
> *Cc: *"petsc-users at mcs.anl.gov" <petsc-users at mcs.anl.gov>
> *Subject: *Re: [petsc-users] hypre on gpus
>
>
>
> Hypre does not record its flops with PETSc's timers.
>
> Configure with and without CUDA and see if the timings change in PCApply.
>
> Hypre does not dynamically switch between CUDA and CPU solves at
> this time, but you want to use -dm_mat_type hypre.
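>
> A rough sketch of that comparison, based on the ex45 options below (run the
> same command against a CUDA build and a CPU-only build and compare the
> PCApply line in the two -log_view outputs; for the CPU-only build drop
> -dm_vec_type mpicuda, e.g. use -dm_vec_type mpi):
>
>   mpirun -n 2 ./ex45 -log_view -da_grid_x 169 -da_grid_y 169 -da_grid_z 169 -dm_mat_type hypre -dm_vec_type mpicuda -ksp_type gmres -pc_type hypre -pc_hypre_type boomeramg -ksp_gmres_restart 31 -pc_hypre_boomeramg_strong_threshold 0.7 -ksp_monitor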
>
> Mark
>
>
>
> On Fri, Oct 8, 2021 at 6:59 AM Karthikeyan Chockalingam - STFC UKRI <
> karthikeyan.chockalingam at stfc.ac.uk> wrote:
>
> Hello,
>
>
>
> I am trying to run ex45 (in the KSP tutorials) using hypre on GPUs. I have
> attached the Python configuration file and the -log_view output from running
> with the command-line options below
>
>
>
> mpirun -n 2 ./ex45 -log_view -da_grid_x 169 -da_grid_y 169 -da_grid_z 169
> -dm_mat_type mpiaijcusparse -dm_vec_type mpicuda -ksp_type gmres -pc_type
> hypre -pc_hypre_type  boomeramg -ksp_gmres_restart 31
> -pc_hypre_boomeramg_strong_threshold 0.7  -ksp_monitor
>
>
>
> The problem was solved and converged, but from the output file I suspect
> hypre is not running on the GPUs, as PCApply and DMCreate do *not* record
> any GPU Mflop/s. However, some events such as KSPSolve, MatMult, etc. are
> running on the GPUs.
>
>
>
> Can you please let me know if I need to add any extra flag to the attached
> arch-ci-linux-cuda11-double-xx.py script file to get hypre working on gpus?
>
>
>
> Thanks,
>
> Karthik.
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
> --
>
> Stefano
>


-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/

