[petsc-users] hypre on gpus
Matthew Knepley
knepley at gmail.com
Mon Nov 22 12:04:43 CST 2021
On Mon, Nov 22, 2021 at 12:41 PM Karthikeyan Chockalingam - STFC UKRI <
karthikeyan.chockalingam at stfc.ac.uk> wrote:
> @Matthew Knepley <knepley at gmail.com> Yes, it works with main build!
>
That is good. I think we might need to push up the next point release then,
since we have a Hypre bug for which a fix is non-obvious.
Thanks,
Matt
> @Mark Adams <mfadams at lbl.gov> I have attached the log_view output of ex4
> for your reference. But pcapply or pcsetup did not record any gpu flops.
>
> If it is not too much trouble can you send me the log_view output of
> ksp/ksp/tutorial/ex45 using hypre?
>
>
>
> Many thanks!
>
> Karthik.
>
>
>
> *From: *Matthew Knepley <knepley at gmail.com>
> *Date: *Monday, 22 November 2021 at 15:26
> *To: *"Chockalingam, Karthikeyan (STFC,DL,HC)" <
> karthikeyan.chockalingam at stfc.ac.uk>
> *Cc: *Stefano Zampini <stefano.zampini at gmail.com>, "
> petsc-users at mcs.anl.gov" <petsc-users at mcs.anl.gov>
> *Subject: *Re: [petsc-users] hypre on gpus
>
>
>
> On Mon, Nov 22, 2021 at 10:20 AM Karthikeyan Chockalingam - STFC UKRI <
> karthikeyan.chockalingam at stfc.ac.uk> wrote:
>
> Thanks Stefano,
>
>
>
> I just did a clean build with no problems. Can you try a clean build of
> main?
>
>
>
> Thanks,
>
>
>
> Matt
>
>
>
>
>
> I have another build without the --download-hypre-commit=origin/hypre_petsc
> but that gives a different error.
>
>
>
> [kchockalingam at glados tutorials]$ ./ex4 -ksp_view -ksp_type cg -mat_type
> hypre -pc_type hypre
>
> *[0]PETSC ERROR: --------------------- Error Message
> --------------------------------------------------------------*
>
> [0]PETSC ERROR: Error in external library
>
> [0]PETSC ERROR: Error in HYPRE_IJMatrixAssemble(): error code 12
>
> [0]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html
> for trouble shooting.
>
> [0]PETSC ERROR: Petsc Release Version 3.15.3, Aug 06, 2021
>
> [0]PETSC ERROR: ./ex4 on a named glados.dl.ac.uk by kchockalingam Mon
> Nov 22 15:07:46 2021
>
> [0]PETSC ERROR: Configure options
> --package-prefix-hash=/home/kchockalingam/petsc-hash-pkgs
> --with-make-test-np=2 COPTFLAGS="-g -O3 -fno-omit-frame-pointer"
> FOPTFLAGS="-g -O3 -fno-omit-frame-pointer" CXXOPTFLAGS="-g -O3
> -fno-omit-frame-pointer" --with-cuda=1 --with-cuda-arch=70
> --with-blaslapack=1 --with-cuda-dir=/apps/packages/cuda/10.1/
> --with-mpi-dir=/apps/packages/gcc/7.3.0/openmpi/3.1.2 --download-hypre=1
> --download-hypre-configure-arguments=HYPRE_CUDA_SM=70 --with-debugging=no
> PETSC_ARCH=arch-ci-linux-cuda11-double
>
> [0]PETSC ERROR: #1 MatAssemblyEnd_HYPRE() at
> /home/kchockalingam/tools/petsc-3.15.3/src/mat/impls/hypre/mhypre.c:1212
>
> [0]PETSC ERROR: #2 MatAssemblyEnd() at
> /home/kchockalingam/tools/petsc-3.15.3/src/mat/interface/matrix.c:5652
>
> [0]PETSC ERROR: #3 main() at ex4.c:84
>
> [0]PETSC ERROR: PETSc Option Table entries:
>
> [0]PETSC ERROR: -ksp_type cg
>
> [0]PETSC ERROR: -ksp_view
>
> [0]PETSC ERROR: -mat_type hypre
>
> [0]PETSC ERROR: -pc_type hypre
>
> *[0]PETSC ERROR: ----------------End of Error Message -------send entire
> error message to petsc-maint at mcs.anl.gov----------*
>
> --------------------------------------------------------------------------
>
>
>
> Best,
>
> Karthik.
>
>
>
> *From: *Stefano Zampini <stefano.zampini at gmail.com>
> *Date: *Monday, 22 November 2021 at 14:46
> *To: *Matthew Knepley <knepley at gmail.com>
> *Cc: *"Chockalingam, Karthikeyan (STFC,DL,HC)" <
> karthikeyan.chockalingam at stfc.ac.uk>, "petsc-users at mcs.anl.gov" <
> petsc-users at mcs.anl.gov>
> *Subject: *Re: [petsc-users] hypre on gpus
>
>
>
> You don't need to specify the HYPRE commit. Remove --download-hypre-commit=origin/hypre_petsc
> from the configuration options
>
>
>
> Il giorno lun 22 nov 2021 alle ore 17:29 Matthew Knepley <
> knepley at gmail.com> ha scritto:
>
> On Mon, Nov 22, 2021 at 8:50 AM Karthikeyan Chockalingam - STFC UKRI <
> karthikeyan.chockalingam at stfc.ac.uk> wrote:
>
> Hi Matt,
>
>
>
> Below is the entire error message:
>
>
>
> I cannot reproduce this:
>
>
>
> main $:/PETSc3/petsc/petsc-dev/src/ksp/ksp/tutorials$ ./ex4 -ksp_view
> -ksp_type cg -mat_type hypre -pc_type hypre
> KSP Object: 1 MPI processes
> type: cg
> maximum iterations=10000, initial guess is zero
> tolerances: relative=0.000138889, absolute=1e-50, divergence=10000.
> left preconditioning
> using PRECONDITIONED norm type for convergence test
> PC Object: 1 MPI processes
> type: hypre
> HYPRE BoomerAMG preconditioning
> Cycle type V
> Maximum number of levels 25
> Maximum number of iterations PER hypre call 1
> Convergence tolerance PER hypre call 0.
> Threshold for strong coupling 0.25
> Interpolation truncation factor 0.
> Interpolation: max elements per row 0
> Number of levels of aggressive coarsening 0
> Number of paths for aggressive coarsening 1
> Maximum row sums 0.9
> Sweeps down 1
> Sweeps up 1
> Sweeps on coarse 1
> Relax down symmetric-SOR/Jacobi
> Relax up symmetric-SOR/Jacobi
> Relax on coarse Gaussian-elimination
> Relax weight (all) 1.
> Outer relax weight (all) 1.
> Using CF-relaxation
> Not using more complex smoothers.
> Measure type local
> Coarsen type Falgout
> Interpolation type classical
> linear system matrix = precond matrix:
> Mat Object: 1 MPI processes
> type: hypre
> rows=56, cols=56
> Norm of error 8.69801e-05 iterations 2
>
> This is on the 'main' branch. So either there is some bug in release, or
> something is strange on your end. Since we run Hypre tests for the CI,
>
> I am leaning toward the latter. Can you try the 'main' branch? We will
> have to use this anyway if we want any fixes.
>
>
>
> Thanks,
>
>
>
> Matt
>
>
>
> *[0]PETSC ERROR: --------------------- Error Message
> --------------------------------------------------------------*
>
> [0]PETSC ERROR: Object is in wrong state
>
> [0]PETSC ERROR: Must call MatXXXSetPreallocation(), MatSetUp() or the
> matrix has not yet been factored on argument 1 "mat" before
> MatGetOwnershipRange()
>
> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting.
>
> [0]PETSC ERROR: Petsc Release Version 3.16.0, Sep 29, 2021
>
> [0]PETSC ERROR: ./ex4 on a named sqg2b4.bullx by kxc07-lxm25 Mon Nov 22
> 11:33:41 2021
>
> [0]PETSC ERROR: Configure options
> --prefix=/lustre/scafellpike/local/apps/gcc7/petsc/3.16.0-cuda11.2
> --with-debugging=yes
> --with-blaslapack-dir=/lustre/scafellpike/local/apps/intel/intel_cs/2018.0.128/mkl
> --with-cuda=1 --with-cuda-arch=70 --download-hypre=yes
> --download-hypre-configure-arguments=HYPRE_CUDA_SM=70
> --download-hypre-commit=origin/hypre_petsc --with-shared-libraries=1
> --known-mpi-shared-libraries=1 --with-cc=mpicc --with-cxx=mpicxx
> -with-fc=mpif90
>
> [0]PETSC ERROR: #1 MatGetOwnershipRange() at
> /netfs/smain01/scafellpike/local/package_build/build/rja87-build/petsc-cuda-3.16.0/src/mat/interface/matrix.c:6784
>
> [0]PETSC ERROR: #2 main() at ex4.c:40
>
> [0]PETSC ERROR: PETSc Option Table entries:
>
> [0]PETSC ERROR: -ksp_type cg
>
> [0]PETSC ERROR: -ksp_view
>
> [0]PETSC ERROR: -mat_type hypre
>
> [0]PETSC ERROR: -pc_type hypre
>
> [0]PETSC ERROR: -use_gpu_aware_mpi 0
>
> *[0]PETSC ERROR: ----------------End of Error Message -------send entire
> error message to petsc-maint at mcs.anl.gov----------*
>
> --------------------------------------------------------------------------
>
>
>
> I have also attached the make.log. Thank you for having a look.
>
>
>
> Best,
>
> Karthik.
>
>
>
> *From: *Matthew Knepley <knepley at gmail.com>
> *Date: *Monday, 22 November 2021 at 13:41
> *To: *"Chockalingam, Karthikeyan (STFC,DL,HC)" <
> karthikeyan.chockalingam at stfc.ac.uk>
> *Cc: *Mark Adams <mfadams at lbl.gov>, "petsc-users at mcs.anl.gov" <
> petsc-users at mcs.anl.gov>
> *Subject: *Re: [petsc-users] hypre on gpus
>
>
>
> On Mon, Nov 22, 2021 at 6:47 AM Karthikeyan Chockalingam - STFC UKRI <
> karthikeyan.chockalingam at stfc.ac.uk> wrote:
>
> Thank you for your response. I tried to run the same example
>
>
>
> petsc/src/ksp/ksp/tutorials$ *./ex4 -ksp_type cg -mat_type hypre
> -ksp_view -pc_type hypre*
>
>
>
> but it crashed with the below error
>
>
>
> *[0]PETSC ERROR: --------------------- Error Message
> --------------------------------------------------------------*
>
> [0]PETSC ERROR: Object is in wrong state
>
> [0]PETSC ERROR: Must call MatXXXSetPreallocation(), MatSetUp() or the
> matrix has not yet been factored on argument 1 "mat" before
> MatGetOwnershipRange()
>
>
>
> Hi Karthik,
>
>
>
> Please do not clip the error message. We need the entire output. This
> seems strange that you would get a logic error,
>
> since that should be the same on any architecture. So could you also send
> the make.log?
>
>
>
> Thanks,
>
>
>
> Matt
>
>
>
> Below are the options used to configure hypre with cuda support. Do you
> spot any mistakes?
>
>
>
> --with-blaslapack-dir=/lustre/scafellpike/local/apps/intel/intel_cs/2018.0.128/mkl
>
>
> --with-cuda=1
>
> --with-cuda-arch=70
>
> --download-hypre=yes
>
> --download-hypre-configure-arguments=HYPRE_CUDA_SM=70
>
> --download-hypre-commit=origin/hypre_petsc
>
> --with-shared-libraries=1
>
> --known-mpi-shared-libraries=1
>
> --with-cc=mpicc
>
> --with-cxx=mpicxx
>
> -with-fc=mpif90
>
>
>
> Best,
>
> Karthik.
>
>
>
>
>
> *From: *Mark Adams <mfadams at lbl.gov>
> *Date: *Friday, 19 November 2021 at 16:31
> *To: *"Chockalingam, Karthikeyan (STFC,DL,HC)" <
> karthikeyan.chockalingam at stfc.ac.uk>
> *Cc: *"petsc-users at mcs.anl.gov" <petsc-users at mcs.anl.gov>
> *Subject: *Re: [petsc-users] hypre on gpus
>
>
>
> You should run with -options_left to check that your options are being
> used. It may be -mat_type hypre.
>
>
>
> I have tested this:
>
>
>
> petsc/src/ksp/ksp/tutorials$ *srun -n2 ./ex4 -ksp_type cg -mat_type hypre
> -ksp_view -pc_type hypre*
>
>
>
> You can add -log_view and that will print performance data for each method
> like KSPSolve.
>
>
>
> If you are configured with a GPU there will be some extra columns that
> give the percent of flops on the GPU.
>
>
>
> In the past hypre has not registered flops with us, but I do get flops
> from hypre now, and -ksp_view showed that it did indeed use hypre.
>
> I saw that the flops were 100% GPU in hypre.
>
>
>
>
>
> On Fri, Nov 19, 2021 at 10:47 AM Karthikeyan Chockalingam - STFC UKRI <
> karthikeyan.chockalingam at stfc.ac.uk> wrote:
>
> Hello,
>
>
>
> I tried to solve a 3D Poisson (ksp/tutorial/ex45) using -pc_type hypre on
> gpus
>
>
>
> ./ex45 -da_grid_x 128 -da_grid_y 128 -da_grid_z 128 -dm_mat_type hypre
> -dm_vec_type cuda -ksp_type cg -pc_type hypre -pc_hypre_type boomeramg
> -ksp_monitor -log_view
>
>
>
> I profiled the run using NSYS - attached you find all the relevant files.
>
> Looking at the profile I doubt if hypre is running on gpus. The cuda
> kernels are barely active.
>
> I don’t see any cuda kernel relevant to solve.
>
> Is my assessment correct? How can I verify if hypre indeed is running on
> gpus?
>
>
>
> Best,
>
> Karthik.
>
>
>
> *From: *Mark Adams <mfadams at lbl.gov>
> *Date: *Friday, 8 October 2021 at 18:47
> *To: *"Chockalingam, Karthikeyan (STFC,DL,HC)" <
> karthikeyan.chockalingam at stfc.ac.uk>
> *Cc: *"petsc-users at mcs.anl.gov" <petsc-users at mcs.anl.gov>
> *Subject: *Re: [petsc-users] hypre on gpus
>
>
>
> I think you would want to use 'cuda' vec_type, but I .
>
> You might ask Hypre how one verifies that the GPU is used.
>
> Mark
>
>
>
> On Fri, Oct 8, 2021 at 1:35 PM Karthikeyan Chockalingam - STFC UKRI <
> karthikeyan.chockalingam at stfc.ac.uk> wrote:
>
> Yes, I used it for both cpu and gpu. Is that not okay?
>
>
>
> For gpu: -dm_mat_type hypre -dm_vec_type mpicuda
>
>
>
> For cpu: -dm_mat_type hypre -dm_vec_type mpi
>
>
>
> *From: *Mark Adams <mfadams at lbl.gov>
> *Date: *Friday, 8 October 2021 at 18:28
> *To: *"Chockalingam, Karthikeyan (STFC,DL,HC)" <
> karthikeyan.chockalingam at stfc.ac.uk>
> *Cc: *"petsc-users at mcs.anl.gov" <petsc-users at mcs.anl.gov>
> *Subject: *Re: [petsc-users] hypre on gpus
>
>
>
> Did you use -dm_mat_type hypre on the GPU case ?
>
>
>
> On Fri, Oct 8, 2021 at 12:19 PM Karthikeyan Chockalingam - STFC UKRI <
> karthikeyan.chockalingam at stfc.ac.uk> wrote:
>
> I tried a different exercise ran the same problem on two cpu cores and on
> two gpu:
>
>
>
> On gpu
>
>
>
> PCApply 6 1.0 6.0335e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 6.0e+00 15 0 0 0 1 15 0 0 0 1 0 0 0 0.00e+00 5
> 9.65e+01 0
>
>
>
> and on cpu
>
>
>
> PCApply 6 1.0 5.6348e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00 16 0 0 0 0 16 0 0 0 0 0
>
>
>
> timings again are close but gpu version did a reduction 6.0e+00 but the
> cpu version did not 0.0e+00.
>
> I am not sure if that is any indication if hypre ran on gpus?
>
>
>
> Thanks,
>
> Karthik.
>
>
>
>
>
> *From: *Mark Adams <mfadams at lbl.gov>
> *Date: *Friday, 8 October 2021 at 16:36
> *To: *"Chockalingam, Karthikeyan (STFC,DL,HC)" <
> karthikeyan.chockalingam at stfc.ac.uk>
> *Cc: *"petsc-users at mcs.anl.gov" <petsc-users at mcs.anl.gov>
> *Subject: *Re: [petsc-users] hypre on gpus
>
>
>
>
>
>
>
> On Fri, Oct 8, 2021 at 10:29 AM Karthikeyan Chockalingam - STFC UKRI <
> karthikeyan.chockalingam at stfc.ac.uk> wrote:
>
> The PCApply timing on
>
>
>
> gpu
>
>
>
> PCApply 6 1.0 1.0235e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00 39 0 0 0 0 39 0 0 0 0 0 0 0 0.00e+00 0
> 0.00e+00 0
>
>
>
> and cpu
>
>
>
> PCApply 6 1.0 1.0242e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00 41 0 0 0 0 41 0 0 0 0 0
>
>
>
>
>
> You don't have GPUs. probably.
>
> Use -dm_mat_type hypre.
>
>
>
> are close. It is hard for me tell if hypre on gpu is on or not.
>
>
>
> Best,
>
> Karthik.
>
>
>
>
>
> *From: *"Chockalingam, Karthikeyan (STFC,DL,HC)" <
> karthikeyan.chockalingam at stfc.ac.uk>
> *Date: *Friday, 8 October 2021 at 14:55
> *To: *Mark Adams <mfadams at lbl.gov>
> *Cc: *"petsc-users at mcs.anl.gov" <petsc-users at mcs.anl.gov>
> *Subject: *Re: [petsc-users] hypre on gpus
>
>
>
> Thanks Mark, I will try your recommendations.
>
> Should I also change -dm_vec_type to hypre currently I have it as mpicuda?
>
>
>
> Karthik.
>
>
>
>
>
> *From: *Mark Adams <mfadams at lbl.gov>
> *Date: *Friday, 8 October 2021 at 14:33
> *To: *"Chockalingam, Karthikeyan (STFC,DL,HC)" <
> karthikeyan.chockalingam at stfc.ac.uk>
> *Cc: *"petsc-users at mcs.anl.gov" <petsc-users at mcs.anl.gov>
> *Subject: *Re: [petsc-users] hypre on gpus
>
>
>
> Hypre does not record its flops with PETSc's timers.
>
> Configure with and without CUDA and see if the timings change in PCApply.
>
> Hypre does not dynamically switch between CUDA and CPU solves at
> this time, but you want to use -dm_mat_type hypre.
>
> Mark
>
>
>
> On Fri, Oct 8, 2021 at 6:59 AM Karthikeyan Chockalingam - STFC UKRI <
> karthikeyan.chockalingam at stfc.ac.uk> wrote:
>
> Hello,
>
>
>
> I am trying to run ex45 (in KSP tutorial) using hypre on gpus. I have
> attached the python configuration file and -log_view output from running
> the below command options
>
>
>
> mpirun -n 2 ./ex45 -log_view -da_grid_x 169 -da_grid_y 169 -da_grid_z 169
> -dm_mat_type mpiaijcusparse -dm_vec_type mpicuda -ksp_type gmres -pc_type
> hypre -pc_hypre_type boomeramg -ksp_gmres_restart 31
> -pc_hypre_boomeramg_strong_threshold 0.7 -ksp_monitor
>
>
>
> The problem was solved and converged but from the output file I suspect
> hypre is not running on gpus as PCApply and DMCreate does *not* record
> any gpu Mflop/s. However, some events such KSPSolve, MatMult etc are
> running on gpus.
>
>
>
> Can you please let me know if I need to add any extra flag to the attached
> arch-ci-linux-cuda11-double-xx.py script file to get hypre working on gpus?
>
>
>
> Thanks,
>
> Karthik.
>
>
>
>
>
> This email and any attachments are intended solely for the use of the
> named recipients. If you are not the intended recipient you must not use,
> disclose, copy or distribute this email or any of its attachments and
> should notify the sender immediately and delete this email from your
> system. UK Research and Innovation (UKRI) has taken every reasonable
> precaution to minimise risk of this email or any attachments containing
> viruses or malware but the recipient should carry out its own virus and
> malware checks before opening the attachments. UKRI does not accept any
> liability for any losses or damages which the recipient may sustain due to
> presence of any viruses.
>
>
>
>
> --
>
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
>
>
> https://www.cse.buffalo.edu/~knepley/
> <http://www.cse.buffalo.edu/~knepley/>
>
>
>
>
> --
>
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
>
>
> https://www.cse.buffalo.edu/~knepley/
> <http://www.cse.buffalo.edu/~knepley/>
>
>
>
> --
>
> Stefano
>
>
>
>
> --
>
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
>
>
> https://www.cse.buffalo.edu/~knepley/
> <http://www.cse.buffalo.edu/~knepley/>
>
--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener
https://www.cse.buffalo.edu/~knepley/ <http://www.cse.buffalo.edu/~knepley/>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20211122/b28df706/attachment-0001.html>
More information about the petsc-users
mailing list