[petsc-dev] CUDA GAMG coarse grid solver

Smith, Barry F. bsmith at mcs.anl.gov
Sun Jul 21 21:13:58 CDT 2019


> -mg_coarse_sub_mat_solver_type value: cusparse

  It is a PC factor option so it is 

  -mg_coarse_sub_pc_factor_mat_solver_type cusparse 

  -help | grep mg_coarse_sub

should have found it.
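
  If you want to hard-wire it instead of using the option, here is a minimal sketch (assuming a CUDA-enabled build, that the coarse PC really is bjacobi as in your view below, and that this is called after KSPSetUp(); the helper name is made up):

  #include <petscksp.h>

  static PetscErrorCode SetCoarseFactorCUSPARSE(PC gamgpc)
  {
    PetscErrorCode ierr;
    KSP            coarseksp,*subksp;
    PC             coarsepc,subpc;
    PetscInt       n,first;

    PetscFunctionBegin;
    /* grab the coarse-level solve of the multigrid hierarchy */
    ierr = PCMGGetCoarseSolve(gamgpc,&coarseksp);CHKERRQ(ierr);
    ierr = KSPGetPC(coarseksp,&coarsepc);CHKERRQ(ierr);
    /* the bjacobi sub-KSPs only exist once the PC has been set up */
    ierr = PCBJacobiGetSubKSP(coarsepc,&n,&first,&subksp);CHKERRQ(ierr);
    ierr = KSPGetPC(subksp[0],&subpc);CHKERRQ(ierr);
    /* ask the LU factorization to use the cusparse solver package */
    ierr = PCFactorSetMatSolverType(subpc,MATSOLVERCUSPARSE);CHKERRQ(ierr);
    PetscFunctionReturn(0);
  }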

Barry


> On Jul 21, 2019, at 8:12 PM, Mark Adams <mfadams at lbl.gov> wrote:
> 
> Barry, 
> 
> Option left: name:-mg_coarse_mat_solver_type value: cusparse
> 
> I tried this too:
> 
> Option left: name:-mg_coarse_sub_mat_solver_type value: cusparse
> 
> Here is the view. cuda did not get into the factor type.
> 
> PC Object: 24 MPI processes
>   type: gamg
>     type is MULTIPLICATIVE, levels=5 cycles=v
>       Cycles per PCApply=1
>       Using externally compute Galerkin coarse grid matrices
>       GAMG specific options
>         Threshold for dropping small values in graph on each level =   0.05   0.025   0.0125  
>         Threshold scaling factor for each level not specified = 0.5
>         AGG specific options
>           Symmetric graph false
>           Number of levels to square graph 10
>           Number smoothing steps 1
>         Complexity:    grid = 1.14213
>   Coarse grid solver -- level -------------------------------
>     KSP Object: (mg_coarse_) 24 MPI processes
>       type: preonly
>       maximum iterations=10000, initial guess is zero
>       tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
>       left preconditioning
>       using NONE norm type for convergence test
>     PC Object: (mg_coarse_) 24 MPI processes
>       type: bjacobi
>         number of blocks = 24
>         Local solve is same for all blocks, in the following KSP and PC objects:
>       KSP Object: (mg_coarse_sub_) 1 MPI processes
>         type: preonly
>         maximum iterations=1, initial guess is zero
>         tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
>         left preconditioning
>         using NONE norm type for convergence test
>       PC Object: (mg_coarse_sub_) 1 MPI processes
>         type: lu
>           out-of-place factorization
>           tolerance for zero pivot 2.22045e-14
>           using diagonal shift on blocks to prevent zero pivot [INBLOCKS]
>           matrix ordering: nd
>           factor fill ratio given 5., needed 1.
>             Factored matrix follows:
>               Mat Object: 1 MPI processes
>                 type: seqaij
>                 rows=6, cols=6
>                 package used to perform factorization: petsc
>                 total: nonzeros=36, allocated nonzeros=36
>                 total number of mallocs used during MatSetValues calls =0
>                   using I-node routines: found 2 nodes, limit used is 5
>         linear system matrix = precond matrix:
>         Mat Object: 1 MPI processes
>           type: seqaijcusparse
>           rows=6, cols=6
>           total: nonzeros=36, allocated nonzeros=36
>           total number of mallocs used during MatSetValues calls =0
>             using I-node routines: found 2 nodes, limit used is 5
>       linear system matrix = precond matrix:
>       Mat Object: 24 MPI processes
>         type: mpiaijcusparse
>         rows=6, cols=6, bs=6
>         total: nonzeros=36, allocated nonzeros=36
>         total number of mallocs used during MatSetValues calls =0
>           using scalable MatPtAP() implementation
>           using I-node (on process 0) routines: found 2 nodes, limit used is 5
>   Down solver (pre-smoother) on level 1 -------------------------------
> 
> 
> 
> On Sun, Jul 21, 2019 at 3:58 PM Mark Adams <mfadams at lbl.gov> wrote:
> Barry, I do NOT see communication. This is what made me think it was not running on the GPU. I added print statements and found that MatSolverTypeRegister_CUSPARSE IS called, but the routine it registers, MatGetFactor_seqaijcusparse_cusparse, does NOT get called.
> 
> I have a job waiting on the queue. I'll send ksp_view when it runs. I will try -mg_coarse_mat_solver_type cusparse. That is probably the problem. Maybe I should set the coarse grid solver in a more robust way in GAMG, like use the matrix somehow? I currently use PCSetType(pc, PCLU).
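> 
> Something along these lines is what I have in mind (just a sketch; MaybeUseDeviceFactor is a made-up name, and I am assuming MatGetFactorAvailable() and PCFactorSetMatSolverType() are the right hooks):
> 
>   static PetscErrorCode MaybeUseDeviceFactor(PC subpc,Mat cmat)
>   {
>     PetscErrorCode ierr;
>     PetscBool      iscusp,avail;
> 
>     PetscFunctionBegin;
>     /* if the coarse matrix lives on the GPU, prefer a GPU factorization package */
>     ierr = PetscObjectTypeCompare((PetscObject)cmat,MATSEQAIJCUSPARSE,&iscusp);CHKERRQ(ierr);
>     if (iscusp) {
>       ierr = MatGetFactorAvailable(cmat,MATSOLVERCUSPARSE,MAT_FACTOR_LU,&avail);CHKERRQ(ierr);
>       if (avail) {ierr = PCFactorSetMatSolverType(subpc,MATSOLVERCUSPARSE);CHKERRQ(ierr);}
>     }
>     PetscFunctionReturn(0);
>   }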
> 
> I can't get an interactive shell now to run DDT, but I can try stepping through from MatGetFactor to see what it's doing.
> 
> Thanks,
> Mark
> 
> On Sun, Jul 21, 2019 at 11:14 AM Smith, Barry F. <bsmith at mcs.anl.gov> wrote:
> 
> 
> > On Jul 21, 2019, at 8:55 AM, Mark Adams via petsc-dev <petsc-dev at mcs.anl.gov> wrote:
> > 
> > I am running ex56 with -ex56_dm_vec_type cuda -ex56_dm_mat_type aijcusparse and I see no GPU communication in MatSolve (the serial LU coarse grid solver).
> 
>    Do you mean to say, you DO see communication?
> 
>    What does -ksp_view show you? It should show the factor type in the information about the coarse grid solve.
> 
>    You might need something like -mg_coarse_mat_solver_type cusparse (because it may default to the PETSc one; it may be possible to have it default to cusparse if it exists and the matrix is of type MATSEQAIJCUSPARSE).
> 
>    The determination of the factorization in MatGetFactor() is a bit involved, including pasting together strings and doing string compares, and it could be finding a CPU factorization.
> 
>    I could run on one MPI rank in the debugger, put a break point in MatGetFactor(), and track along to see what it picks and why. You could do this debugging without GAMG first, with just -pc_type lu.
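> 
>    For example (just a sketch of the debugger invocation; pass whatever problem options you normally give ex56):
> 
>      ./ex56 -ex56_dm_mat_type aijcusparse -ex56_dm_vec_type cuda -pc_type lu -start_in_debugger noxterm
>      (gdb) break MatGetFactor
>      (gdb) continue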
> 
> > GAMG does set the coarse grid solver to LU manually like this: ierr = PCSetType(pc2, PCLU);CHKERRQ(ierr);
> 
>   For parallel runs this won't work with the GPU code, since there are only sequential direct solvers, so it must be using BJACOBI in that case?
> 
>    Barry
> 
> 
> 
> 
> 
> > I am thinking the dispatch of the CUDA version of this got dropped somehow. 
> > 
> > I see that this is getting called:
> > 
> > PETSC_EXTERN PetscErrorCode MatSolverTypeRegister_CUSPARSE(void)
> > {
> >   PetscErrorCode ierr;
> > 
> >   PetscFunctionBegin;
> >   ierr = MatSolverTypeRegister(MATSOLVERCUSPARSE,MATSEQAIJCUSPARSE,MAT_FACTOR_LU,MatGetFactor_seqaijcusparse_cusparse);CHKERRQ(ierr);
> >   ierr = MatSolverTypeRegister(MATSOLVERCUSPARSE,MATSEQAIJCUSPARSE,MAT_FACTOR_CHOLESKY,MatGetFactor_seqaijcusparse_cusparse);CHKERRQ(ierr);
> >   ierr = MatSolverTypeRegister(MATSOLVERCUSPARSE,MATSEQAIJCUSPARSE,MAT_FACTOR_ILU,MatGetFactor_seqaijcusparse_cusparse);CHKERRQ(ierr);
> >   ierr = MatSolverTypeRegister(MATSOLVERCUSPARSE,MATSEQAIJCUSPARSE,MAT_FACTOR_ICC,MatGetFactor_seqaijcusparse_cusparse);CHKERRQ(ierr);
> >   PetscFunctionReturn(0);
> > }
> > 
> > but MatGetFactor_seqaijcusparse_cusparse is not getting called.
> > 
> > GAMG does set the coarse grid solver to LU manually like this: ierr = PCSetType(pc2, PCLU);CHKERRQ(ierr);
> > 
> > Any ideas?
> > 
> > Thanks,
> > Mark
> > 
> > 
> 


