[petsc-users] What is the right way to implement a (block) Diagonal ILU as PC?
hzhang at mcs.anl.gov
Tue Feb 4 13:16:04 CST 2020
Hao:
I would suggest using a parallel sparse direct solver, e.g., SuperLU_DIST
or MUMPS. These solvers can take advantage of your sparse data structure.
Once it works, then you may play with other preconditioners, such as
bjacobi + lu/ilu. See
https://www.mcs.anl.gov/petsc/miscellaneous/external.html
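For example (assuming PETSc was configured with these packages, and using
the standard runtime option names), you can switch solvers without
recompiling:

```shell
# direct solve via SuperLU_DIST
-ksp_type preonly -pc_type lu -pc_factor_mat_solver_type superlu_dist
# or via MUMPS
-ksp_type preonly -pc_type lu -pc_factor_mat_solver_type mumps
```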
Hong
Dear all,
>
>
> I have a few questions about the implementation of diagonal ILU PC in
> PETSc. I want to solve a very simple system with KSP (in parallel), the
> nature of the system (finite difference time-harmonic Maxwell) is probably
> not important to the question itself. Long story short, I just need to
> solve a set of equations of Ax = b with a block diagonal system matrix,
> like (not sure if the mono font works):
>
>         | X     |
>     A = |   Y   |
>         |     Z |
>
> Note that A is not really block-diagonal; it's a multi-diagonal matrix
> determined by the FD mesh, with most elements close to the diagonal. So
> instead of a full ILU decomposition, a D-ILU (block-diagonal ILU, with an
> incomplete factorization of each diagonal block) is a good approximation
> as a preconditioner, and the number of blocks should not matter too much:
>
>         | Lx      |               | Ux      |
>     L = |    Ly   |   and     U = |    Uy   |
>         |      Lz |               |      Uz |
>
> Where [Lx, Ux] = ILU0(X), etc. Indeed, the D-ILU preconditioner (with 3N
> blocks) is quite efficient with Krylov subspace methods like BiCGstab or
> QMR in my sequential Fortran/Matlab code.
>
> So like most users, I am looking for a parallel implementation of this
> preconditioner in PETSc. After looking through the manual, it seems that
> the most straightforward way to do it is through PCBJACOBI. If I
> understand it correctly, I just set up a 3-block PCBJACOBI and give each
> block a KSP with PCILU. Is this supposed to be equivalent to my D-ILU
> preconditioner? My little block of Fortran code looks like:
> ...
> call PCBJacobiSetTotalBlocks(pc_local, Ntotal, isubs, ierr)
> call PCBJacobiSetLocalBlocks(pc_local, Nsub, isubs(istart:iend), ierr)
> ! set up the block Jacobi structure
> call KSPSetUp(ksp_local, ierr)
> ! allocate sub-KSPs
> allocate(ksp_sub(Nsub))
> call PCBJacobiGetSubKSP(pc_local, Nsub, istart, ksp_sub, ierr)
> do i = 1, Nsub
>   call KSPGetPC(ksp_sub(i), pc_sub, ierr)
>   ! ILU preconditioner on each block
>   call PCSetType(pc_sub, ptype, ierr)        ! ptype = PCILU
>   call PCFactorSetLevels(pc_sub, 1, ierr)    ! use ILU(1) here
>   call KSPSetType(ksp_sub(i), KSPPREONLY, ierr)
> end do
> call KSPSetTolerances(ksp_local, KSSiter%tol, PETSC_DEFAULT_REAL, &
>      PETSC_DEFAULT_REAL, KSSiter%maxit, ierr)
> ...
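> For reference, I believe the same setup can also be expressed entirely
> through runtime options, which would make it easy to vary the block count
> without recompiling (a sketch, using what I understand to be the standard
> PETSc option names):
>
> ```shell
> # BiCGstab with 3-block block-Jacobi, ILU(1) on each block
> -ksp_type bcgs -pc_type bjacobi -pc_bjacobi_blocks 3 \
>   -sub_ksp_type preonly -sub_pc_type ilu -sub_pc_factor_levels 1
> ```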
>
> I understand that the parallel performance may not be comparable, so I
> first set up a single-process test (with MPIAIJ, but all rows are local
> since there is only one process). The system is solved without any
> problem (identical results within error). But the performance is actually
> a lot worse (code built without debugging flags in performance tests) than
> my own home-brew Fortran implementation (I wrote my own ILU(0) in CSR
> sparse matrix format), which is hard to believe. I suspect the difference
> comes from the PC, as the PETSc version takes many more BiCGstab
> iterations to converge to the same relative tolerance (100-ish vs. the
> 60-ish my own code needs).
>
> This is further confirmed when I change the setup of the D-ILU (using 6
> or 9 blocks instead of 3). While my Fortran/Matlab codes see minimal
> performance difference (<5%) when I play with the D-ILU setup, increasing
> the number of D-ILU blocks from 3 to 9 causes the KSP solve with
> PCBJACOBI to suffer a performance decrease of more than 50% in my
> sequential test. So my PETSc setup IS somehow different from my own
> implementation. Did I miss something in the PCBJACOBI setup? Or do I have
> some fundamental misunderstanding of how PCBJACOBI works in PETSc?
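> To confirm what the solver is actually doing, I have been checking the
> runs with monitoring options along these lines (assuming the usual PETSc
> option names):
>
> ```shell
> # print the configured solver, the true residual history, and a timing log
> -ksp_view -ksp_monitor_true_residual -log_view
> ```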
>
> If this is not the right way to implement a block-diagonal ILU as a
> (parallel) PC, please kindly point me in the right direction. I searched
> through the mailing list for answers, only to find a couple of similar
> questions... An example would be nice.
>
> On the other hand, does PETSc offer a simple way to use an explicit L/U
> factorization as a preconditioner? I could import the D-ILU factors
> constructed in my Fortran code (I have already converted my A matrix into
> a Mat) to make a better comparison. Or do I have to build my own PC with
> PCSHELL? If so, is there a good tutorial/example for using PCSHELL (in a
> more advanced sense, e.g., how to set up the PC side and the transpose
> apply)?
>
> Thanks in advance,
>
> Hao
>
>