[petsc-users] CG with right preconditioning supports NONE norm type only

Patrick Sanan patrick.sanan at gmail.com
Sat Mar 18 16:09:29 CDT 2017


There is indeed a way to interpret preconditioned CG as left preconditioning:

Consider

    Ax = b

where A is spd.

You can write down the equivalent left-preconditioned system

   M^{-1}Ax = M^{-1}b

but of course this doesn't work directly with CG because M^{-1}A is no
longer spd in general.

However, if M^{-1} is spd, so is M and you can use it to define an
inner product. In this inner product, M{-1}A *is* spd. Here assume
real scalars for simplicity:

  <M^{-1}Au,v>_M = <v,M^{-1}Au>_M = <v,Au> = <Av,u> = <u,Av> = <u,M^{-1}A>_M

So, you can write use CG, using this inner product everywhere. You
don't in fact ever need to apply M, just M^{-1}, and you arrive at
standard preconditioned CG (See these notes [1]).

This is what's in PETSc as left-preconditioned CG, the only option.

This is a bit different from the way left preconditioning works for
(say) GMRES. With GMRES you could assemble A' = M^{-1}A and b' =
M^{-1}b and pass A' and b' to black-box (unpreconditioned) GMRES. With
CG, you fundamentally need to provide both M^{-1} and A (and
fortunately PETSc lets you do exactly this).

This then invites the natural question about applying the same logic
to the right preconditioned system. I saw this as an exercise in the
same course notes as above, from J. M. Melenk at TU Wien [1] .

Consider the right-preconditioned system

    AM^{-1}y = b,    x = M^{-1}y .

You can note that AM^-1 is spd in the M^{-1} inner product, and again
write down CG for the preconditioned system, now using M^{-1} inner
products everywhere.

You can then (as in this photo which I'm too lazy to transcribe [2]),
introduce transformed variables z = M^{-1}r, q = M^{-1}p, x = M^{-1}y
and work things out and arrive at ... standard preconditioned CG!

Thus the conclusion is that there really is only one way to
precondition CG. You can derive it as
1. left-preconditioned CG in the M norm, or
2. right-preconditioned CG in the M^{-1} norm, or
3. symmetrically-preconditioned CG (with M^{-1/2} which you don't need
to explicitly form), or
4. standard CG with a new inner product.

The last option is, I think, the most satisfying and fundamental. The
recent Málek and Strakoš book [3] makes a beautiful case as to why,
and indeed Hestenes himself first wrote about this in 1956, four years
after he and Stiefel popularized CG and before the term
"preconditioning" even existed [4].

So, in terms of concrete things to do with PETSc, I stand by my
feeling that calling CG preconditioning "left preconditioning" is
somehow misleading.

Rather, I'd suggest calling the relevant value of PCSide something
like PC_SPD . This
i) reminds the user of the fact that the preconditioner has to be spd,
ii) removes the asymmetry of calling it a left preconditioner when you
could equally-well call it a right preconditioner, and
iii) highlights the fact that preconditioning is different with CG
than with methods like GMRES.

PC_LEFT could be deprecated for CG before removing support.

If this seems reasonable, I can make a PR with these changes and
updates to the man page (and should also think about what other KSPs
this applies to - probably KSPMINRES and others).

[1] http://www.asc.tuwien.ac.at/~melenk/teach/iterative/
[2] http://patricksanan.com/rpcg.jpg
[3] http://bookstore.siam.org/sl01/
[4] https://scholar.google.ch/scholar?q=M.+Hestenes+1956.+The+Conjugate+Gradient+Method+for+Solving+Linear+Systems



On Wed, Mar 8, 2017 at 6:57 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
>
>   Patrick,
>
>     Thanks, this is interesting, we should try to add material to the KSPCG page to capture some of the subtleties that I admit after 28 years I still don't really understand. The paper A TAXONOMY FOR CONJUGATE GRADIENT METHODS (attached) has some interesting discussion (particularly page 1548 "Therefore, only left preconditioning need be considered: Right preconditioning may be effected by incorporating it into the left preconditioner and inner product." I don't know exactly what this means in practical terms in respect to code. (In PETSc KSP we don't explicitly talk about, or use a separate "inner product"  in the Krylov methods, we only have the concept of operator and preconditioner operator.)
>
>   Remembering vaguely, perhaps incorrectly, from a real long time ago "left preconditioning with a spd M is just the unpreconditioned cg in the M inner product" while "right preconditioning with M is unpreconditioned cg in a M^-1 inner product". If this is correct then it implies (I think) that right preconditioned CG would produce a different set of iterates than left preconditioning and hence is "missing" in terms of completeness from PETSc KSP.
>
>   Barry
>
>
>
>
>> On Mar 8, 2017, at 7:37 PM, Patrick Sanan <patrick.sanan at gmail.com> wrote:
>>
>> On Wed, Mar 8, 2017 at 11:12 AM, Barry Smith <bsmith at mcs.anl.gov> wrote:
>>>
>>>> On Mar 8, 2017, at 10:47 AM, Kong, Fande <fande.kong at inl.gov> wrote:
>>>>
>>>> Thanks Barry,
>>>>
>>>> We are using "KSPSetPCSide(ksp, pcside)" in the code.  I just tried "-ksp_pc_side right", and petsc did not error out.
>>>>
>>>> I like to understand why CG does not work with right preconditioning? Mathematically, the right preconditioning does not make sense?
>>>
>>>   No, mathematically it makes sense to do it on the right. It is just that the PETSc code was never written to support it on the right. One reason is that CG is interesting that you can run with the true residual or the preconditioned residual with left preconditioning, hence less incentive to ever bother writing it to support right preconditioning. For completeness we should support right as well as symmetric.
>>
>> For standard CG preconditioning, which PETSc calls left
>> preconditioning, you use a s.p.d. preconditioner M to define an inner
>> product in the algorithm, and end up finding iterates x_k in K_k(MA;
>> Mb). That isn't quite the same as left-preconditioned GMRES, where you
>> apply standard GMRES to the equivalent system MAx=Mb, and also end up
>> finding iterates in K_k(MA,Mb). This wouldn't work for CG because MA
>> isn't s.p.d. in general, even if M and A are.
>>
>> Standard CG preconditioning is often motivated as a clever way to do
>> symmetric preconditioning, E^TAEy = E^Tb, x=Ey, without ever needing E
>> explicitly, using only M=EE^T . y_k lives in K_k(E^TAE,E^Tb) and thus
>> x_k again lives in K_k(MA;Mb).
>>
>> Thus, it's not clear that there is an candidate for a
>> right-preconditioned CG variant, as what PETSc calls "left"
>> preconditioning doesn't arise in the same way that it does for other
>> Krylov methods, namely using the standard algorithm on MAx=Mb. For
>> GMRES you would get a right-preconditioned variant by looking at the
>> transformed system AMy=b, x = My. This means that y_k lives in
>> K_k(AM,b), so x lives in K_k(MA,Mb), as before. For CG, AM wouldn't be
>> spd in general so this approach wouldn't make sense.
>>
>> Another way to look at the difference in "left" preconditioning
>> between GMRES and CG is that
>>
>> - introducing left preconditioning for GMRES alters both the Krylov
>> subspaces *and* the optimality condition: you go from minimizing || b
>> - Ax_k ||_2 over K_k(A;b) to minimizing || M (b-Ax_k) ||_2 over
>> K_k(MA;Mb).
>>
>> - introducing "left" preconditioning for CG alters *only* the Krylov
>> subspaces: you always minimize || x - x_k ||_A , but change the space
>> from K_k(A;b) to K_k(MA;Mb).
>>
>> Thus, my impression is that it's misleading to call standard CG
>> preconditioning "left" preconditioning in PETSc - someone might think
>> of GMRES and naturally ask why there is no right preconditioning.
>>
>> One might define a new entry in PCSide to be used with CG and friends.
>> I can't think of any slam dunk suggestions yet, but something in the
>> genre of PC_INNERPRODUCT, PC_METRIC, PC_CG, or PC_IMPLICITSYMMETRIC,
>> perhaps.
>>
>>
>>>
>>>  Barry
>>>
>>>>
>>>> Fande,
>>>>
>>>> On Wed, Mar 8, 2017 at 9:33 AM, Barry Smith <bsmith at mcs.anl.gov> wrote:
>>>>
>>>>  Please tell us how you got this output.
>>>>
>>>>  PETSc CG doesn't even implement right preconditioning. If you ask for it it should error out. CG supports no norm computation with left preconditioning.
>>>>
>>>>   Barry
>>>>
>>>>> On Mar 8, 2017, at 10:26 AM, Kong, Fande <fande.kong at inl.gov> wrote:
>>>>>
>>>>> Hi All,
>>>>>
>>>>> The NONE norm type is supported only when CG is used with a right preconditioner. Any reason for this?
>>>>>
>>>>>
>>>>>
>>>>> 0 Nonlinear |R| = 1.732051e+00
>>>>>      0 Linear |R| = 0.000000e+00
>>>>>      1 Linear |R| = 0.000000e+00
>>>>>      2 Linear |R| = 0.000000e+00
>>>>>      3 Linear |R| = 0.000000e+00
>>>>>      4 Linear |R| = 0.000000e+00
>>>>>      5 Linear |R| = 0.000000e+00
>>>>>      6 Linear |R| = 0.000000e+00
>>>>> 1 Nonlinear |R| = 1.769225e-08
>>>>>      0 Linear |R| = 0.000000e+00
>>>>>      1 Linear |R| = 0.000000e+00
>>>>>      2 Linear |R| = 0.000000e+00
>>>>>      3 Linear |R| = 0.000000e+00
>>>>>      4 Linear |R| = 0.000000e+00
>>>>>      5 Linear |R| = 0.000000e+00
>>>>>      6 Linear |R| = 0.000000e+00
>>>>>      7 Linear |R| = 0.000000e+00
>>>>>      8 Linear |R| = 0.000000e+00
>>>>>      9 Linear |R| = 0.000000e+00
>>>>>     10 Linear |R| = 0.000000e+00
>>>>> 2 Nonlinear |R| = 0.000000e+00
>>>>> SNES Object: 1 MPI processes
>>>>>  type: newtonls
>>>>>  maximum iterations=50, maximum function evaluations=10000
>>>>>  tolerances: relative=1e-08, absolute=1e-50, solution=1e-50
>>>>>  total number of linear solver iterations=18
>>>>>  total number of function evaluations=23
>>>>>  norm schedule ALWAYS
>>>>>  SNESLineSearch Object:   1 MPI processes
>>>>>    type: bt
>>>>>      interpolation: cubic
>>>>>      alpha=1.000000e-04
>>>>>    maxstep=1.000000e+08, minlambda=1.000000e-12
>>>>>    tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08
>>>>>    maximum iterations=40
>>>>>  KSP Object:   1 MPI processes
>>>>>    type: cg
>>>>>    maximum iterations=10000, initial guess is zero
>>>>>    tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
>>>>>    right preconditioning
>>>>>    using NONE norm type for convergence test
>>>>>  PC Object:   1 MPI processes
>>>>>    type: hypre
>>>>>      HYPRE BoomerAMG preconditioning
>>>>>      HYPRE BoomerAMG: Cycle type V
>>>>>      HYPRE BoomerAMG: Maximum number of levels 25
>>>>>      HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1
>>>>>      HYPRE BoomerAMG: Convergence tolerance PER hypre call 0.
>>>>>      HYPRE BoomerAMG: Threshold for strong coupling 0.25
>>>>>      HYPRE BoomerAMG: Interpolation truncation factor 0.
>>>>>      HYPRE BoomerAMG: Interpolation: max elements per row 0
>>>>>      HYPRE BoomerAMG: Number of levels of aggressive coarsening 0
>>>>>      HYPRE BoomerAMG: Number of paths for aggressive coarsening 1
>>>>>      HYPRE BoomerAMG: Maximum row sums 0.9
>>>>>      HYPRE BoomerAMG: Sweeps down         1
>>>>>      HYPRE BoomerAMG: Sweeps up           1
>>>>>      HYPRE BoomerAMG: Sweeps on coarse    1
>>>>>      HYPRE BoomerAMG: Relax down          symmetric-SOR/Jacobi
>>>>>      HYPRE BoomerAMG: Relax up            symmetric-SOR/Jacobi
>>>>>      HYPRE BoomerAMG: Relax on coarse     Gaussian-elimination
>>>>>      HYPRE BoomerAMG: Relax weight  (all)      1.
>>>>>      HYPRE BoomerAMG: Outer relax weight (all) 1.
>>>>>      HYPRE BoomerAMG: Using CF-relaxation
>>>>>      HYPRE BoomerAMG: Not using more complex smoothers.
>>>>>      HYPRE BoomerAMG: Measure type        local
>>>>>      HYPRE BoomerAMG: Coarsen type        Falgout
>>>>>      HYPRE BoomerAMG: Interpolation type  classical
>>>>>    linear system matrix followed by preconditioner matrix:
>>>>>    Mat Object:     1 MPI processes
>>>>>      type: mffd
>>>>>      rows=9, cols=9
>>>>>        Matrix-free approximation:
>>>>>          err=1.49012e-08 (relative error in function evaluation)
>>>>>          Using wp compute h routine
>>>>>              Does not compute normU
>>>>>    Mat Object:    ()     1 MPI processes
>>>>>      type: seqaij
>>>>>      rows=9, cols=9
>>>>>      total: nonzeros=49, allocated nonzeros=49
>>>>>      total number of mallocs used during MatSetValues calls =0
>>>>>        not using I-node routines
>>>>>
>>>>> Fande,
>
>

On Thu, Mar 9, 2017 at 3:57 AM, Barry Smith <bsmith at mcs.anl.gov> wrote:
>
>   Patrick,
>
>     Thanks, this is interesting, we should try to add material to the KSPCG page to capture some of the subtleties that I admit after 28 years I still don't really understand. The paper A TAXONOMY FOR CONJUGATE GRADIENT METHODS (attached) has some interesting discussion (particularly page 1548 "Therefore, only left preconditioning need be considered: Right preconditioning may be effected by incorporating it into the left preconditioner and inner product." I don't know exactly what this means in practical terms in respect to code. (In PETSc KSP we don't explicitly talk about, or use a separate "inner product"  in the Krylov methods, we only have the concept of operator and preconditioner operator.)
>
>   Remembering vaguely, perhaps incorrectly, from a real long time ago "left preconditioning with a spd M is just the unpreconditioned cg in the M inner product" while "right preconditioning with M is unpreconditioned cg in a M^-1 inner product". If this is correct then it implies (I think) that right preconditioned CG would produce a different set of iterates than left preconditioning and hence is "missing" in terms of completeness from PETSc KSP.
>
>   Barry
>
>
>
>
>> On Mar 8, 2017, at 7:37 PM, Patrick Sanan <patrick.sanan at gmail.com> wrote:
>>
>> On Wed, Mar 8, 2017 at 11:12 AM, Barry Smith <bsmith at mcs.anl.gov> wrote:
>>>
>>>> On Mar 8, 2017, at 10:47 AM, Kong, Fande <fande.kong at inl.gov> wrote:
>>>>
>>>> Thanks Barry,
>>>>
>>>> We are using "KSPSetPCSide(ksp, pcside)" in the code.  I just tried "-ksp_pc_side right", and petsc did not error out.
>>>>
>>>> I like to understand why CG does not work with right preconditioning? Mathematically, the right preconditioning does not make sense?
>>>
>>>   No, mathematically it makes sense to do it on the right. It is just that the PETSc code was never written to support it on the right. One reason is that CG is interesting that you can run with the true residual or the preconditioned residual with left preconditioning, hence less incentive to ever bother writing it to support right preconditioning. For completeness we should support right as well as symmetric.
>>
>> For standard CG preconditioning, which PETSc calls left
>> preconditioning, you use a s.p.d. preconditioner M to define an inner
>> product in the algorithm, and end up finding iterates x_k in K_k(MA;
>> Mb). That isn't quite the same as left-preconditioned GMRES, where you
>> apply standard GMRES to the equivalent system MAx=Mb, and also end up
>> finding iterates in K_k(MA,Mb). This wouldn't work for CG because MA
>> isn't s.p.d. in general, even if M and A are.
>>
>> Standard CG preconditioning is often motivated as a clever way to do
>> symmetric preconditioning, E^TAEy = E^Tb, x=Ey, without ever needing E
>> explicitly, using only M=EE^T . y_k lives in K_k(E^TAE,E^Tb) and thus
>> x_k again lives in K_k(MA;Mb).
>>
>> Thus, it's not clear that there is an candidate for a
>> right-preconditioned CG variant, as what PETSc calls "left"
>> preconditioning doesn't arise in the same way that it does for other
>> Krylov methods, namely using the standard algorithm on MAx=Mb. For
>> GMRES you would get a right-preconditioned variant by looking at the
>> transformed system AMy=b, x = My. This means that y_k lives in
>> K_k(AM,b), so x lives in K_k(MA,Mb), as before. For CG, AM wouldn't be
>> spd in general so this approach wouldn't make sense.
>>
>> Another way to look at the difference in "left" preconditioning
>> between GMRES and CG is that
>>
>> - introducing left preconditioning for GMRES alters both the Krylov
>> subspaces *and* the optimality condition: you go from minimizing || b
>> - Ax_k ||_2 over K_k(A;b) to minimizing || M (b-Ax_k) ||_2 over
>> K_k(MA;Mb).
>>
>> - introducing "left" preconditioning for CG alters *only* the Krylov
>> subspaces: you always minimize || x - x_k ||_A , but change the space
>> from K_k(A;b) to K_k(MA;Mb).
>>
>> Thus, my impression is that it's misleading to call standard CG
>> preconditioning "left" preconditioning in PETSc - someone might think
>> of GMRES and naturally ask why there is no right preconditioning.
>>
>> One might define a new entry in PCSide to be used with CG and friends.
>> I can't think of any slam dunk suggestions yet, but something in the
>> genre of PC_INNERPRODUCT, PC_METRIC, PC_CG, or PC_IMPLICITSYMMETRIC,
>> perhaps.
>>
>>
>>>
>>>  Barry
>>>
>>>>
>>>> Fande,
>>>>
>>>> On Wed, Mar 8, 2017 at 9:33 AM, Barry Smith <bsmith at mcs.anl.gov> wrote:
>>>>
>>>>  Please tell us how you got this output.
>>>>
>>>>  PETSc CG doesn't even implement right preconditioning. If you ask for it it should error out. CG supports no norm computation with left preconditioning.
>>>>
>>>>   Barry
>>>>
>>>>> On Mar 8, 2017, at 10:26 AM, Kong, Fande <fande.kong at inl.gov> wrote:
>>>>>
>>>>> Hi All,
>>>>>
>>>>> The NONE norm type is supported only when CG is used with a right preconditioner. Any reason for this?
>>>>>
>>>>>
>>>>>
>>>>> 0 Nonlinear |R| = 1.732051e+00
>>>>>      0 Linear |R| = 0.000000e+00
>>>>>      1 Linear |R| = 0.000000e+00
>>>>>      2 Linear |R| = 0.000000e+00
>>>>>      3 Linear |R| = 0.000000e+00
>>>>>      4 Linear |R| = 0.000000e+00
>>>>>      5 Linear |R| = 0.000000e+00
>>>>>      6 Linear |R| = 0.000000e+00
>>>>> 1 Nonlinear |R| = 1.769225e-08
>>>>>      0 Linear |R| = 0.000000e+00
>>>>>      1 Linear |R| = 0.000000e+00
>>>>>      2 Linear |R| = 0.000000e+00
>>>>>      3 Linear |R| = 0.000000e+00
>>>>>      4 Linear |R| = 0.000000e+00
>>>>>      5 Linear |R| = 0.000000e+00
>>>>>      6 Linear |R| = 0.000000e+00
>>>>>      7 Linear |R| = 0.000000e+00
>>>>>      8 Linear |R| = 0.000000e+00
>>>>>      9 Linear |R| = 0.000000e+00
>>>>>     10 Linear |R| = 0.000000e+00
>>>>> 2 Nonlinear |R| = 0.000000e+00
>>>>> SNES Object: 1 MPI processes
>>>>>  type: newtonls
>>>>>  maximum iterations=50, maximum function evaluations=10000
>>>>>  tolerances: relative=1e-08, absolute=1e-50, solution=1e-50
>>>>>  total number of linear solver iterations=18
>>>>>  total number of function evaluations=23
>>>>>  norm schedule ALWAYS
>>>>>  SNESLineSearch Object:   1 MPI processes
>>>>>    type: bt
>>>>>      interpolation: cubic
>>>>>      alpha=1.000000e-04
>>>>>    maxstep=1.000000e+08, minlambda=1.000000e-12
>>>>>    tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08
>>>>>    maximum iterations=40
>>>>>  KSP Object:   1 MPI processes
>>>>>    type: cg
>>>>>    maximum iterations=10000, initial guess is zero
>>>>>    tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
>>>>>    right preconditioning
>>>>>    using NONE norm type for convergence test
>>>>>  PC Object:   1 MPI processes
>>>>>    type: hypre
>>>>>      HYPRE BoomerAMG preconditioning
>>>>>      HYPRE BoomerAMG: Cycle type V
>>>>>      HYPRE BoomerAMG: Maximum number of levels 25
>>>>>      HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1
>>>>>      HYPRE BoomerAMG: Convergence tolerance PER hypre call 0.
>>>>>      HYPRE BoomerAMG: Threshold for strong coupling 0.25
>>>>>      HYPRE BoomerAMG: Interpolation truncation factor 0.
>>>>>      HYPRE BoomerAMG: Interpolation: max elements per row 0
>>>>>      HYPRE BoomerAMG: Number of levels of aggressive coarsening 0
>>>>>      HYPRE BoomerAMG: Number of paths for aggressive coarsening 1
>>>>>      HYPRE BoomerAMG: Maximum row sums 0.9
>>>>>      HYPRE BoomerAMG: Sweeps down         1
>>>>>      HYPRE BoomerAMG: Sweeps up           1
>>>>>      HYPRE BoomerAMG: Sweeps on coarse    1
>>>>>      HYPRE BoomerAMG: Relax down          symmetric-SOR/Jacobi
>>>>>      HYPRE BoomerAMG: Relax up            symmetric-SOR/Jacobi
>>>>>      HYPRE BoomerAMG: Relax on coarse     Gaussian-elimination
>>>>>      HYPRE BoomerAMG: Relax weight  (all)      1.
>>>>>      HYPRE BoomerAMG: Outer relax weight (all) 1.
>>>>>      HYPRE BoomerAMG: Using CF-relaxation
>>>>>      HYPRE BoomerAMG: Not using more complex smoothers.
>>>>>      HYPRE BoomerAMG: Measure type        local
>>>>>      HYPRE BoomerAMG: Coarsen type        Falgout
>>>>>      HYPRE BoomerAMG: Interpolation type  classical
>>>>>    linear system matrix followed by preconditioner matrix:
>>>>>    Mat Object:     1 MPI processes
>>>>>      type: mffd
>>>>>      rows=9, cols=9
>>>>>        Matrix-free approximation:
>>>>>          err=1.49012e-08 (relative error in function evaluation)
>>>>>          Using wp compute h routine
>>>>>              Does not compute normU
>>>>>    Mat Object:    ()     1 MPI processes
>>>>>      type: seqaij
>>>>>      rows=9, cols=9
>>>>>      total: nonzeros=49, allocated nonzeros=49
>>>>>      total number of mallocs used during MatSetValues calls =0
>>>>>        not using I-node routines
>>>>>
>>>>> Fande,
>
>


More information about the petsc-users mailing list