[petsc-users] 'Preconditioning' with lower-order method

Sun Mar 3 23:48:12 CST 2024

From: Jed Brown <jed at jedbrown.org>
Date: Sunday, March 3, 2024 at 11:35 PM
To: Zou, Ling <lzou at anl.gov>, Barry Smith <bsmith at petsc.dev>
Cc: petsc-users at mcs.anl.gov <petsc-users at mcs.anl.gov>
Subject: Re: [petsc-users] 'Preconditioning' with lower-order method
If you're having PETSc use coloring and have confirmed that the stencil is sufficient, then it would be nonsmoothness (again, consider the limiter you've chosen) preventing quadratic convergence (assuming that doesn't kick in eventually). Note
ZjQcmQRYFpfptBannerStart
This Message Is From an External Sender
This message came from outside your organization.

ZjQcmQRYFpfptBannerEnd

If you're having PETSc use coloring and have confirmed that the stencil is sufficient, then it would be nonsmoothness (again, consider the limiter you've chosen) preventing quadratic convergence (assuming that doesn't kick in eventually).

·         Yes, I do use coloring, and I do provide sufficient stencil, i.e., neighbor’s neighbor. The sufficiency is confirmed by PETSc’s -snes_test_jacobian and -snes_test_jacobian_view options.

Note that assembling a Jacobian of a second order TVD operator requires at least second neighbors while the first order needs only first neighbors, thus is much sparser and needs fewer colors to compute.

·         In my code implementation, when marking the Jacobian nonzero pattern, I don’t differentiate FV1 or FV2, I always use the FV2 stencil, so it’s a bit ‘fat’ for the FV1 method, but worked just fine.

I expect you're either not exploiting that in the timings or something else is amiss. You can run with `-log_view -snes_view -ksp_converged_reason` to get a bit more information about what's happening.

·         The attached is screen output as you suggest. The linear and nonlinear performance of FV2 is both worse from the output.

FV2:

Time Step 149, time = 13229.7, dt = 100

    NL Step =  0, fnorm =  7.80968E-03

  Linear solve converged due to CONVERGED_RTOL iterations 26

    NL Step =  1, fnorm =  7.65731E-03

  Linear solve converged due to CONVERGED_RTOL iterations 24

    NL Step =  2, fnorm =  6.85034E-03

  Linear solve converged due to CONVERGED_RTOL iterations 27

    NL Step =  3, fnorm =  6.11873E-03

  Linear solve converged due to CONVERGED_RTOL iterations 25

    NL Step =  4, fnorm =  1.57347E-03

  Linear solve converged due to CONVERGED_RTOL iterations 27

    NL Step =  5, fnorm =  9.03536E-04

SNES Object: 1 MPI process

  type: newtonls

  maximum iterations=20, maximum function evaluations=10000

  tolerances: relative=1e-08, absolute=1e-06, solution=1e-08

  total number of linear solver iterations=129

  total number of function evaluations=144

  norm schedule ALWAYS

  Jacobian is applied matrix-free with differencing

  Preconditioning Jacobian is built using finite differences with coloring

  SNESLineSearch Object: 1 MPI process

    type: bt

      interpolation: cubic

      alpha=1.000000e-04

    maxstep=1.000000e+08, minlambda=1.000000e-12

    tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08

    maximum iterations=40

  KSP Object: 1 MPI process

    type: gmres

      restart=100, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement

      happy breakdown tolerance 1e-30

    maximum iterations=100, initial guess is zero

    tolerances:  relative=0.0001, absolute=1e-50, divergence=10000.

    left preconditioning

    using PRECONDITIONED norm type for convergence test

  PC Object: 1 MPI process

    type: ilu

      out-of-place factorization

      0 levels of fill

      tolerance for zero pivot 2.22045e-14

      using diagonal shift to prevent zero pivot [NONZERO]

      matrix ordering: rcm

      factor fill ratio given 1., needed 1.

        Factored matrix follows:

          Mat Object: 1 MPI process

            type: seqaij

            rows=8715, cols=8715

            package used to perform factorization: petsc

            total: nonzeros=38485, allocated nonzeros=38485

              not using I-node routines

    linear system matrix followed by preconditioner matrix:

    Mat Object: 1 MPI process

      type: mffd

      rows=8715, cols=8715

        Matrix-free approximation:

          err=1.49012e-08 (relative error in function evaluation)

          Using wp compute h routine

              Does not compute normU

    Mat Object: 1 MPI process

      type: seqaij

      rows=8715, cols=8715

      total: nonzeros=38485, allocated nonzeros=38485

      total number of mallocs used during MatSetValues calls=0

        not using I-node routines

 Solve Converged!

FV1:

Time Step 149, time = 13229.7, dt = 100

    NL Step =  0, fnorm =  7.90072E-03

  Linear solve converged due to CONVERGED_RTOL iterations 12

    NL Step =  1, fnorm =  2.01919E-04

  Linear solve converged due to CONVERGED_RTOL iterations 17

    NL Step =  2, fnorm =  1.06960E-05

  Linear solve converged due to CONVERGED_RTOL iterations 15

    NL Step =  3, fnorm =  2.41683E-09

SNES Object: 1 MPI process

  type: newtonls

  maximum iterations=20, maximum function evaluations=10000

  tolerances: relative=1e-08, absolute=1e-06, solution=1e-08

  total number of linear solver iterations=44

  total number of function evaluations=51

  norm schedule ALWAYS

  Jacobian is applied matrix-free with differencing

  Preconditioning Jacobian is built using finite differences with coloring

  SNESLineSearch Object: 1 MPI process

    type: bt

      interpolation: cubic

      alpha=1.000000e-04

    maxstep=1.000000e+08, minlambda=1.000000e-12

    tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08

    maximum iterations=40

  KSP Object: 1 MPI process

    type: gmres

      restart=100, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement

      happy breakdown tolerance 1e-30

    maximum iterations=100, initial guess is zero

    tolerances:  relative=0.0001, absolute=1e-50, divergence=10000.

    left preconditioning

    using PRECONDITIONED norm type for convergence test

  PC Object: 1 MPI process

    type: ilu

      out-of-place factorization

      0 levels of fill

      tolerance for zero pivot 2.22045e-14

      using diagonal shift to prevent zero pivot [NONZERO]

      matrix ordering: rcm

      factor fill ratio given 1., needed 1.

        Factored matrix follows:

          Mat Object: 1 MPI process

            type: seqaij

            rows=8715, cols=8715

            package used to perform factorization: petsc

            total: nonzeros=38485, allocated nonzeros=38485

              not using I-node routines

    linear system matrix followed by preconditioner matrix:

    Mat Object: 1 MPI process

      type: mffd

      rows=8715, cols=8715

        Matrix-free approximation:

          err=1.49012e-08 (relative error in function evaluation)

          Using wp compute h routine

              Does not compute normU

    Mat Object: 1 MPI process

      type: seqaij

      rows=8715, cols=8715

      total: nonzeros=38485, allocated nonzeros=38485

      total number of mallocs used during MatSetValues calls=0

        not using I-node routines

 Solve Converged!

"Zou, Ling via petsc-users" <petsc-users at mcs.anl.gov> writes:

> Barry, thank you.

> I am not sure if I exactly follow you on this:

> “Are you forming the Jacobian for the first and second order cases inside of Newton?”

>

> The problem that we deal with, heat/mass transfer in heterogeneous systems (reactor system), is generally small in terms of size, i.e., # of DOFs (several k to maybe 100k level), so for now, I completely rely on PETSc to compute Jacobian, i.e., finite-differencing.

>

> That’s a good suggestion to see the time spent during various events.

> What motivated me to try the options are the following observations.

>

> 2nd order FVM:

>

> Time Step 149, time = 13229.7, dt = 100

>

>     NL Step =  0, fnorm =  7.80968E-03

>

>     NL Step =  1, fnorm =  7.65731E-03

>

>     NL Step =  2, fnorm =  6.85034E-03

>

>     NL Step =  3, fnorm =  6.11873E-03

>

>     NL Step =  4, fnorm =  1.57347E-03

>

>     NL Step =  5, fnorm =  9.03536E-04

>

>  Solve Converged!

>

> 1st order FVM:

>

> Time Step 149, time = 13229.7, dt = 100

>

>     NL Step =  0, fnorm =  7.90072E-03

>

>     NL Step =  1, fnorm =  2.01919E-04

>

>     NL Step =  2, fnorm =  1.06960E-05

>

>     NL Step =  3, fnorm =  2.41683E-09

>

>  Solve Converged!

>

> Notice the obvious ‘stagnant’ in residual for the 2nd order method while not in the 1st order.

> For the same problem, the wall time is 10 sec vs 6 sec. I would be happy if I can reduce 2 sec for the 2nd order method.

>

> -Ling

>

> From: Barry Smith <bsmith at petsc.dev>

> Date: Sunday, March 3, 2024 at 12:06 PM

> To: Zou, Ling <lzou at anl.gov>

> Cc: petsc-users at mcs.anl.gov <petsc-users at mcs.anl.gov>

> Subject: Re: [petsc-users] 'Preconditioning' with lower-order method

> Are you forming the Jacobian for the first and second order cases inside of Newton? You can run both with -log_view to see how much time is spent in the various events (compute function, compute Jacobian, linear solve, .. . ) for the two cases

> ZjQcmQRYFpfptBannerStart

> This Message Is From an External Sender

> This message came from outside your organization.

>

> ZjQcmQRYFpfptBannerEnd

>

>    Are you forming the Jacobian for the first and second order cases inside of Newton?

>

>    You can run both with -log_view to see how much time is spent in the various events (compute function, compute Jacobian, linear solve, ...) for the two cases and compare them.

>

>

>

>

> On Mar 3, 2024, at 11:42 AM, Zou, Ling via petsc-users <petsc-users at mcs.anl.gov> wrote:

>

> Original email may have been sent to the incorrect place.

> See below.

>

> -Ling

>

> From: Zou, Ling <lzou at anl.gov<mailto:lzou at anl.gov>>

> Date: Sunday, March 3, 2024 at 10:34 AM

> To: petsc-users <petsc-users-bounces at mcs.anl.gov<mailto:petsc-users-bounces at mcs.anl.gov>>

> Subject: 'Preconditioning' with lower-order method

> Hi all,

>

> I am solving a PDE system over a spatial domain. Numerical methods are:

>

>   *   Finite volume method (both 1st and 2nd order implemented)

>   *   BDF1 and BDF2 for time integration.

> What I have noticed is that 1st order FVM converges much faster than 2nd order FVM, regardless the time integration scheme. Well, not surprising since 2nd order FVM introduces additional non-linearity.

>

> I’m thinking about two possible ways to speed up 2nd order FVM, and would like to get some thoughts or community knowledge before jumping into code implementation.

>

> Say, let the 2nd order FVM residual function be F2(x) = 0; and the 1st order FVM residual function be F1(x) = 0.

>

>   1.  Option – 1, multi-step for each time step

> Step 1: solving F1(x) = 0 to obtain a temporary solution x1

> Step 2: feed x1 as an initial guess to solve F2(x) = 0 to obtain the final solution.

> [Not sure if gain any saving at all]

>

>

>   1.  Option -2, dynamically changing residual function F(x)

> In pseudo code, would be something like.

>

> snesFormFunction(SNES snes, Vec u, Vec f, void *)

> {

>   if (snes.nl_it_no < 4) // 4 being arbitrary here

>     f = F1(u);

>   else

>     f = F2(u);

> }

>

> I know this might be a bit crazy since it may crash after switching residual function, still, any thoughts?

>

> Best,

>

> -Ling
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20240304/754fcc73/attachment-0001.html>