[petsc-users] FEM SUPG gmres
Jed Brown
jed at jedbrown.org
Fri May 29 22:59:21 CDT 2026
The short answer is that diagonal scaling does to the *linear* solver (only) what nondimensionalization does to the nonlinear problem (and transitively the linear problem). Since PTC adjusts tolerances based on the nonlinear residual, nondimensionalization is good. One way of doing that is to simply choose units of an appropriate order of magnitude (e.g., measure pressure in bar instead of Pa).
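As a rough illustration of why the choice of units matters, here is a minimal sketch that builds a 1D conservative state in SI units and then rescales it by reference quantities. The free-stream numbers, the value of the Spalart-Allmaras working variable, and the helper names are all assumptions made up for this example, not values from your case.

```python
import math

GAMMA = 1.4  # ratio of specific heats (assumed ideal-gas air)

def conservative_state(rho, v, p, nu_tilde):
    """1D conservative variables (rho, rho*v, rho*E, rho*nu_tilde) in SI units."""
    rho_E = p / (GAMMA - 1.0) + 0.5 * rho * v * v
    return {"rho": rho, "rho_v": rho * v, "rho_E": rho_E, "rho_nu": rho * nu_tilde}

def nondimensionalize(q, rho_ref, u_ref, nu_ref):
    """Divide each component by a reference scale built from rho_ref, u_ref, nu_ref."""
    return {
        "rho": q["rho"] / rho_ref,
        "rho_v": q["rho_v"] / (rho_ref * u_ref),
        "rho_E": q["rho_E"] / (rho_ref * u_ref**2),
        "rho_nu": q["rho_nu"] / (rho_ref * nu_ref),
    }

def magnitude_spread(q):
    """Ratio of the largest to the smallest component magnitude."""
    mags = [abs(x) for x in q.values()]
    return max(mags) / min(mags)

# Hypothetical free-stream state at Ma = 0.4 (illustrative values only).
rho_inf = 1.2                                 # kg/m^3
p_inf = 101325.0                              # Pa
c_inf = math.sqrt(GAMMA * p_inf / rho_inf)    # speed of sound, m/s
v_inf = 0.4 * c_inf                           # m/s
nu_tilde_inf = 5.0e-5                         # m^2/s, made-up SA working variable
nu_ref = 1.5e-5                               # m^2/s, roughly molecular viscosity of air

q_si = conservative_state(rho_inf, v_inf, p_inf, nu_tilde_inf)
q_nd = nondimensionalize(q_si, rho_inf, c_inf, nu_ref)

print("spread in SI units:       %.2e" % magnitude_spread(q_si))
print("spread, nondimensional:   %.2e" % magnitude_spread(q_nd))
```

With these reference scales every component ends up of order one, so the nonlinear residual norm that PTC looks at is no longer dominated by the energy equation.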
As you likely know, conservative variables are an ill-conditioned basis for representing a fluid state at low Mach number. Ma=0.4 is high enough that it's okay, but primitive variables are still better conditioned. There are primitive variable formulations that change the evolution equations and those that use a chain rule so that the conservation statement stays the same:
(∂q/∂u) (∂u/∂t) = ...
where (∂q/∂u) is the matrix of partial derivatives of the conservative variables q with respect to the primitive variables u. This is the approach of Hughes (and Shakib, Hauke, Jansen, etc.).
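For concreteness, here is a small sketch of that matrix for a 1D ideal gas with pressure-primitive variables u = (p, v, T) and conservative variables q = (ρ, ρv, ρE). The gas constants, the sample state, and the function names are assumptions chosen for the example; the finite-difference routine is only there as a check on the analytic entries.

```python
R_GAS = 287.0                 # J/(kg K), assumed ideal-gas air
GAMMA = 1.4
C_V = R_GAS / (GAMMA - 1.0)   # specific heat at constant volume

def conservative_from_primitive(u):
    """Map u = (p, v, T) to q = (rho, rho*v, rho*E) for an ideal gas."""
    p, v, T = u
    rho = p / (R_GAS * T)
    E = C_V * T + 0.5 * v * v
    return [rho, rho * v, rho * E]

def dq_du(u):
    """Analytic Jacobian dq/du; rows are q components, columns are (p, v, T)."""
    p, v, T = u
    rho = p / (R_GAS * T)
    E = C_V * T + 0.5 * v * v
    drho_dp = 1.0 / (R_GAS * T)
    drho_dT = -rho / T
    return [
        [drho_dp,     0.0,     drho_dT],
        [drho_dp * v, rho,     drho_dT * v],
        [drho_dp * E, rho * v, drho_dT * 0.5 * v * v],
    ]

def dq_du_fd(u, rel_step=1e-5):
    """Central finite-difference approximation of dq/du (used only as a check)."""
    cols = []
    for j in range(3):
        h = rel_step * abs(u[j])
        up, um = list(u), list(u)
        up[j] += h
        um[j] -= h
        qp = conservative_from_primitive(up)
        qm = conservative_from_primitive(um)
        cols.append([(qp[i] - qm[i]) / (2.0 * h) for i in range(3)])
    # cols[j][i] holds dq_i/du_j; transpose so rows index q components.
    return [[cols[j][i] for j in range(3)] for i in range(3)]

u0 = [101325.0, 137.5, 294.2]   # hypothetical state: p [Pa], v [m/s], T [K]
A = dq_du(u0)
F = dq_du_fd(u0)
for row in A:
    print(["%.4e" % x for x in row])
```

In a chain-rule formulation this matrix multiplies ∂u/∂t pointwise, so the conservative residual is unchanged while the unknowns being solved for are the primitive ones.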
Note that low-Mach preconditioners often solve an auxiliary problem in the pressure space, which is somewhat easier if pressure is a state variable. Entropy variables are another option (see Hauke & Hughes 1998).
Anyway, changes of variables like these cannot be accomplished by diagonal scaling. A numerical block-diagonal scaling could be implemented, but it would only change the linear problem. You can examine the conditioning of a typical diagonal block in your matrix to quantify the effect of a change of variables.
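One way to examine the conditioning of a diagonal block is sketched below in plain Python. The 3x3 block is a made-up example with entries spanning several orders of magnitude (it is not extracted from a real Jacobian), and the helper names are hypothetical. The sketch compares the 1-norm condition number of the raw block with that of a symmetrically diagonally scaled block, similar in spirit to what `-ksp_diagonal_scale` does to the whole linear system.

```python
import math

def inverse(A):
    """Invert a small dense matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [[float(x) for x in row] + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        if M[piv][col] == 0.0:
            raise ValueError("singular block")
        M[col], M[piv] = M[piv], M[col]
        p = M[col][col]
        M[col] = [x / p for x in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [row[n:] for row in M]

def norm1(A):
    """Matrix 1-norm: maximum absolute column sum."""
    return max(sum(abs(A[i][j]) for i in range(len(A))) for j in range(len(A[0])))

def cond1(A):
    """Condition number in the 1-norm."""
    return norm1(A) * norm1(inverse(A))

def sym_diag_scale(A):
    """Return D^{-1/2} A D^{-1/2} with D = |diag(A)|."""
    s = [1.0 / math.sqrt(abs(A[i][i])) for i in range(len(A))]
    return [[s[i] * A[i][j] * s[j] for j in range(len(A))] for i in range(len(A))]

# Hypothetical diagonal block (illustrative numbers only).
block = [
    [1.18e-5, 0.0,   -4.08e-3],
    [1.63e-3, 1.2,   -0.561],
    [2.61,    165.0, -38.6],
]
scaled = sym_diag_scale(block)

print("cond_1(raw block):    %.3e" % cond1(block))
print("cond_1(scaled block): %.3e" % cond1(scaled))
```

Running the same check on blocks assembled in different variable sets gives a cheap, local estimate of how much a change of variables would help before committing to reformulating the solver.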
If you're curious, see Giraldo, Restelli, and Läuter 2010 for several choices of variables and for ways of modifying the evolution equations to avoid the (∂q/∂u) matrix in exchange for changing the conservation statement (in the context of non-hydrostatic atmospheric modeling; the principles translate even if what you value in a solution is different for channel flow).
Let us know how it goes.
Nathan Langlet <nathan.langlet at onera.fr> writes:
> Hello!
>
> I am developing a conservative-variable steady-state compressible RANS
> (Spalart-Allmaras) solver in FEniCSx with SUPG stabilisation. My aim is
> to solve 3D flows at Mach numbers of approximately 0.4 and Re~1*10^6
> in a duct. My boundary conditions are imposed through fluxes (no strong
> Dirichlet conditions).
>
> At the moment, I am using a pseudo-transient continuation (PTC)
> algorithm where the Jacobian contains a time term but the residual is
> stationary. I am using a SNES solver with FGMRES preconditioned by ASM
> + ILU(2) (rtol = 0.1, restart = 300, modified Gram-Schmidt, RCM ordering,
> overlap 2, restricted ASM). I solve 1 Newton iteration, then modify
> the CFL ... However, as my CFL increases (before and close to quadratic
> convergence), I observe a marked deterioration in performance, and the
> solver requires >200 iterations to reach the FGMRES rtol of 0.1. In
> some complex cases (S-shaped duct with flow separation), even 600
> iterations are insufficient.
>
> I have also noticed that some Newton iterations strangely take an
> extremely long time to complete, despite a constant or similar number of
> ILU iterations, and the only way I have found to fix this is to enable
> diagonal scaling (and the fix). In this case, the time taken to solve
> the problem is ‘logical’, and only increases as the number of iterations
> in the GMRES increases. You will find an example of this phenomenon
> attached (residual.txt). I’ve checked the time to compute the Jacobian
> and the residual, and that’s not where the problem lies.
>
> Do you think that non-dimensionalizing my variables could help GMRES
> converge more effectively? Because, in fact, there are significant
> differences in the order of magnitude of the L2 residuals for each
> equation. Is using diagonal scaling sufficient?
>
> I would like to be able to optimise this solver to limit the number of
> iterations required for the FGMRES to converge; are there any PETSc
> options I haven’t considered? I have also noticed that FUN3D, for
> example, solves the equations in primitive variables (rho,v,T,nu_tilde);
> could this improve the convergence of the system?
>
> Thank you for any help or ideas!
>
>
> --
> *Nathan Langlet*
> PhD student
> Département aérodynamique, aéroélasticité, acoustique
> MASH
> Tel: +33 1 46 23 51 61
>
>
> ONERA - The French Aerospace Lab - Centre de Meudon
> 8, rue des Vertugadins - 92190 MEUDON
>
> Iteration 11 | L2 residual: 3.056e-02 CFL: 1.16
> Relative L2 residuals: ‖ρ‖₂ = 3.827e-02, ‖ρv‖₂ = 2.127e-02, ‖ρE‖₂ = 3.056e-02, ‖ρν‖₂ = 8.434e-03
> Absolute L2 residuals: ‖ρ‖₂ = 1.252e-04, ‖ρv‖₂ = 1.486e-01, ‖ρE‖₂ = 3.815e+01, ‖ρν‖₂ = 1.265e-07
> J assembly time: 8.5449378490448
> Linear ns_ solve converged due to CONVERGED_RTOL iterations 4
> elapsed time (solve): 20.20 s
> ------------------------------------------------------------
> Iteration 12 | L2 residual: 2.707e-02 CFL: 1.46
> Relative L2 residuals: ‖ρ‖₂ = 3.389e-02, ‖ρv‖₂ = 1.820e-02, ‖ρE‖₂ = 2.707e-02, ‖ρν‖₂ = 7.252e-03
> Absolute L2 residuals: ‖ρ‖₂ = 1.109e-04, ‖ρv‖₂ = 1.271e-01, ‖ρE‖₂ = 3.379e+01, ‖ρν‖₂ = 1.087e-07
> J assembly time: 8.567534923553467
> Linear ns_ solve converged due to CONVERGED_RTOL iterations 4
> elapsed time (solve): 20.33 s
> ------------------------------------------------------------
> Iteration 13 | L2 residual: 2.398e-02 CFL: 1.82
> Relative L2 residuals: ‖ρ‖₂ = 3.000e-02, ‖ρv‖₂ = 1.567e-02, ‖ρE‖₂ = 2.398e-02, ‖ρν‖₂ = 6.346e-03
> Absolute L2 residuals: ‖ρ‖₂ = 9.815e-05, ‖ρv‖₂ = 1.095e-01, ‖ρE‖₂ = 2.993e+01, ‖ρν‖₂ = 9.516e-08
> J assembly time: 8.53394341468811
> Linear ns_ solve converged due to CONVERGED_RTOL iterations 4
> elapsed time (solve): 45.62 s
> ------------------------------------------------------------
> Iteration 14 | L2 residual: 2.122e-02 CFL: 2.27
> Relative L2 residuals: ‖ρ‖₂ = 2.652e-02, ‖ρv‖₂ = 1.357e-02, ‖ρE‖₂ = 2.122e-02, ‖ρν‖₂ = 5.785e-03
> Absolute L2 residuals: ‖ρ‖₂ = 8.677e-05, ‖ρv‖₂ = 9.483e-02, ‖ρE‖₂ = 2.649e+01, ‖ρν‖₂ = 8.674e-08
> J assembly time: 8.560238122940063
> Linear ns_ solve converged due to CONVERGED_RTOL iterations 5
> elapsed time (solve): 38.83 s
> ------------------------------------------------------------
> Iteration 15 | L2 residual: 1.864e-02 CFL: 2.84
> Relative L2 residuals: ‖ρ‖₂ = 2.328e-02, ‖ρv‖₂ = 1.181e-02, ‖ρE‖₂ = 1.864e-02, ‖ρν‖₂ = 5.521e-03
> Absolute L2 residuals: ‖ρ‖₂ = 7.616e-05, ‖ρv‖₂ = 8.247e-02, ‖ρE‖₂ = 2.326e+01, ‖ρν‖₂ = 8.280e-08
> J assembly time: 8.55243706703186
> Linear ns_ solve converged due to CONVERGED_RTOL iterations 5
> elapsed time (solve): 34.65 s
> ------------------------------------------------------------
> Iteration 16 | L2 residual: 1.626e-02 CFL: 3.55
> Relative L2 residuals: ‖ρ‖₂ = 2.030e-02, ‖ρv‖₂ = 1.030e-02, ‖ρE‖₂ = 1.626e-02, ‖ρν‖₂ = 5.436e-03
> Absolute L2 residuals: ‖ρ‖₂ = 6.643e-05, ‖ρv‖₂ = 7.196e-02, ‖ρE‖₂ = 2.030e+01, ‖ρν‖₂ = 8.152e-08
> J assembly time: 8.529710054397583
> Linear ns_ solve converged due to CONVERGED_RTOL iterations 6
> elapsed time (solve): 53.96 s