[petsc-users] Overcoming slow convergence with GMRES+Hypre BoomerAMG

Matthew Knepley knepley at gmail.com
Tue Mar 21 09:37:56 CDT 2023


On Tue, Mar 21, 2023 at 10:28 AM Christopher, Joshua <jchristopher at anl.gov>
wrote:

> Hi Matt,
>
> Sorry for the unclear explanation. My layout is like this:
>
> Proc 0: rows 0-499 and rows 1000-1499
> Proc 1: rows 500-999 and rows 1500-1999
>

That is not a possible layout in PETSc. This is the source of the
misunderstanding: in PETSc, the rows owned by each process are always a single
contiguous block.

  Thanks,

     Matt


> I have two unknowns, rho and phi; each corresponds to a contiguous chunk of
> rows:
>
> Phi: Rows 0-999
> Rho: Rows 1000-1999
>
> My source data (an OpenFOAM matrix) has the unknowns row-contiguous, which
> is why my layout is like this. My understanding is that my IS are set up
> correctly to match this matrix structure, which is why I am uncertain why I
> am getting the error message. I attached the output of my IS in my previous
> message.
>
> Thank you,
> Joshua
> ------------------------------
> *From:* Matthew Knepley <knepley at gmail.com>
> *Sent:* Monday, March 20, 2023 6:16 PM
> *To:* Christopher, Joshua <jchristopher at anl.gov>
> *Cc:* Barry Smith <bsmith at petsc.dev>; petsc-users at mcs.anl.gov <
> petsc-users at mcs.anl.gov>
> *Subject:* Re: [petsc-users] Overcoming slow convergence with GMRES+Hypre
> BoomerAMG
>
> On Mon, Mar 20, 2023 at 6:45 PM Christopher, Joshua via petsc-users <
> petsc-users at mcs.anl.gov> wrote:
>
> Hi Barry and Mark,
>
> Thank you for your responses. I implemented the index sets in my
> application and it appears to work in serial. Unfortunately I am having
> some trouble running in parallel. The error I am getting is:
> [1]PETSC ERROR: Petsc has generated inconsistent data
> [1]PETSC ERROR: Number of entries found in complement 1000 does not match
> expected 500
> [1]PETSC ERROR: #1 ISComplement() at
> petsc-3.16.5/src/vec/is/is/utils/iscoloring.c:837
> [1]PETSC ERROR: #2 PCSetUp_FieldSplit() at
> petsc-3.16.5/src/ksp/pc/impls/fieldsplit/fieldsplit.c:882
> [1]PETSC ERROR: #3 PCSetUp() at
> petsc-3.16.5/src/ksp/pc/interface/precon.c:1017
> [1]PETSC ERROR: #4 KSPSetUp() at
> petsc-3.16.5/src/ksp/ksp/interface/itfunc.c:408
> [1]PETSC ERROR: #5 KSPSolve_Private() at
> petsc-3.16.5/src/ksp/ksp/interface/itfunc.c:852
> [1]PETSC ERROR: #6 KSPSolve() at
> petsc-3.16.5/src/ksp/ksp/interface/itfunc.c:1086
> [1]PETSC ERROR: #7 solvePetsc() at coupled/coupledSolver.C:612
>
> I am testing with two processors and a 2000x2000 matrix. I have two
> fields, phi and rho. The matrix has rows 0-999 for phi and rows 1000-1999
> for rho. Proc 0 has rows 0-499 and 1000-1499, while proc 1 has rows 500-999
> and 1500-1999. I've attached the ASCII printout of the IS for phi and rho.
> Am I right in thinking that I have some issue with my IS layouts?
>
>
> I do not understand your explanation. Your matrix is 2000x2000, and I
> assume it is split so that
>
>   proc 0 has rows 0       --   999
>   proc 1 has rows 1000 -- 1999
>
> Now, when you call PCFieldSplitSetIS(), each process gives an IS which
> indicates the dofs _owned by that process_ that contribute to field k. If you
> do not give unknowns within the global row bounds for that process, the
> ISComplement() call will not work.
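>
> As a minimal sketch of that (illustration only, inside code where the Mat A
> and the PCFIELDSPLIT PC pc already exist, and assuming phi occupies global
> rows 0-999 and rho global rows 1000-1999), each rank can intersect its
> contiguous ownership range with each field's block:
>
>     PetscInt rstart, rend, phiEnd = 1000, phiN, rhoLo, rhoN;
>     IS       isPhi, isRho;
>
>     /* error checking omitted for brevity */
>     MatGetOwnershipRange(A, &rstart, &rend);  /* this rank owns rows [rstart, rend) */
>     phiN  = PetscMax(PetscMin(rend, phiEnd) - rstart, 0); /* owned rows in the phi block */
>     ISCreateStride(PETSC_COMM_WORLD, phiN, rstart, 1, &isPhi);
>     rhoLo = PetscMax(rstart, phiEnd);                      /* owned rows in the rho block */
>     rhoN  = PetscMax(rend - rhoLo, 0);
>     ISCreateStride(PETSC_COMM_WORLD, rhoN, rhoLo, 1, &isRho);
>     PCFieldSplitSetIS(pc, "phi", isPhi);
>     PCFieldSplitSetIS(pc, "rho", isRho);
>     ISDestroy(&isPhi);
>     ISDestroy(&isRho);
>
> With the contiguous layout above, rank 0 would then contribute phi rows
> 0-999 and an empty rho IS, while rank 1 contributes an empty phi IS and rho
> rows 1000-1999.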
>
> Of course, we should check that the entries are not out of bounds when
> they are submitted. If you want to add that check, it would be a cool submission.
>
>    Thanks,
>
>       Matt
>
>
> Thank you,
> Joshua
>
>
> ------------------------------
> *From:* Barry Smith <bsmith at petsc.dev>
> *Sent:* Friday, March 17, 2023 1:22 PM
> *To:* Christopher, Joshua <jchristopher at anl.gov>
> *Cc:* petsc-users at mcs.anl.gov <petsc-users at mcs.anl.gov>
> *Subject:* Re: [petsc-users] Overcoming slow convergence with GMRES+Hypre
> BoomerAMG
>
>
>
> On Mar 17, 2023, at 1:26 PM, Christopher, Joshua <jchristopher at anl.gov>
> wrote:
>
> Hi Barry,
>
> Thank you for your response. I'm a little confused about the relation
> between the IS integer values and matrix indices. From
> https://petsc.org/release/src/snes/tutorials/ex70.c.html it looks like my
> IS should just contain a list of the rows for each split? For example, if I
> have a 100x100 matrix with two fields, "rho" and "phi", the first 50 rows
> correspond to the "rho" variable and the last 50 correspond to the "phi"
> variable. So I should call PCFieldSplitSetIS twice, the first with an IS
> containing integers 0-49 and the second with integers 50-99?
> PCFieldSplitSetIS is expecting global row numbers, correct?
>
>
>   As Mark said, yes this sounds fine.
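>
>   For that 100x100 serial case, a minimal sketch (illustration only, with pc
> being the PCFIELDSPLIT preconditioner) would be:
>
>     IS isRho, isPhi;
>     ISCreateStride(PETSC_COMM_WORLD, 50, 0,  1, &isRho); /* global rows 0-49  -> rho */
>     ISCreateStride(PETSC_COMM_WORLD, 50, 50, 1, &isPhi); /* global rows 50-99 -> phi */
>     PCFieldSplitSetIS(pc, "rho", isRho);
>     PCFieldSplitSetIS(pc, "phi", isPhi);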
>
>
> My matrix is organized as one block after another.
>
>
>    When you are running in parallel with MPI, how will you organize the
> unknowns? Will you have 25 of the rho unknowns followed by 25 of the phi
> unknowns on each MPI process? You will need to take this into account when
> you build the IS on each MPI process.
>
>   Barry
>
>
>
> Thank you,
> Joshua
> ------------------------------
> *From:* Barry Smith <bsmith at petsc.dev>
> *Sent:* Tuesday, March 14, 2023 1:35 PM
> *To:* Christopher, Joshua <jchristopher at anl.gov>
> *Cc:* petsc-users at mcs.anl.gov <petsc-users at mcs.anl.gov>
> *Subject:* Re: [petsc-users] Overcoming slow convergence with GMRES+Hypre
> BoomerAMG
>
>
>   You definitely do not need to use a complicated DM to take advantage of
> PCFIELDSPLIT. All you need to do is create two IS on each MPI process. The
> first should list all the indices of the degrees of freedom of your first
> type of variable and the second should list all the rest of the degrees of
> freedom. Then use
> https://petsc.org/release/docs/manualpages/PC/PCFieldSplitSetIS/
>
>   Barry
>
> Note: PCFIELDSPLIT does not care how you have ordered your degrees of
> freedom of the two types. You might interlace them, or put all of the first
> type on each MPI process followed by all of the second type. This just
> determines what your IS look like.
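>
> For instance (a sketch only, assuming the interlaced case where the two
> unknowns alternate row by row and each MPI process owns an even number of
> contiguous rows), the IS could be built as:
>
>     PetscInt rstart, rend, nlocal;
>     IS       isFirst, isSecond;
>
>     MatGetOwnershipRange(A, &rstart, &rend);
>     nlocal = (rend - rstart) / 2;
>     ISCreateStride(PETSC_COMM_WORLD, nlocal, rstart,     2, &isFirst);  /* rows rstart, rstart+2, ... */
>     ISCreateStride(PETSC_COMM_WORLD, nlocal, rstart + 1, 2, &isSecond); /* rows rstart+1, rstart+3, ... */
>     PCFieldSplitSetIS(pc, "0", isFirst);
>     PCFieldSplitSetIS(pc, "1", isSecond);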
>
>
>
> On Mar 14, 2023, at 1:14 PM, Christopher, Joshua via petsc-users <
> petsc-users at mcs.anl.gov> wrote:
>
> Hello PETSc users,
>
> I haven't heard back from the library developer regarding the numbering
> issue or my questions on using field split operators with their library, so
> I need to fix this myself.
>
> Regarding the natural numbering vs parallel numbering: I haven't figured
> out what is wrong here. I stepped through in parallel and it looks like
> each processor is setting up the matrix and calling MatSetValue similar to
> what is shown in
> https://petsc.org/release/src/ksp/ksp/tutorials/ex2.c.html. I see that
> PETSc is recognizing my simple two-processor test from the output
> ("PetscInitialize_Common(): PETSc successfully started: number of
> processors = 2"). I'll keep poking at this, however I'm very new to PETSc.
> When I print the matrix to ASCII using PETSC_VIEWER_DEFAULT, I'm guessing I
> see one row per line, and the tuples consists of the column number and
> value?
>
> On the FieldSplit preconditioner, is my understanding here correct:
>
> To use FieldSplit, I must have a DM. Since I have an unstructured mesh, I
> must use DMPlex and set up the chart and covering relations specific to my
> mesh, following the manual here: https://petsc.org/release/docs/manual/dmplex/.
> I think this may be very time-consuming for me to set up.
>
> Currently, I already have a matrix stored in a parallel sparse L-D-U
> format. I am converting it into PETSc's sparse parallel AIJ matrix
> (traversing my matrix and using MatSetValues). The weights for my
> discretization scheme are already accounted for in the coefficients of my
> L-D-U matrix. I do have the submatrices in L-D-U format for each of my two
> equations' coupling with each other, that is, the equivalent of lines
> 242, 251-252, and 254 of example 28:
> https://petsc.org/release/src/snes/tutorials/ex28.c.html. Could I directly
> convert my submatrices into PETSc sub-matrices here, then assemble them
> together so that the field split preconditioner will work?
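>
> (Roughly, the conversion loop I have in mind looks like the sketch below;
> nLocalRows, maxNNZ, and the ldu_nnz/ldu_cols/ldu_vals accessors are
> placeholders for however my existing L-D-U storage exposes one global row at
> a time, with global column indices.)
>
>     Mat      A;
>     PetscInt i, rstart, rend;
>
>     MatCreate(PETSC_COMM_WORLD, &A);
>     MatSetSizes(A, nLocalRows, nLocalRows, PETSC_DETERMINE, PETSC_DETERMINE);
>     MatSetType(A, MATAIJ);
>     MatSeqAIJSetPreallocation(A, maxNNZ, NULL);
>     MatMPIAIJSetPreallocation(A, maxNNZ, NULL, maxNNZ, NULL);
>     MatGetOwnershipRange(A, &rstart, &rend);
>     for (i = rstart; i < rend; i++) {
>       /* insert one row of the existing L-D-U matrix at a time */
>       MatSetValues(A, 1, &i, ldu_nnz(i), ldu_cols(i), ldu_vals(i), INSERT_VALUES);
>     }
>     MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);
>     MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);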
>
> Alternatively, since my L-D-U matrices already account for the
> discretization scheme, can I use a simple structured grid DM?
>
> Thank you so much for your help!
> Regards,
> Joshua
> ------------------------------
> *From:* Pierre Jolivet <pierre at joliv.et>
> *Sent:* Friday, March 3, 2023 11:45 AM
> *To:* Christopher, Joshua <jchristopher at anl.gov>
> *Cc:* petsc-users at mcs.anl.gov <petsc-users at mcs.anl.gov>
> *Subject:* Re: [petsc-users] Overcoming slow convergence with GMRES+Hypre
> BoomerAMG
>
> For full disclosure, with -ksp_pc_side right -ksp_max_it 100 -ksp_rtol
> 1E-10:
> 1) with renumbering via ParMETIS
> -pc_type bjacobi -sub_pc_type lu -sub_pc_factor_mat_solver_type mumps
> => Linear solve converged due to CONVERGED_RTOL iterations 10
> -pc_type hypre -pc_hypre_boomeramg_relax_type_down l1-Gauss-Seidel
> -pc_hypre_boomeramg_relax_type_up backward-l1-Gauss-Seidel => Linear solve
> converged due to CONVERGED_RTOL iterations 55
> 2) without renumbering via ParMETIS
> -pc_type bjacobi => Linear solve did not converge due to DIVERGED_ITS
> iterations 100
> -pc_type hypre => Linear solve did not converge due to DIVERGED_ITS
> iterations 100
> Using an outer fieldsplit may help fix this.
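>
> (For reference, a sketch of one way such a renumbering can be done on the
> PETSc side through the MatPartitioning interface; this is only an
> illustration, not necessarily what was run here.)
>
>     MatPartitioning part;
>     IS              newRanks, newRows;
>     Mat             Aperm;
>
>     MatPartitioningCreate(PETSC_COMM_WORLD, &part);
>     MatPartitioningSetAdjacency(part, A);
>     MatPartitioningSetType(part, MATPARTITIONINGPARMETIS);
>     MatPartitioningApply(part, &newRanks);     /* new owner rank for each locally owned row */
>     ISBuildTwoSided(newRanks, NULL, &newRows); /* global rows this rank will own after renumbering */
>     MatCreateSubMatrix(A, newRows, newRows, MAT_INITIAL_MATRIX, &Aperm);
>     /* the right-hand side and solution vectors must be permuted consistently,
>        e.g. with a VecScatter built from newRows */
>     MatPartitioningDestroy(&part);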
>
> Thanks,
> Pierre
>
> On 3 Mar 2023, at 6:24 PM, Christopher, Joshua via petsc-users <
> petsc-users at mcs.anl.gov> wrote:
>
> I am solving these equations in the context of electrically-driven fluid
> flows as that first paper describes. I am using a PIMPLE scheme to advance
> the fluid equations in time, and my goal is to do a coupled solve of the
> electric equations similar to what is described in this paper:
> https://www.sciencedirect.com/science/article/pii/S0045793019302427. They
> are using the SIMPLE scheme in this paper. My fluid flow should eventually
> reach steady behavior, and likewise the time derivative in the charge
> density should trend towards zero. They preferred using BiCGStab with a
> direct LU preconditioner for solving their electric equations. I tried to
> test that combination, but my case is halting for unknown reasons in the
> middle of the PETSc solve. I'll try with more nodes and see if I am running
> out of memory, but the computer is a little overloaded at the moment so it
> may take a while to run.
>
> I sent Pierre Jolivet my matrix and RHS, and they said the matrix does not
> appear to follow a parallel numbering, and instead looks like it uses a
> natural numbering. When they renumbered the system with ParMETIS
> they got really fast convergence. I am using PETSc through a library, so I
> will reach out to the library authors and see if there is an issue in the
> library.
>
> Thank you,
> Joshua
> ------------------------------
> *From:* Barry Smith <bsmith at petsc.dev>
> *Sent:* Thursday, March 2, 2023 3:47 PM
> *To:* Christopher, Joshua <jchristopher at anl.gov>
> *Cc:* petsc-users at mcs.anl.gov <petsc-users at mcs.anl.gov>
> *Subject:* Re: [petsc-users] Overcoming slow convergence with GMRES+Hypre
> BoomerAMG
>
>
>
>
> <Untitled.png>
>
>   Are you solving this as a time-dependent problem? Using an implicit
> scheme (like backward Euler) for rho? In ODE language, are you solving the
> differential-algebraic equation?
>
> Is epsilon bounded away from 0?
>
> On Mar 2, 2023, at 4:22 PM, Christopher, Joshua <jchristopher at anl.gov>
> wrote:
>
> Hi Barry and Mark,
>
> Thank you for looking into my problem. The two equations I am solving with
> PETSc are equations 6 and 7 from this paper:
> https://ris.utwente.nl/ws/portalfiles/portal/5676495/Roghair+Paper_final_draft_v1.pdf
>
> I just used MUMPS and SuperLU_DIST on my full-size problem (with 3,000,000
> unknowns). To clarify, I did a direct solve with -ksp_type preonly. They
> take a very long time: about 30 minutes for MUMPS and 18 minutes for
> SuperLU_DIST; see the attached output. For reference, the same matrix took 658
> iterations of BoomerAMG and about 20 seconds of walltime. Maybe I am
> already getting a great deal with BoomerAMG!
>
> I'll try removing some terms from my solve (e.g. removing the second
> equation, then making the second equation just the elliptic portion of the
> equation, etc.) and try with a simpler geometry. I'll keep you updated as I
> run into trouble with that route. I wasn't aware of field split
> preconditioners; I'll do some reading on them and give them a try as well.
>
> Thank you again,
> Joshua
> ------------------------------
>
> *From:* Barry Smith <bsmith at petsc.dev>
> *Sent:* Thursday, March 2, 2023 7:47 AM
> *To:* Christopher, Joshua <jchristopher at anl.gov>
> *Cc:* petsc-users at mcs.anl.gov <petsc-users at mcs.anl.gov>
> *Subject:* Re: [petsc-users] Overcoming slow convergence with GMRES+Hypre
> BoomerAMG
>
>
>   Have you tried MUMPS (or SuperLU_DIST) on the full-size problem with the
> 5,000,000 unknowns? It is at the high end of problem sizes you can do with
> direct solvers but is worth comparing with BoomerAMG. You likely want to
> use more nodes and fewer cores per node with MUMPS to be able to access
> more memory. If you need to solve multiple right-hand sides with the same
> matrix, the factors will be reused, making the second and later solves much
> faster.
>
>   I agree with Mark, with iterative solvers you are likely to end up with
> PCFIELDSPLIT.
>
>   Barry
>
>
> On Mar 1, 2023, at 7:17 PM, Christopher, Joshua via petsc-users <
> petsc-users at mcs.anl.gov> wrote:
>
> Hello,
>
> I am trying to solve the leaky-dielectric model equations with PETSc using
> the finite volume method with a second-order discretization scheme (limited
> to first order as needed). The leaky-dielectric model is a coupled system of
> two equations, consisting of a Poisson equation and a convection-diffusion
> equation. I have tested on small problems with simple geometry (~1000 DoFs)
> using:
>
> -ksp_type gmres
> -pc_type hypre
> -pc_hypre_type boomeramg
>
> and I get RTOL convergence to 1.e-5 in about 4 iterations. I tested this
> in parallel with 2 cores, but previously was also able to successfully use
> a direct solver in serial to solve this problem. When I scale up to my
> production problem, I get significantly worse convergence. My production
> problem has ~3 million DoFs, more complex geometry, and is solved on ~100
> cores across two nodes. The boundary conditions change a little because of
> the geometry, but are of the same classifications (e.g. only Dirichlet and
> Neumann). On the production case, I need 600-4000 iterations to
> converge. I've attached the output from the first solve that took 658
> iterations to converge, using the following output options:
>
> -ksp_view_pre
> -ksp_view
> -ksp_converged_reason
> -ksp_monitor_true_residual
> -ksp_test_null_space
>
> My matrix is non-symmetric, the condition number can be around 10e6, and
> the eigenvalues reported by PETSc have been real and positive (using
> -ksp_view_eigenvalues).
>
> I have tried using other preconditioners (superlu, mumps, gamg, mg) but
> hypre+boomeramg has performed the best so far. The literature seems to
> indicate that AMG is the best approach for solving these equations in a
> coupled fashion.
>
> Do you have any advice on speeding up the convergence of this system?
>
> Thank you,
> Joshua
> <petsc_gmres_boomeramg.txt>
>
>
> <petsc_preonly_mumps.txt><petsc_preonly_superlu.txt>
>
>
>
>


-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/