[petsc-users] Overcoming slow convergence with GMRES+Hypre BoomerAMG
Christopher, Joshua
jchristopher at anl.gov
Tue Mar 21 09:28:29 CDT 2023
Hi Matt,
Sorry for the unclear explanation. My layout is like this:
Proc 0: Rows 0-499 and rows 1000-1499
Proc 1: Rows 500-999 and rows 1500-1999
I have two unknowns, rho and phi, each corresponding to a contiguous chunk of rows:
Phi: Rows 0-999
Rho: Rows 1000-1999
My source data (an OpenFOAM matrix) has the unknowns row-contiguous, which is why my layout is like this. My understanding is that my IS are set up correctly to match this matrix structure, so I am unsure why I am getting the error message. I attached the output of my IS in my previous message.
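One possible way to reconcile a field-contiguous source numbering with a layout in which each process owns a single contiguous range of global rows is to renumber the unknowns. The sketch below is only an illustration of that index arithmetic in plain Python; the function name and arguments are hypothetical, it assumes every field has the same number of rows and that the rows divide evenly among the processes, and it makes no PETSc calls.

```python
def process_contiguous_numbering(n_per_field, n_fields, n_procs):
    """Map field-contiguous rows (all of field 0, then all of field 1, ...)
    to a numbering where each process owns one contiguous range that holds
    its share of every field, one field after another."""
    # Assumption for this sketch: equal shares on every process.
    assert n_per_field % n_procs == 0
    share = n_per_field // n_procs        # rows of one field per process
    rows_per_proc = share * n_fields      # total rows owned by each process

    old_to_new = {}
    # field_rows[field][proc] = new global rows of that field on that process
    field_rows = {f: {p: [] for p in range(n_procs)} for f in range(n_fields)}
    for field in range(n_fields):
        for k in range(n_per_field):
            proc, local = divmod(k, share)
            old = field * n_per_field + k
            new = proc * rows_per_proc + field * share + local
            old_to_new[old] = new
            field_rows[field][proc].append(new)
    return old_to_new, field_rows


# The case from this thread: 2 fields (phi = 0, rho = 1) of 1000 rows, 2 processes.
old_to_new, field_rows = process_contiguous_numbering(1000, 2, 2)
print(old_to_new[1000], old_to_new[500])  # first rho row, first phi row of proc 1
```

With these inputs the first rho row (old row 1000) becomes row 500 and old phi row 500 becomes row 1000, so process 0 holds new rows 0-999 and process 1 holds new rows 1000-1999. The per-process lists in `field_rows` are then the kind of index lists each process could hand to PCFieldSplitSetIS.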
Thank you,
Joshua
________________________________
From: Matthew Knepley <knepley at gmail.com>
Sent: Monday, March 20, 2023 6:16 PM
To: Christopher, Joshua <jchristopher at anl.gov>
Cc: Barry Smith <bsmith at petsc.dev>; petsc-users at mcs.anl.gov <petsc-users at mcs.anl.gov>
Subject: Re: [petsc-users] Overcoming slow convergence with GMRES+Hypre BoomerAMG
On Mon, Mar 20, 2023 at 6:45 PM Christopher, Joshua via petsc-users <petsc-users at mcs.anl.gov> wrote:
Hi Barry and Mark,
Thank you for your responses. I implemented the index sets in my application and it appears to work in serial. Unfortunately I am having some trouble running in parallel. The error I am getting is:
[1]PETSC ERROR: Petsc has generated inconsistent data
[1]PETSC ERROR: Number of entries found in complement 1000 does not match expected 500
[1]PETSC ERROR: #1 ISComplement() at petsc-3.16.5/src/vec/is/is/utils/iscoloring.c:837
[1]PETSC ERROR: #2 PCSetUp_FieldSplit() at petsc-3.16.5/src/ksp/pc/impls/fieldsplit/fieldsplit.c:882
[1]PETSC ERROR: #3 PCSetUp() at petsc-3.16.5/src/ksp/pc/interface/precon.c:1017
[1]PETSC ERROR: #4 KSPSetUp() at petsc-3.16.5/src/ksp/ksp/interface/itfunc.c:408
[1]PETSC ERROR: #5 KSPSolve_Private() at petsc-3.16.5/src/ksp/ksp/interface/itfunc.c:852
[1]PETSC ERROR: #6 KSPSolve() at petsc-3.16.5/src/ksp/ksp/interface/itfunc.c:1086
[1]PETSC ERROR: #7 solvePetsc() at coupled/coupledSolver.C:612
I am testing with two processors and a 2000x2000 matrix. I have two fields, phi and rho. The matrix has rows 0-999 for phi and rows 1000-1999 for rho. Proc0 has rows 0-499 and 1000-1499 while proc1 has rows 500-999 and 1500-1999. I've attached the ASCII printout of the IS for phi and rho. Am I right in thinking that I have some issue with my IS layouts?
I do not understand your explanation. Your matrix is 2000x2000, and I assume split so that
proc 0 has rows 0 -- 999
proc 1 has rows 1000 -- 1999
Now, when you call PCFieldSplitSetIS(), each process gives an IS which indicates the dofs _owned by that process_ that contribute to field k. If you
do not give unknowns within the global row bounds for that process, the ISComplement() call will not work.
Of course, we should check that the entries are not out of bounds when they are submitted. If you want to do it, it would be a cool submission.
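The numbers in the error message can be reproduced with a small model of the situation. The sketch below is plain Python written for illustration, not PETSc source: `complement_counts` is a simplified, assumed model of counting the owned rows that are not in an index list, and `indices_outside_ownership` is a hypothetical version of the bounds check suggested above. It assumes process 1 owns rows 1000-1999, as described in this reply.

```python
def indices_outside_ownership(indices, rstart, rend):
    """Return the entries that are not in the owned range [rstart, rend)."""
    return [i for i in indices if not (rstart <= i < rend)]


def complement_counts(indices, rstart, rend):
    """Simplified model of a complement over the owned range.

    Returns (found, expected): 'found' counts owned rows that do not appear
    in the index list; 'expected' is the owned size minus the number of
    distinct indices supplied. The two agree only if every supplied index
    lies inside [rstart, rend)."""
    idx = sorted(indices)
    expected = (rend - rstart) - len(set(idx))
    found = 0
    j = 0
    for row in range(rstart, rend):
        if j < len(idx) and idx[j] == row:
            # Skip this row, including any duplicate entries for it.
            while j < len(idx) and idx[j] == row:
                j += 1
        else:
            found += 1
    return found, expected


# Process 1 owns rows 1000-1999 but supplies rows 500-999 for phi.
print(complement_counts(range(500, 1000), 1000, 2000))
print(len(indices_outside_ownership(range(500, 1000), 1000, 2000)))
```

Under these assumptions the first call gives (1000, 500), matching "found in complement 1000 does not match expected 500", and the second reports that all 500 supplied indices are outside the owned range. The same model applied to process 0 with rows 0-499 in the range 0-999 gives (500, 500), which is consistent with the error appearing only on rank 1.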
Thanks,
Matt
Thank you,
Joshua
________________________________
From: Barry Smith <bsmith at petsc.dev>
Sent: Friday, March 17, 2023 1:22 PM
To: Christopher, Joshua <jchristopher at anl.gov>
Cc: petsc-users at mcs.anl.gov <petsc-users at mcs.anl.gov>
Subject: Re: [petsc-users] Overcoming slow convergence with GMRES+Hypre BoomerAMG
On Mar 17, 2023, at 1:26 PM, Christopher, Joshua <jchristopher at anl.gov> wrote:
Hi Barry,
Thank you for your response. I'm a little confused about the relation between the IS integer values and matrix indices. From https://petsc.org/release/src/snes/tutorials/ex70.c.html it looks like my IS should just contain a list of the rows for each split? For example, if I have a 100x100 matrix with two fields, "rho" and "phi", the first 50 rows correspond to the "rho" variable and the last 50 correspond to the "phi" variable. So I should call PCFieldSplitSetIS twice, the first with an IS containing integers 0-49 and the second with integers 50-99? PCFieldSplitSetIS is expecting global row numbers, correct?
As Mark said, yes this sounds fine.
My matrix is organized as one block after another.
When you are running in parallel with MPI, how will you organize the unknowns? Will you have 25 of the rho followed by 25 of phi on each MPI process? You will need to take this into account when you build the IS on each MPI process.
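To make the question concrete, here is a minimal sketch of the index lists each MPI process would build if every process stores its share of rho followed by its share of phi. It is plain Python with hypothetical function names; `ownership_range` assumes the rows are split into contiguous, nearly equal ranges, which may differ from the layout an application sets explicitly.

```python
def ownership_range(n_global, n_procs, rank):
    """Contiguous, nearly equal split of n_global rows (an assumed layout):
    the first (n_global % n_procs) ranks receive one extra row."""
    base, extra = divmod(n_global, n_procs)
    rstart = rank * base + min(rank, extra)
    rend = rstart + base + (1 if rank < extra else 0)
    return rstart, rend


def blocked_field_indices(rstart, rend, n_fields, field):
    """Global rows of one field on a process whose owned range [rstart, rend)
    stores field 0 first, then field 1, and so on, in equal-sized blocks."""
    n_local = rend - rstart
    assert n_local % n_fields == 0  # sketch assumes equal-sized local blocks
    per_field = n_local // n_fields
    first = rstart + field * per_field
    return list(range(first, first + per_field))


# The 100x100 example on 2 processes: field 0 = rho, field 1 = phi.
for rank in range(2):
    rstart, rend = ownership_range(100, 2, rank)
    rho = blocked_field_indices(rstart, rend, 2, 0)
    phi = blocked_field_indices(rstart, rend, 2, 1)
    print(rank, (rho[0], rho[-1]), (phi[0], phi[-1]))
```

In this layout rank 0 lists rho rows 0-24 and phi rows 25-49, while rank 1 lists rho rows 50-74 and phi rows 75-99. Each list contains only rows owned by the process that builds it, and would be passed on that process to ISCreateGeneral() and then PCFieldSplitSetIS().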
Barry
Thank you,
Joshua
________________________________
From: Barry Smith <bsmith at petsc.dev>
Sent: Tuesday, March 14, 2023 1:35 PM
To: Christopher, Joshua <jchristopher at anl.gov>
Cc: petsc-users at mcs.anl.gov <petsc-users at mcs.anl.gov>
Subject: Re: [petsc-users] Overcoming slow convergence with GMRES+Hypre BoomerAMG
You definitely do not need to use a complicated DM to take advantage of PCFIELDSPLIT. All you need to do is create two IS on each MPI process. The first should list all the indices of the degrees of freedom of your first type of variable and the second should list all the rest of the degrees of freedom. Then use https://petsc.org/release/docs/manualpages/PC/PCFieldSplitSetIS/
Barry
Note: PCFIELDSPLIT does not care how you have ordered your degrees of freedom of the two types. You might interlace them or have all the first degree of freedom on an MPI process and then have all the second degree of freedom. This just determines what your IS look like.
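As an illustration of the interlaced ordering mentioned in the note, the sketch below builds the per-process index list for one field when the unknowns alternate (field 0, field 1, field 0, ...) within the owned range. It is plain Python with a hypothetical function name, and it assumes the owned range starts and ends on a multiple of the number of fields.

```python
def interlaced_field_indices(rstart, rend, n_fields, field):
    """Global rows of one field when unknowns are interlaced inside the
    owned range [rstart, rend): row r belongs to field (r % n_fields)."""
    # Assumption for this sketch: the range is aligned to whole points.
    assert rstart % n_fields == 0 and (rend - rstart) % n_fields == 0
    return list(range(rstart + field, rend, n_fields))


# Rank 1 of a 100-row, 2-field problem split evenly over 2 processes.
first_field = interlaced_field_indices(50, 100, 2, 0)
second_field = interlaced_field_indices(50, 100, 2, 1)
print(first_field[:3], second_field[:3])

# Together the two lists cover the owned range exactly once.
assert sorted(first_field + second_field) == list(range(50, 100))
```

Compared with a blocked ordering, only the contents of the lists change; in both cases each process lists just the rows it owns for each field.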
On Mar 14, 2023, at 1:14 PM, Christopher, Joshua via petsc-users <petsc-users at mcs.anl.gov> wrote:
Hello PETSc users,
I haven't heard back from the library developer regarding the numbering issue or my questions on using field split operators with their library, so I need to fix this myself.
Regarding the natural numbering vs parallel numbering: I haven't figured out what is wrong here. I stepped through in parallel and it looks like each processor is setting up the matrix and calling MatSetValue similar to what is shown in https://petsc.org/release/src/ksp/ksp/tutorials/ex2.c.html. I see that PETSc is recognizing my simple two-processor test from the output ("PetscInitialize_Common(): PETSc successfully started: number of processors = 2"). I'll keep poking at this, but I'm very new to PETSc. When I print the matrix to ASCII using PETSC_VIEWER_DEFAULT, I'm guessing I see one row per line, and the tuples consist of the column number and value?
On the FieldSplit preconditioner, is my understanding here correct:
To use FieldSplit, I must have a DM. Since I have an unstructured mesh, I must use DMPlex and set up the chart and covering relations specific to my mesh following here: https://petsc.org/release/docs/manual/dmplex/. I think this may be very time-consuming for me to set up.
Currently, I already have a matrix stored in a parallel sparse L-D-U format. I am converting into PETSc's sparse parallel AIJ matrix (traversing my matrix and using MatSetValues). The weights for my discretization scheme are already accounted for in the coefficients of my L-D-U matrix. I do have the submatrices in L-D-U format for each of my two equations' coupling with each other. That is, the equivalent of lines 242,251-252,254 of example 28 https://petsc.org/release/src/snes/tutorials/ex28.c.html. Could I directly convert my submatrices into PETSc's sub-matrix here, then assemble things together so that the field split preconditioners will work?
Alternatively, since my L-D-U matrices already account for the discretization scheme, can I use a simple structured grid DM?
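For the conversion step described above, the following sketch shows how one L-D-U block might be flattened into (row, column, value) entries of the kind that are inserted with MatSetValues. It is plain Python and rests on an assumption the thread does not spell out: that the L-D-U storage is face-addressed, with `lower_addr[f]` and `upper_addr[f]` giving the two cells of face f, `upper[f]` holding the entry in row `lower_addr[f]`, and `lower[f]` holding the entry in row `upper_addr[f]`. All names, including the offset arguments, are hypothetical.

```python
def ldu_to_triplets(diag, lower, upper, lower_addr, upper_addr,
                    row_offset=0, col_offset=0):
    """Flatten one L-D-U block into (row, col, value) triplets.

    row_offset and col_offset shift the block inside a larger coupled
    matrix, e.g. to place a coupling block in the rows of one field and
    the columns of the other."""
    triplets = [(row_offset + i, col_offset + i, d) for i, d in enumerate(diag)]
    for f, (l, u) in enumerate(zip(lower_addr, upper_addr)):
        # Assumed convention: upper[f] is entry (l, u), lower[f] is entry (u, l).
        triplets.append((row_offset + l, col_offset + u, upper[f]))
        triplets.append((row_offset + u, col_offset + l, lower[f]))
    return triplets


# Three cells in a line, two internal faces: (0,1) and (1,2).
entries = ldu_to_triplets(
    diag=[4.0, 5.0, 6.0],
    lower=[-3.0, -4.0],
    upper=[-1.0, -2.0],
    lower_addr=[0, 1],
    upper_addr=[1, 2],
)
for row, col, value in entries:
    print(row, col, value)
```

Placing each block with suitable offsets in a single AIJ matrix and describing the fields with index sets is one route that avoids a DM, in line with the advice given in the reply to this message.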
Thank you so much for your help!
Regards,
Joshua
________________________________
From: Pierre Jolivet <pierre at joliv.et>
Sent: Friday, March 3, 2023 11:45 AM
To: Christopher, Joshua <jchristopher at anl.gov>
Cc: petsc-users at mcs.anl.gov <petsc-users at mcs.anl.gov>
Subject: Re: [petsc-users] Overcoming slow convergence with GMRES+Hypre BoomerAMG
For full disclosure, with -ksp_pc_side right -ksp_max_it 100 -ksp_rtol 1E-10:
1) with renumbering via ParMETIS
-pc_type bjacobi -sub_pc_type lu -sub_pc_factor_mat_solver_type mumps => Linear solve converged due to CONVERGED_RTOL iterations 10
-pc_type hypre -pc_hypre_boomeramg_relax_type_down l1-Gauss-Seidel -pc_hypre_boomeramg_relax_type_up backward-l1-Gauss-Seidel => Linear solve converged due to CONVERGED_RTOL iterations 55
2) without renumbering via ParMETIS
-pc_type bjacobi => Linear solve did not converge due to DIVERGED_ITS iterations 100
-pc_type hypre => Linear solve did not converge due to DIVERGED_ITS iterations 100
Using an outer fieldsplit may help fix this.
Thanks,
Pierre
On 3 Mar 2023, at 6:24 PM, Christopher, Joshua via petsc-users <petsc-users at mcs.anl.gov> wrote:
I am solving these equations in the context of electrically-driven fluid flows as that first paper describes. I am using a PIMPLE scheme to advance the fluid equations in time, and my goal is to do a coupled solve of the electric equations similar to what is described in this paper: https://www.sciencedirect.com/science/article/pii/S0045793019302427. They are using the SIMPLE scheme in this paper. My fluid flow should eventually reach steady behavior, and likewise the time derivative in the charge density should trend towards zero. They preferred using BiCGStab with a direct LU preconditioner for solving their electric equations. I tried to test that combination, but my case is halting for unknown reasons in the middle of the PETSc solve. I'll try with more nodes and see if I am running out of memory, but the computer is a little overloaded at the moment so it may take a while to run.
I sent Pierre Jolivet my matrix and RHS, and they said the matrix does not appear to be following a parallel numbering, and instead looks like the matrix has natural numbering. When they renumbered the system with ParMETIS they got really fast convergence. I am using PETSc through a library, so I will reach out to the library authors and see if there is an issue in the library.
Thank you,
Joshua
________________________________
From: Barry Smith <bsmith at petsc.dev>
Sent: Thursday, March 2, 2023 3:47 PM
To: Christopher, Joshua <jchristopher at anl.gov>
Cc: petsc-users at mcs.anl.gov <petsc-users at mcs.anl.gov>
Subject: Re: [petsc-users] Overcoming slow convergence with GMRES+Hypre BoomerAMG
<Untitled.png>
Are you solving this as a time-dependent problem? Using an implicit scheme (like backward Euler) for rho ? In ODE language, solving the differential algebraic equation?
Is epsilon bounded away from 0?
On Mar 2, 2023, at 4:22 PM, Christopher, Joshua <jchristopher at anl.gov> wrote:
Hi Barry and Mark,
Thank you for looking into my problem. The two equations I am solving with PETSc are equations 6 and 7 from this paper: https://ris.utwente.nl/ws/portalfiles/portal/5676495/Roghair+Paper_final_draft_v1.pdf
I just used MUMPS and SuperLU_DIST on my full-size problem (with 3,000,000 unknowns). To clarify, I did a direct solve with -ksp_type preonly. They take a very long time, about 30 minutes for MUMPS and 18 minutes for SuperLU_DIST, see attached output. For reference, the same matrix took 658 iterations of BoomerAMG and about 20 seconds of walltime. Maybe I am already getting a great deal with BoomerAMG!
I'll try removing some terms from my solve (e.g. removing the second equation, then making the second equation just the elliptic portion of the equation, etc.) and try with a simpler geometry. I'll keep you updated as I run into troubles with that route. I wasn't aware of Field Split preconditioners, I'll do some reading on them and give them a try as well.
Thank you again,
Joshua
________________________________
From: Barry Smith <bsmith at petsc.dev>
Sent: Thursday, March 2, 2023 7:47 AM
To: Christopher, Joshua <jchristopher at anl.gov>
Cc: petsc-users at mcs.anl.gov <petsc-users at mcs.anl.gov>
Subject: Re: [petsc-users] Overcoming slow convergence with GMRES+Hypre BoomerAMG
Have you tried MUMPS (or SuperLU_DIST) on the full-size problem with the 5,000,000 unknowns? It is at the high end of problem sizes you can do with direct solvers but is worth comparing with BoomerAMG. You likely want to use more nodes and fewer cores per node with MUMPS to be able to access more memory. If you need to solve multiple right-hand sides with the same matrix, the factors will be reused, resulting in the second and later solves being much faster.
I agree with Mark, with iterative solvers you are likely to end up with PCFIELDSPLIT.
Barry
On Mar 1, 2023, at 7:17 PM, Christopher, Joshua via petsc-users <petsc-users at mcs.anl.gov> wrote:
Hello,
I am trying to solve the leaky-dielectric model equations with PETSc using a second-order discretization scheme (with limiting to first order as needed) using the finite volume method. The leaky dielectric model is a coupled system of two equations, consisting of a Poisson equation and a convection-diffusion equation. I have tested on small problems with simple geometry (~1000 DoFs) using:
-ksp_type gmres
-pc_type hypre
-pc_hypre_type boomeramg
and I get RTOL convergence to 1.e-5 in about 4 iterations. I tested this in parallel with 2 cores, but previously I was also able to successfully use a direct solver in serial to solve this problem. When I scale up to my production problem, I get significantly worse convergence. My production problem has ~3 million DoFs, more complex geometry, and is solved on ~100 cores across two nodes. The boundary conditions change a little because of the geometry, but are of the same classifications (e.g. only Dirichlet and Neumann). On the production case, I need 600-4000 iterations to converge. I've attached the output from the first solve that took 658 iterations to converge, using the following output options:
-ksp_view_pre
-ksp_view
-ksp_converged_reason
-ksp_monitor_true_residual
-ksp_test_null_space
My matrix is non-symmetric, the condition number can be around 10e6, and the eigenvalues reported by PETSc have been real and positive (using -ksp_view_eigenvalues).
I have tried using other preconditioners (superlu, mumps, gamg, mg) but hypre+boomeramg has performed the best so far. The literature seems to indicate that AMG is the best approach for solving these equations in a coupled fashion.
Do you have any advice on speeding up the convergence of this system?
Thank you,
Joshua
<petsc_gmres_boomeramg.txt>
<petsc_preonly_mumps.txt><petsc_preonly_superlu.txt>
--
What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener
https://www.cse.buffalo.edu/~knepley/