[petsc-users] Slow convergence while parallel computations.

Thu Sep 2 07:59:10 CDT 2021

On Thu, Sep 2, 2021 at 8:08 AM Viktor Nazdrachev <numbersixvs at gmail.com>
wrote:

> Hello, Pierre!
>
> Thank you for your response!
>
> I attached the log files (txt files with convergence behavior, plus RAM
> usage logs in separate txt files) and a table summarizing the convergence
> investigation (xls). Data for the main non-regular grid with 500K cells and
> heterogeneous properties are in the 500K folder, whereas data for the simple
> uniform 125K-cell grid with constant properties are in the 125K folder.
>
>
>
> > Dear Viktor,
> >
> >> On 1 Sep 2021, at 10:42 AM, Наздрачёв Виктор <numbersixvs at gmail.com>
> >> wrote:
> >>
> >> Dear all,
> >>
> >> I have a 3D elasticity problem with heterogeneous properties. The grid is
> >> unstructured, with cell aspect ratios varying from 4 to 25. Zero Dirichlet
> >> BCs are imposed on the bottom face of the mesh, and Neumann (traction) BCs
> >> are imposed on the side faces. Gravity load is also accounted for. The
> >> grid consists of 500k cells (approximately 1.6M DOFs).
> >>
> >> The best performance and memory usage for a single MPI process was
> >> obtained with the HPDDM (BFBCG) solver
> >
> > Block Krylov solvers are (most often) only useful if you have multiple
> > right-hand sides, e.g., in the context of elasticity, multiple loadings.
> > Is that really the case? If not, you may as well stick to “standard” CG
> > instead of the breakdown-free block (BFB) variant.
>
>
>
>
> In this case only a single right-hand side is used, so I switched to the
> “standard” CG solver (-ksp_hpddm_type cg), but I noticed some interesting
> convergence behavior. For the non-regular grid with 500K cells and
> heterogeneous properties, the CG solver converged in 1 iteration
> (log_hpddm(cg)_gamg_nearnullspace_1_mpi.txt), whereas for the simpler
> uniform grid with 125K cells and homogeneous properties, CG solves the
> linear system successfully (log_hpddm(cg)_gamg_nearnullspace_1_mpi.txt).
>
> BFBCG solver works properly for both grids.
>
>
>
>
>
> >> and bjacobi + ICC(1) in subdomains as the preconditioner; it took 1 m 45 s
> >> and 5.0 GB of RAM. Parallel computation with 4 MPI processes took 2 m 46 s
> >> and 5.6 GB of RAM. This is because the number of iterations required to
> >> achieve the same tolerance increases significantly.
>
> >>
> >> I've also tried the PCGAMG (agg) preconditioner with an ICC(1)
> >> sub-preconditioner. For a single MPI process, the calculation took 10 min
> >> and 3.4 GB of RAM. To improve the convergence rate, the nullspace was
> >> attached using the MatNullSpaceCreateRigidBody and MatSetNearNullSpace
> >> subroutines. This reduced the calculation time to 3 m 58 s, using 4.3 GB
> >> of RAM. There is also a peak memory usage of 14.1 GB, which appears just
> >> before the start of the iterations. Parallel computation with 4 MPI
> >> processes took 2 m 53 s, using 8.4 GB of RAM. In that case the peak
> >> memory usage is about 22 GB.
> >
> > I’m surprised that GAMG is converging so slowly. What do you mean by
> > "ICC(1) sub-preconditioner"? Do you use that as a smoother or as a coarse
> > level solver?
>
>
>
> Sorry for the confusion: ICC is used only with the BJACOBI preconditioner;
> there is no ICC with GAMG.
>
>
>
> > How many iterations are required to reach convergence?
> >
> > Could you please maybe run the solver with -ksp_view -log_view and
> > send us the output?
>
>
>
>
> For the case with 4 MPI processes and the attached nullspace, 177
> iterations are required
>

Pierre's suggestions are good ones.

I am confused by the failure of GAMG, since 177 iterations is not good.
Something is breaking down, either the smoother or the accuracy of the
coarse grids.
Can you give me an idea what your coefficient looks like?

  Thanks,

     Matt

> to reach convergence (see the detailed log in
> log_hpddm(bfbcg)_gamg_nearnullspace_4_mpi.txt and the memory usage log in
> RAM_log_hpddm(bfbcg)_gamg_nearnullspace_4_mpi.txt). For comparison, 90
> iterations are required for the sequential run
> (log_hpddm(bfbcg)_gamg_nearnullspace_1_mpi.txt).
>
>
>
> > Most of the default parameters of GAMG should be good enough for 3D
> > elasticity, provided that your MatNullSpace is correct.
>
>
>
>
>
> How can I be sure that the nullspace is attached correctly? Is there any
> way to check this myself (perhaps by computing some quantity from the
> matrix and solution vector)?
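
One possible self-check is to build the six rigid body modes from the node coordinates and verify that the stiffness operator without Dirichlet conditions maps each of them to (numerically) zero. The sketch below is a stand-in under stated assumptions, not the thread's actual matrix: it uses a small spring network as the elastic operator, DOFs interleaved as (x, y, z) per node, and hypothetical helper names (rigid_body_modes, spring_stiffness, max_residual). In PETSc itself, MatNullSpaceTest applied to the operator assembled before boundary conditions plays a similar role.

```python
import math

def rigid_body_modes(coords):
    """Six rigid body modes for nodes with interleaved (x, y, z) DOFs."""
    n = len(coords)
    modes = [[0.0] * (3 * n) for _ in range(6)]
    for i, (x, y, z) in enumerate(coords):
        for d in range(3):
            modes[d][3 * i + d] = 1.0  # translations along x, y, z
        # infinitesimal rotations u = omega x r about the x, y, z axes
        modes[3][3 * i + 1], modes[3][3 * i + 2] = -z, y
        modes[4][3 * i + 0], modes[4][3 * i + 2] = z, -x
        modes[5][3 * i + 0], modes[5][3 * i + 1] = -y, x
    return modes

def spring_stiffness(coords, edges, k=1.0):
    """Dense stiffness of a spring network, no Dirichlet conditions applied."""
    n = len(coords)
    K = [[0.0] * (3 * n) for _ in range(3 * n)]
    for i, j in edges:
        d = [coords[j][a] - coords[i][a] for a in range(3)]
        length = math.sqrt(sum(c * c for c in d))
        u = [c / length for c in d]  # unit vector along the spring
        for a in range(3):
            for b in range(3):
                v = k * u[a] * u[b]
                K[3 * i + a][3 * i + b] += v
                K[3 * j + a][3 * j + b] += v
                K[3 * i + a][3 * j + b] -= v
                K[3 * j + a][3 * i + b] -= v
    return K

def max_residual(K, mode):
    """Largest entry of |K * mode|; near zero for a true null vector."""
    return max(abs(sum(kij * mj for kij, mj in zip(row, mode))) for row in K)

# Four nodes of a tetrahedron, fully connected.
coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
K = spring_stiffness(coords, edges)
modes = rigid_body_modes(coords)

if __name__ == "__main__":
    for m, mode in enumerate(modes):
        print("mode", m, "max |K v| =", max_residual(K, mode))
```

A large residual for the rotational modes usually points to coordinates that do not match the DOF ordering of the matrix, for example a coordinate vector whose block size or node numbering differs from the one used during assembly.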
>
>
>
> > One parameter that may need some adjustments though is the aggregation
> > threshold -pc_gamg_threshold (you could try values in the [0.01; 0.1]
> > range, that’s what I always use for elasticity problems).
>
>
>
>
>
> I tried to find an optimal value for this option, setting
> -pc_gamg_threshold 0.01 and -pc_gamg_threshold_scale 2, but I didn't notice
> any significant changes (I need more time for experiments).
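
To get a feel for what an aggregation threshold does on a stretched grid, the sketch below filters the connectivity graph of a small anisotropic operator. It is an illustration under assumptions: the scaled weight |a_ij| / sqrt(a_ii * a_jj) is a common strength measure in smoothed aggregation, but the exact measure GAMG applies may differ between PETSc versions, and the helper names (anisotropic_matrix, strong_edges) and the coupling values are hypothetical.

```python
import math

def anisotropic_matrix(nx, ny, kx, ky):
    """Dense 5-point-style operator with coupling kx along x and ky along y."""
    n = nx * ny
    A = [[0.0] * n for _ in range(n)]

    def couple(i, j, k):
        A[i][j] -= k
        A[j][i] -= k
        A[i][i] += k
        A[j][j] += k

    for iy in range(ny):
        for ix in range(nx):
            i = iy * nx + ix
            if ix + 1 < nx:
                couple(i, i + 1, kx)   # neighbor in x
            if iy + 1 < ny:
                couple(i, i + nx, ky)  # neighbor in y
    return A

def strong_edges(A, threshold):
    """Directed edges (i, j) whose scaled weight exceeds the threshold."""
    n = len(A)
    edges = set()
    for i in range(n):
        for j in range(n):
            if i != j and A[i][j] != 0.0:
                w = abs(A[i][j]) / math.sqrt(A[i][i] * A[j][j])
                if w > threshold:
                    edges.add((i, j))
    return edges

if __name__ == "__main__":
    # Strong coupling along x, weak coupling along y (stretched cells).
    A = anisotropic_matrix(3, 2, 1.0, 0.02)
    for threshold in (0.0, 0.05):
        kept = strong_edges(A, threshold)
        print("threshold", threshold, "keeps", len(kept), "directed edges")
```

With a zero threshold every connection is kept, so aggregates can span the weakly coupled direction; a positive threshold removes the weak connections and aggregation then follows only the strong ones. If the weights in the real matrix all sit well above or well below the tested threshold values, changing the option would have little visible effect.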
>
>
> Kind regards,
>
>
>
> Viktor Nazdrachev
>
>
>
> R&D senior researcher
>
>
>
> Geosteering Technologies LLC
>
> On Wed, 1 Sep 2021 at 12:01, Pierre Jolivet <pierre at joliv.et> wrote:
>
>> Dear Viktor,
>>
>> On 1 Sep 2021, at 10:42 AM, Наздрачёв Виктор <numbersixvs at gmail.com>
>> wrote:
>>
>> Dear all,
>>
>> I have a 3D elasticity problem with heterogeneous properties. The grid is
>> unstructured, with cell aspect ratios varying from 4 to 25. Zero Dirichlet
>> BCs are imposed on the bottom face of the mesh, and Neumann (traction) BCs
>> are imposed on the side faces. Gravity load is also accounted for. The
>> grid consists of 500k cells (approximately 1.6M DOFs).
>>
>> The best performance and memory usage for a single MPI process was
>> obtained with the HPDDM (BFBCG) solver
>>
>> Block Krylov solvers are (most often) only useful if you have multiple
>> right-hand sides, e.g., in the context of elasticity, multiple loadings.
>> Is that really the case? If not, you may as well stick to “standard” CG
>> instead of the breakdown-free block (BFB) variant.
>>
>> and bjacobi + ICC(1) in subdomains as the preconditioner; it took 1 m 45 s
>> and 5.0 GB of RAM. Parallel computation with 4 MPI processes took 2 m 46 s
>> and 5.6 GB of RAM. This is because the number of iterations required to
>> achieve the same tolerance increases significantly.
>>
>> I've also tried the PCGAMG (agg) preconditioner with an ICC(1)
>> sub-preconditioner. For a single MPI process, the calculation took 10 min
>> and 3.4 GB of RAM. To improve the convergence rate, the nullspace was
>> attached using the MatNullSpaceCreateRigidBody and MatSetNearNullSpace
>> subroutines. This reduced the calculation time to 3 m 58 s, using 4.3 GB
>> of RAM. There is also a peak memory usage of 14.1 GB, which appears just
>> before the start of the iterations. Parallel computation with 4 MPI
>> processes took 2 m 53 s, using 8.4 GB of RAM. In that case the peak
>> memory usage is about 22 GB.
>>
>> I’m surprised that GAMG is converging so slowly. What do you mean by
>> "ICC(1) sub-preconditioner"? Do you use that as a smoother or as a coarse
>> level solver?
>> How many iterations are required to reach convergence?
>> Could you please maybe run the solver with -ksp_view -log_view and send
>> us the output?
>> Most of the default parameters of GAMG should be good enough for 3D
>> elasticity, provided that your MatNullSpace is correct.
>> One parameter that may need some adjustments though is the aggregation
>> threshold -pc_gamg_threshold (you could try values in the [0.01; 0.1]
>> range, that’s what I always use for elasticity problems).
>>
>> Thanks,
>> Pierre
>>
>> Are there ways to avoid the drop in convergence rate for the bjacobi
>> preconditioner in parallel mode? Does it make sense to use hierarchical or
>> nested Krylov methods with a local GMRES solver (sub_ksp_type gmres) and
>> some sub-preconditioner (for example, sub_pc_type bjacobi)?
>>
>>
>> Is this peak memory usage expected for the gamg preconditioner? Is there
>> any way to reduce it?
>>
>>
>> What advice would you give to improve the convergence rate with multiple
>> MPI processes while keeping memory consumption reasonable?
>>
>>
>> Kind regards,
>>
>> Viktor Nazdrachev
>>
>> R&D senior researcher
>>
>> Geosteering Technologies LLC
>>
>>
>>

-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/ <http://www.cse.buffalo.edu/~knepley/>