[petsc-users] Triple increase of allocated memory during KSPSolve call (GMRES preconditioned by ASM)

Matthew Knepley knepley at gmail.com
Wed Feb 5 09:46:29 CST 2020


On Wed, Feb 5, 2020 at 10:04 AM Дмитрий Мельничук <
dmitry.melnichuk at geosteertech.com> wrote:

> Barry, appreciate your response, as always.
>
> - You are saying that I am using ASM + ILU(0). However, I only pass "ASM"
> to PETSc as the preconditioner type. Does that mean that ILU(0) is the
> default sub-preconditioner for ASM?
>

Yes.


> Can I change it using the option "-sub_pc_type"?
>

Yes

> Does it make sense to you within the scope of my general goal, which is to
> reduce memory consumption? Would it also be useful to vary the
> "-sub_ksp_type" option?
>

Yes. For example, try measuring the memory usage with -sub_pc_type jacobi
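
For example, something like this (a sketch; -sub_ksp_type preonly is already
the default for the ASM blocks):

  -ksp_type gmres -pc_type asm -sub_ksp_type preonly -sub_pc_type jacobi -log_view -log_view_memory

Jacobi on the blocks stores essentially no extra matrix, so the difference
from your ILU(0) runs tells you how much memory the subdomain factorizations
cost (at the price of much weaker preconditioning).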


> - I have run the computation for the same initial matrix with the
> "-sub_pc_factor_in_place" option, PC = ASM. Now the process consumed ~400
> MB compared to 550 MB without this option.
> I used "-ksp_view" for this computation; two logs are attached:
> "ksp_view.txt" - ksp_view option only
> "full_log_ASM_factor_in_place.txt" - full log without the ksp_view option
>
> - Then I changed the primary preconditioner from ASM to ILU(0) and ran the
> computation: memory consumption was again ~400 MB, regardless of whether I
> used the "-sub_pc_factor_in_place" option.
>
> - Then I tried to run the computation with ILU(0) and
> "-pc_factor_in_place", just in case: the computation did not start; I got
> an error message, the log is attached: "Error_ilu_pc_factor.txt"
>
> - Then I ran the computation with SOR as the preconditioner. PETSc gave me
> an error message, the log is attached: "Error_gmres_sor.txt"
>
> - As for the kind of PDEs: I am solving the standard poroelasticity
> problem, the formulation can be found in the attached paper
> (Zheng_poroelasticity.pdf), pages 2-3.
> The file PDE.jpg is a snapshot of PDEs from this paper.
>

Poroelasticity is elliptic (the kind I am familiar with), so I would at
least try algebraic multigrid: GAMG, ML, or Hypre (probably try all of
them).
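
A minimal sketch of what to try first (ML and Hypre assume PETSc was
configured with --download-ml / --download-hypre or equivalent):

  -pc_type gamg
  -pc_type ml
  -pc_type hypre -pc_hypre_type boomeramg

For elasticity-like blocks GAMG usually benefits from knowing the block size
and the rigid-body near-null space (MatSetBlockSize, MatNullSpaceCreateRigidBody,
MatSetNearNullSpace), but the options above are enough for a first memory
comparison.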

  Thanks,

     Matt


>
> So, if you can give me any further advice on how to decrease the amount of
> memory consumed to approximately the matrix size (~200 MB in this case),
> that would be great. Do I need to focus on finding a proper preconditioner?
> BTW, ILU(0) alone did not give me any memory advantage compared to ASM
> with "-sub_pc_factor_in_place".
>
> Have a pleasant day!
>
> Kind regards,
> Dmitry
>
>
>
> 04.02.2020, 19:04, "Smith, Barry F." <bsmith at mcs.anl.gov>:
>
>
>    Please run with the option -ksp_view so we know the exact solver
> options you are using.
>
>    From the lines
>
> MatCreateSubMats 1 1.0 1.9397e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> MatGetOrdering   1 1.0 1.1066e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> MatIncreaseOvrlp 1 1.0 3.0324e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
>
>    and the fact that you have three matrices, I would guess you are using
> the additive Schwarz preconditioner (ASM) with ILU(0) on the blocks (which
> converges the same as ILU on one process but uses much more memory).
>
>    Note: your code is still built with 32-bit integers.
>
>    I would guess the basic matrix formed plus the vectors in this example
> could take ~200 MB. It is the two matrices in the additive Schwarz that are
> taking the additional memory.
>
>    What kind of PDEs are you solving and what kind of formulation?
>
>    ASM plus ILU is the "workman's" type of preconditioner: relatively robust
> but not particularly fast to converge. Depending on your problem you
> might be able to do much better convergence-wise by using a PCFIELDSPLIT
> with a PCGAMG on one of the splits. In your own run you can see the ILU is
> chugging along rather slowly to the solution.
>
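>    A rough sketch of what that could look like, assuming the displacement
> and pressure unknowns can be identified (e.g. with PCFieldSplitSetIS or a
> DM); the exact splitting is something you would have to set up for your
> discretization:
>
>    -pc_type fieldsplit -pc_fieldsplit_type schur -fieldsplit_0_pc_type gamg -fieldsplit_1_pc_type jacobi
>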
>    With your current solvers you can use the option
> -sub_pc_factor_in_place, which will shave off the memory of one of the
> matrices. Please try that.
>
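>    For example, a sketch keeping your current GMRES + ASM setup (this just
> spells out the full option set; only -sub_pc_factor_in_place is new):
>
>    -ksp_type gmres -pc_type asm -sub_pc_type ilu -sub_pc_factor_in_place -ksp_view
>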
>    By avoiding the ASM you avoid both extra matrices, but at the cost of
> even slower convergence. Use, for example, -pc_type sor
>
>
>     The petroleum industry also has a variety of "custom"
> preconditioners/solvers for particular models and formulations that can
> beat the convergence of general-purpose solvers and require less memory.
> Some of these can be implemented or simulated with PETSc. Some are
> implemented in commercial petroleum simulation codes, and it can be
> difficult to get a handle on exactly what they do because of proprietary
> issues. I think I have an old text on these approaches in my office; there
> may be modern books that discuss them.
>
>
>    Barry
>
>
>
>
>  On Feb 4, 2020, at 6:04 AM, Дмитрий Мельничук <
> dmitry.melnichuk at geosteertech.com> wrote:
>
>  Hello again!
>  Thank you very much for your replies!
>  Log is attached.
>
>  1. The main problem now is the following. To solve the matrix attached
> to my previous e-mail, PETSc consumes ~550 MB.
>  I know for certain that there are commercial software packages in the
> petroleum industry (e.g., Schlumberger Petrel) that solve the same initial
> problem consuming only ~200 MB.
>  Moreover, I am sure that when I used 32-bit PETSc (GMRES + ASM) a year
> ago, it also consumed ~200 MB for this matrix.
>
>  So, my question is: do you have any advice on how to decrease the amount
> of RAM consumed for such a matrix from 550 MB to 200 MB? Maybe some specific
> preconditioner or other approaches?
>
>  I will be very grateful for any thoughts!
>
>  2. The second problem is more specific.
>  According to the resource manager in Windows 10, the Fortran solver based
> on PETSc consumes 548 MB of RAM while solving the system of linear equations.
>  As I understand from the logs, 459 MB and 52 MB are required for matrix
> and vector storage, respectively. Summing over all objects for which
> memory is allocated gives only 517 MB.
>
>  Thank you again for your time! Have a nice day.
>
>  Kind regards,
>  Dmitry
>
>
>  03.02.2020, 19:55, "Smith, Barry F." <bsmith at mcs.anl.gov>:
>
>     GMRES can also, by default, require about 35 work vectors if it reaches
> the full restart. You can set a smaller restart with, for example,
> -ksp_gmres_restart 15, but this can also hurt the convergence of GMRES
> dramatically. People sometimes use the KSPBCGS algorithm since it does not
> require all the restart vectors, but it can also converge more slowly.
>
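>     For example, either of these could be appended to your options string
> (a sketch; how much the convergence suffers depends on your problem):
>
>    -ksp_gmres_restart 15
>    -ksp_type bcgs
>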
>      Depending on how much memory the sparse matrices use relative to the
> vectors, the vector memory may or may not matter.
>
>     If you are using a recent version of PETSc, you can run with -log_view
> -log_view_memory and it will show, on the right side of the columns, how
> much memory is being allocated for each of the operations in various ways.
>
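>     Appended to the options string from your code, that might look like
> (assuming the rest of your options are unchanged):
>
>    options = "-pc_asm_overlap 2 -pc_asm_type basic -ksp_monitor -ksp_converged_reason -log_view -log_view_memory"
>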
>     Barry
>
>
>
>   On Feb 3, 2020, at 10:34 AM, Matthew Knepley <knepley at gmail.com> wrote:
>
>   On Mon, Feb 3, 2020 at 10:38 AM Дмитрий Мельничук <
> dmitry.melnichuk at geosteertech.com> wrote:
>   Hello all!
>
>   Now I am faced with a problem associated with memory allocation when
> calling KSPSolve.
>
>   GMRES preconditioned by ASM was chosen for solving the linear algebraic
> system (obtained by finite element spatial discretisation of the Biot
> poroelasticity model).
>   According to the output of the PetscMallocGetCurrentUsage subroutine,
> 176 MB is required for matrix and RHS vector storage (before the KSPSolve
> call).
>   But 543 MB of RAM is required while the system is being solved (during
> the KSPSolve call).
>   Thus, the amount of allocated memory increased threefold after the
> preconditioning stage. This kind of behaviour is critical for 3D models
> with several million cells.
>
>   1) In order to know anything, we have to see the output of -ksp_view,
> although I see you used an overlap of 2
>
>   2) The overlap increases the size of the submatrices beyond that of the
> original matrix. Even if you used only ILU(0) for the sub-preconditioner,
> you would need at least 2x memory, since the matrix dominates memory usage.
> Moreover, you have overlap, and you might have fill-in depending on the
> sub-solver (e.g., LU).
>
>   3) The massif tool from valgrind is a good fine-grained way to look at
> memory allocation
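>
>   A minimal sketch of using it (the executable name and options are
> placeholders for your own solver binary and option set):
>
>     valgrind --tool=massif ./your_solver <your options>
>     ms_print massif.out.<pid>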
>
>     Thanks,
>
>        Matt
>
>   Is there a way to decrease the amount of allocated memory?
>   Is that expected behaviour for the GMRES-ASM combination?
>
>   As I remember, the previous version of PETSc did not demonstrate such a
> significant memory increase.
>
>   ...
>   Vec :: Vec_F, Vec_U
>   Mat :: Mat_K
>   ...
>   ...
>   call MatAssemblyBegin(Mat_M,Mat_Final_Assembly,ierr)
>   call MatAssemblyEnd(Mat_M,Mat_Final_Assembly,ierr)
>   ....
>   call VecAssemblyBegin(Vec_F_mod,ierr)
>   call VecAssemblyEnd(Vec_F_mod,ierr)
>   ...
>   ...
>   call PetscMallocGetCurrentUsage(mem, ierr)
>   print *,"Memory used: ",mem
>   ...
>   ...
>   call KSPSetType(Krylov,KSPGMRES,ierr)
>   call KSPGetPC(Krylov,PreCon,ierr)
>   call PCSetType(PreCon,PCASM,ierr)
>   call KSPSetFromOptions(Krylov,ierr)
>   ...
>   call KSPSolve(Krylov,Vec_F,Vec_U,ierr)
>   ...
>   ...
>   options = "-pc_asm_overlap 2 -pc_asm_type basic -ksp_monitor
> -ksp_converged_reason"
>
>
>   Kind regards,
>   Dmitry Melnichuk
>   Matrix.dat (265288024)
>
>
>   --
>   What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
>   -- Norbert Wiener
>
>   https://www.cse.buffalo.edu/~knepley/
>
>
>  <Logs_26K_GMRES-ASM-log_view-log_view_memory-malloc_dump_32bit>
>
>
>
>

-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/