[petsc-users] Strange Memory Increase in PCHYPRE BOOMERAMG
Smith, Barry F.
bsmith at mcs.anl.gov
Wed Dec 18 11:00:06 CST 2019
Thank you for the full and detailed report. The memory leak could be anywhere but my guess is it is in the interface between PETSc and Hyper.
The first thing to check is if PETSc memory keeps increasing. The simplest way to do this is run your code 3 independent times with -malloc_debug -log_view and say 5 steps, then 10 steps, then 15 steps. Now look at the information at the end of the -log_view report. It shows how many PETSc objects are created and how much memory they use. Are the numbers getting larger?
The -malloc_debug flag will cause PETSc to print at the end of the run all the memory it has allocated but not freed. If your code is properly freeing everything it will print nothing indicating PETSc has no object leaps in your code.
Let us know how it goes and we can direct you how to debug memory leaks between PETSc and hypre or in hypre if that is where they are.
Barry
> On Dec 18, 2019, at 1:32 AM, Mohammed Mostafa <mo7ammedmostafa at gmail.com> wrote:
>
> Hello everyone,
> I seem to have a problem of increasing memory size with the linear solver. I double check every single part of my code and I isolated the problem to KSPSolve command.
> A bit of context/background regarding my algorithm.
> I am writing a finite volume numerical solver for incompressible two phase flow. I am mainly solving liquid sloshing and waves impact problems. Until recently most of simulations were 2D with max cell count of 100K cells but now I am considering solving 3D problems with cell counts between 6M~10M cells.
> The algorithm is as follows:
>
> setup ksp
> setup pc
>
> loop 1 → number of timesteps
> 1- // do some work
> 2- create the pressure poisson matrix
>
> 3 – loop: iloop 1 to 4
> 4-1- solve the matrix if ( iloop == 1) update the precond
> 4-2- // do some work
> 4-3- update rhs(no matrix or precond update)
> 4- do the next timestep and go back to step 1
>
> From many online posts many people recommended KSPCG with PCHYPRE with the following options
>
> // PC Setup
> PCCreate(PETSC_COMM_WORLD, &preconditioner);
> PCSetType(preconditioner, PCHYPRE);
> PCHYPRESetType(preconditioner, "boomeramg");
>
> PetscOptionsSetValue(NULL, "-pc_hypre_boomeramg_strong_threshold", "0.4");
> PetscOptionsSetValue(NULL, "-pc_hypre_boomeramg_agg_nl", "3");
> PetscOptionsSetValue(NULL, "-pc_hypre_boomeramg_agg_num_paths", "4");
>
> PetscOptionsSetValue(NULL, "-pc_hypre_boomeramg_interp_type", "ext+i");
> PetscOptionsSetValue(NULL, "-pc_hypre_boomeramg_truncfactor", "0.3");
> PetscOptionsSetValue(NULL, "-pc_hypre_boomeramg_max_iter", "1");
> PetscOptionsSetValue(NULL, "-pc_hypre_boomeramg_relax_type_all",
> "SOR/Jacobi");
> PetscOptionsSetValue(NULL, "-pc_hypre_boomeramg_grid_sweeps_down", "1");
> PetscOptionsSetValue(NULL, "-pc_hypre_boomeramg_max_iter", "1");
> PetscOptionsSetValue(NULL, "-pc_hypre_boomeramg_rtol", "0");
> KSPSetPC(iterSolver, preconditioner);
> KSPSetOperators(iterSolver, A, A);
> // KSP Setup
> KSPSetType(iterSolver, KSPCG);
> KSPSetInitialGuessNonzero(iterSolver, PETSC_TRUE);
> KSPSetNormType(iterSolver, KSP_NORM_UNPRECONDITIONED);
> KSPSetFromOptions(iterSolver);
>
> PCSetUp(preconditioner);
> KSPSetUp(iterSolver);
> KSPSetTolerances(iterSolver, 1e-18 /*reltol*/, 1e-12 /*abstol*/,
> 1e5 /*divergencetol*/, 50);
>
> When solving the linear system for the first time during the timestep (iloop =1), the following commands are used so that the preconditioner is reconstructed based on the new matrix values.
>
> KSPSetReusePreconditioner(iterSolver, PETSC_FALSE);
> KSPSetOperators(iterSolver, A, A); // set the new matrix
> KSPSolve(iterSolver, *RHS, *x);
> Otherwise
> KSPSetReusePreconditioner(iterSolver, PETSC_TRUE);
> KSPSolve(iterSolver, *RHS, *x);
> It should be noted that matrix non-zero structure remains the same throught the simulation but the values change every timestep.
> Now My problem is that memory size keep changing throught the simulation utill my workstation evetually runs out of memory.
> I tracked down the memory increase to the step which solve the linear system of equations.
> KSPSetReusePreconditioner(iterSolver, PETSC_FALSE);
> KSPSetOperators(iterSolver, A, A); // set the new matrix
> KSPSolve(iterSolver, *RHS, *x);
> My current problem is a sloshing problem with cell count of 1.2M cells running on workstation with 40 Cores and 94GB of RAM.
> The memory increase I observed did not occur every single time step but more like every 3-10 time step. The increase each time is somewhere from 0.51MB~3.0MB per mpi process and not not all process show memory change. For example in time step id 14 , proc0, proc6, proc13, proc24 show change but the others don’t. On average 10~15 processes show memory change while the others do not.
> Over 1000 time steps the solver use up additional 10GB of memory and still increasing until the system runs out of memory after nearly 10K time steps.
> I wanted to try another preconditioner PCJACOBI which ultimately failed due to max number of iterations but at least no memory change was observed over 50 time steps.
> Since I am using multigrid I expected the increase in memory required by the linear solver, I just assumed that after enough number of timesteps it would settle on some value. But it just kept increasing util finally the system ran out of memory.
> I monitored the memory usage of other parts/components of the code and no memory change occurs at all except at code section above.
> I configure PETSC with following options
> PETSC_ARCH=release3 -with-debugging=0 COPTFLAGS="-O3 -march=native -mtune=native" CXXOPTFLAGS="-O3 -march=native -mtune=native" FOPTFLAGS="-O3 -march=native -mtune=native" --with-cc=mpicc --w
> ith-cxx=mpicxx --with-fc=mpif90 --download-metis --download-hypre
>
> I tried configuring it in debug and compiling my code also in debug mode and running as the FAQ suggests but the code ended up being too slow so I shut it down after running it for 50mins and not even one timestep was completed.
>
> So to track down the memory leak/increase I used
> PetscMemoryGetCurrentUsage(&mem1);
> // some code
> PetscMemoryGetCurrentUsage(&mem2);
> memChnge = mem2 – mem1;
>
> I believe it could be one of the following possibilities:
> • A memory leak (bug) in the petsc library ( unlikely otherwise someone would have found it by now )
> • Some thing to do with setting of the boomerAMG Hypre preconditioner which makes it unnecessarily add additional levels. The settings I am using, I got from an online post I found so I don’t really understand it.
> https://mooseframework.inl.gov/application_development/hypre.html
> • Some problem with the sequence of commands I am using for the solution of linear system of eqns that makes allocate additional memory with freeing the old one.
> Thanks
>
More information about the petsc-users
mailing list