[petsc-users] How to measure the memory usage of the application built on the Petsc?

Matthew Knepley knepley at gmail.com
Tue May 28 05:05:04 CDT 2013


On Tue, May 28, 2013 at 5:54 AM, Fande Kong <Fande.Kong at colorado.edu> wrote:

> Hi Smith,
>
> Thank you very much. Following your suggestions, I added these functions
> to my code to measure the memory usage. Now I am confused, because a small
> problem needs a lot of memory.
>
> I call PetscMemorySetGetMaximumUsage() immediately after
> PetscInitialize(). I then added the following code at several points
> (before and after setting up the unstructured mesh, before and after
> KSPSetUp(), before and after KSPSolve(), and after destroying all objects):
>
>   PetscLogDouble space = 0;
>   ierr = PetscMallocGetCurrentUsage(&space);CHKERRQ(ierr);
>   ierr = PetscPrintf(comm,"Current space PetscMalloc()ed %G M\n",
>                      space/(1024*1024));CHKERRQ(ierr);
>   ierr = PetscMallocGetMaximumUsage(&space);CHKERRQ(ierr);
>   ierr = PetscPrintf(comm,"Max space PetscMalloced() %G M\n",
>                      space/(1024*1024));CHKERRQ(ierr);
>   ierr = PetscMemoryGetCurrentUsage(&space);CHKERRQ(ierr);
>   ierr = PetscPrintf(comm,"Current process memory %G M\n",
>                      space/(1024*1024));CHKERRQ(ierr);
>   ierr = PetscMemoryGetMaximumUsage(&space);CHKERRQ(ierr);
>   ierr = PetscPrintf(comm,"Max process memory %G M\n",
>                      space/(1024*1024));CHKERRQ(ierr);
>
>
> To measure the memory usage, I used only one core (mpirun -n 1 ./program)
> to solve a small problem with 12691 mesh nodes (about 12691*3 = 4*10^4
> degrees of freedom). I solve the linear elasticity problem using FGMRES
> preconditioned by multigrid (PCMG). I use standard PETSc routines
> throughout, except that I construct the coarse matrix and the
> interpolation matrix myself. I used the following run script to set up
> the solver and preconditioner:
>
> mpirun -n 1 ./linearElasticity  -ksp_type fgmres -pc_type mg
> -pc_mg_levels 2 -pc_mg_cycle_type v -pc_mg_type multiplicative
> -mg_levels_1_ksp_type richardson -mg_levels_1_ksp_max_it 1
> -mg_levels_1_pc_type asm -mg_levels_1_sub_ksp_type preonly
> -mg_levels_1_sub_pc_type ilu -mg_levels_1_sub_pc_factor_levels 4
> -mg_levels_1_sub_pc_factor_mat_ordering_type rcm -mg_coarse_ksp_type cg
> -mg_coarse_ksp_rtol 0.1  -mg_coarse_ksp_max_it 10 -mg_coarse_pc_type asm
> -mg_coarse_sub_ksp_type preonly -mg_coarse_sub_pc_type ilu
> -mg_coarse_sub_pc_factor_levels 2
> -mg_coarse_sub_pc_factor_mat_ordering_type rcm -ksp_view    -log_summary
> -pc_mg_log
>
>
>  I got the following results:
>
> (1) before setting up mesh,
>
> Current space PetscMalloc()ed 0.075882 M
> Max space PetscMalloced() 0.119675 M
> Current process memory 7.83203 M
> Max process memory 0 M
>
> (2) after setting up mesh,
>
> Current space PetscMalloc()ed 16.8411 M
> Max space PetscMalloced() 22.1353 M
> Current process memory 28.4336 M
> Max process memory 33.0547 M
>
> (3) before calling KSPSetUp()
>
> Current space PetscMalloc()ed 16.868 M
> Max space PetscMalloced() 22.1353 M
> Current process memory 28.6914 M
> Max process memory 33.0547 M
>
>
> (4) after calling KSPSetUp()
>
> Current space PetscMalloc()ed 74.3354 M
> Max space PetscMalloced() 74.3355 M
>

This makes sense. It is 20M for your mesh, 20M
for the Krylov space on the fine level, and I am guessing
35M for the Jacobian and the ILU factors.
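
As a rough check on the 20M figure for the Krylov space: FGMRES keeps the
preconditioned vectors alongside the Krylov basis, so it stores about twice
as many vectors as GMRES. The sketch below assumes the default restart of 30
(no -ksp_gmres_restart appears in the run script), double-precision real
scalars, and that every basis vector ends up allocated. The function names
are illustrative, not PETSc API, and the few extra work vectors are ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes needed to hold the stored Krylov vectors.
 * Assumptions (not taken from PETSc source): each vector holds n_dof
 * double-precision scalars; bases is 1 for GMRES and 2 for FGMRES,
 * which also keeps the preconditioned vectors. Work vectors are ignored. */
double krylov_basis_bytes(size_t n_dof, size_t restart, size_t bases)
{
    assert(bases == 1 || bases == 2);
    return (double)n_dof * (double)restart * (double)bases
           * (double)sizeof(double);
}

/* Convert bytes to the "M" unit used in this thread (1024*1024 bytes). */
double bytes_to_mib(double bytes)
{
    return bytes / (1024.0 * 1024.0);
}
```

With 12691*3 = 38073 unknowns, bytes_to_mib(krylov_basis_bytes(38073, 30, 2))
comes to about 17.4, which is in line with the estimate above.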


> Current process memory 85.6953 M
> Max process memory 84.9258 M
>
> (5) before calling KSPSolve()
>
> Current space PetscMalloc()ed 74.3354 M
> Max space PetscMalloced() 74.3355 M
> Current process memory 85.8711 M
> Max process memory 84.9258 M
>
> (6) after calling KSPSolve()
>

The question is what was malloc'd here. There is no way we could
tell without seeing the code and probably running it. I suggest
using
http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Sys/PetscMallocDump.html
to see what was allocated. The solvers tend not to allocate during
the solve, as that is slow. So I would be inclined to check user code first.
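
Differencing the "Current space PetscMalloc()ed" figures reported at the
seven checkpoints in this message shows where the growth is concentrated.
This is a minimal standalone sketch: the array and function names are
illustrative, and the values are copied from the output quoted in this
message.

```c
#include <assert.h>
#include <stddef.h>

/* "Current space PetscMalloc()ed" in M at checkpoints (1)..(7),
 * copied from the output quoted in this message. */
const double current_malloc_mb[7] = {
    0.075882, 16.8411, 16.868, 74.3354, 74.3354, 290.952, 0.331482
};

/* Increase in usage from checkpoint i-1 to checkpoint i (0-based, i >= 1). */
double jump_at(const double *usage, size_t i)
{
    assert(usage != NULL && i >= 1);
    return usage[i] - usage[i - 1];
}

/* 0-based index i (1 <= i < n) of the checkpoint with the largest
 * increase over its predecessor. Requires n >= 2. */
size_t largest_jump_index(const double *usage, size_t n)
{
    assert(usage != NULL && n >= 2);
    size_t best = 1;
    for (size_t i = 2; i < n; ++i) {
        if (jump_at(usage, i) > jump_at(usage, best)) {
            best = i;
        }
    }
    return best;
}
```

Applied to current_malloc_mb, the largest increase falls between checkpoints
(5) and (6), that is, across KSPSolve(): about 216.6 M, compared with about
57.5 M across KSPSetUp().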

   Matt


> Current space PetscMalloc()ed 290.952 M
> Max space PetscMalloced() 593.367 M
> Current process memory 306.852 M
> Max process memory 301.441 M
>
> (7) after destroying all objects
>
> Current space PetscMalloc()ed 0.331482 M
> Max space PetscMalloced() 593.367 M
> Current process memory 67.2539 M
> Max process memory 309.137 M
>
>
> So my question is whether, and why, I need so much memory (306.852 M) for
> such a small problem (about 4*10^4 degrees of freedom). Is this normal?
> Or is the run script I use to set up the solver unreasonable?
>
>
> Regards,
>
> Fande Kong,
>
> Department of Computer Science
> University of Colorado Boulder
>
>
> On Mon, May 27, 2013 at 9:48 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
>
>>
>>    There are several ways to monitor the memory usage. You can divide
>> them into two categories: those that monitor how much memory has been
>> malloced specifically by PETSc, and those that monitor how much is used
>> in total by the process.
>>
>> PetscMallocGetCurrentUsage() and PetscMallocGetMaximumUsage(), which only
>> work with the command line option -malloc, report how much PETSc has
>> malloced.
>>
>> PetscMemoryGetCurrentUsage() and PetscMemoryGetMaximumUsage() (call
>> PetscMemorySetGetMaximumUsage() immediately after PetscInitialize() for
>> this one to work) provide total memory usage.
>>
>> These are called on each process, so use an MPI_Reduce() to gather the
>> total memory across all processes to process 0 to print it out. I suggest
>> calling them after the mesh has been set up, then again immediately
>> before the XXXSolve() is called, and then after the XXXSolve() is called.
>>
>>    Please let us know if you have any difficulties.
>>
>>     As always we recommend you upgrade to PETSc 3.4
>>
>>     Barry
>>
>>
>>
>> On May 27, 2013, at 10:22 PM, Fande Kong <fande.kong at colorado.edu> wrote:
>>
>> > Hi all,
>> >
>> > How can I measure the memory usage of an application built on PETSc?
>> > I am solving linear elasticity equations with FGMRES preconditioned by
>> > a two-level method, that is, a multigrid method in which the additive
>> > Schwarz method is used on each level. The problem is solved on more
>> > than 1000 cores of a supercomputer. When the problem has about 60M
>> > degrees of freedom, the application runs correctly and produces
>> > correct results. But when the problem grows to 600M degrees of freedom,
>> > the application aborts and reports that there is not enough memory
>> > (the system administrator of the supercomputer told me that my
>> > application ran out of memory).
>> >
>> > Thus, I want to monitor the memory usage dynamically while the
>> > application is running. Are there any functions or strategies that
>> > could be used for this purpose?
>> >
>> > The error information is attached.
>> >
>> > Regards,
>> > --
>> > Fande Kong
>> > Department of Computer Science
>> > University of Colorado at Boulder
>> > <solid3dcube2.o1603352><configure and make log.zip>
>>
>>
>


-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener