[petsc-users] Memory Used When Reading petscrc

Jed Brown jed at jedbrown.org
Sun Nov 24 23:27:45 CST 2024


You're clearly doing almost all your allocation *not* using PetscMalloc (so not in a Vec or Mat). If you're managing your own mesh, you may be allocating a global-sized array on each rank instead of strictly using scalable data structures (i.e., always partitioned).
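For illustration, a minimal sketch of the partitioned approach (not taken from your code; the 1024x1024 grid size is an assumption): let a DMDA own the decomposition so each rank only ever allocates its local slab.

  #include <petscdmda.h>

  int main(int argc, char **argv)
  {
    DM       da;
    Vec      u;
    PetscInt nlocal;

    PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
    /* PETSc chooses the process grid; each rank owns a slab of the 1024x1024 mesh */
    PetscCall(DMDACreate2d(PETSC_COMM_WORLD, DM_BOUNDARY_NONE, DM_BOUNDARY_NONE,
                           DMDA_STENCIL_STAR, 1024, 1024, PETSC_DECIDE, PETSC_DECIDE,
                           1, 1, NULL, NULL, &da));
    PetscCall(DMSetUp(da));
    PetscCall(DMCreateGlobalVector(da, &u)); /* local size ~ (1024*1024)/P, never the whole grid */
    PetscCall(VecGetLocalSize(u, &nlocal));
    PetscCall(PetscSynchronizedPrintf(PETSC_COMM_WORLD, "local dofs: %" PetscInt_FMT "\n", nlocal));
    PetscCall(PetscSynchronizedFlush(PETSC_COMM_WORLD, PETSC_STDOUT));
    PetscCall(VecDestroy(&u));
    PetscCall(DMDestroy(&da));
    PetscCall(PetscFinalize());
    return 0;
  }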

My favorite tool for understanding memory use is heaptrack.

https://github.com/KDE/heaptrack
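Under MPI you can usually wrap each rank directly (assuming a standard heaptrack install), e.g.

  mpiexec -n 2 heaptrack ./your_app -your_options

which writes one profile per process; heaptrack_gui or heaptrack_print then attributes every allocation, PetscMalloc'ed or not, to a call stack.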

David Scott <d.scott at epcc.ed.ac.uk> writes:

> OK.
>
> I had started to wonder if that was the case. I'll do some further 
> investigation.
>
> Thanks,
>
> David
>
> On 22/11/2024 22:10, Matthew Knepley wrote:
>> On Fri, Nov 22, 2024 at 12:57 PM David Scott <d.scott at epcc.ed.ac.uk> 
>> wrote:
>>
>>     Matt,
>>
>>     Thanks for the quick response.
>>
>>     Yes 1) is trivially true.
>>
>>     With regard to 2), from the SLURM output:
>>     [0] Maximum memory PetscMalloc()ed 29552 maximum size of entire process 4312375296
>>     [1] Maximum memory PetscMalloc()ed 29552 maximum size of entire process 4311990272
>>     Yes, only about 29 KB was PetscMalloc'ed, but the total footprint was
>>     about 4 GB per process.
>>
>>     Looking at the output:
>>      mem0 =    16420864.000000000
>>      mem0 =    16117760.000000000
>>      mem1 =    4311490560.0000000
>>      mem1 =    4311826432.0000000
>>      mem2 =    4311490560.0000000
>>      mem2 =    4311826432.0000000
>>     mem0 is written after PetscInitialize.
>>     mem1 is written roughly halfway through the options being read.
>>     mem2 is written on completion of the options being read.
>>
>>     The code does very little other than read configuration options.
>>     Why is so much memory used?
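(For reference, a minimal C sketch of how such per-stage numbers are typically collected; the attached Fortran code is not reproduced here, so the option name -nx is a placeholder:)

  #include <petscsys.h>

  int main(int argc, char **argv)
  {
    PetscLogDouble rss, mal;
    PetscInt       nx = 0;
    PetscBool      set;

    PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
    PetscCall(PetscMemoryGetCurrentUsage(&rss)); /* whole-process resident set size */
    PetscCall(PetscMallocGetCurrentUsage(&mal)); /* bytes currently PetscMalloc'ed  */
    PetscCall(PetscPrintf(PETSC_COMM_WORLD, "mem0: rss=%g malloc=%g\n", rss, mal));

    PetscCall(PetscOptionsGetInt(NULL, NULL, "-nx", &nx, &set)); /* placeholder option */

    PetscCall(PetscMemoryGetCurrentUsage(&rss));
    PetscCall(PetscMallocGetCurrentUsage(&mal));
    PetscCall(PetscPrintf(PETSC_COMM_WORLD, "mem2: rss=%g malloc=%g\n", rss, mal));
    PetscCall(PetscFinalize());
    return 0;
  }

If the rss number jumps while the malloc number stays flat, the growth is coming from outside PetscMalloc, which is what the SLURM figures above indicate.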
>>
>>
>> This is not due to options processing, as that would fall under PETSc 
>> malloc allocations. I believe we are measuring this using RSS, which 
>> includes the binary, all shared libraries that are paged in, and 
>> stack/heap allocations. I think you are seeing the shared libraries 
>> come in. You might be able to see all the libraries that come in 
>> using strace.
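(A hedged aside: on Linux, something like

  strace -f -e trace=openat ./your_app 2>&1 | grep '\.so'

usually shows which shared objects get opened at startup, and /proc/<pid>/smaps breaks the RSS down by mapping, so you can see how much of the 4 GB is shared-library pages versus heap.)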
>>
>>   Thanks,
>>
>>      Matt
>>
>>     I do not understand what is going on, and I may have expressed
>>     myself badly, but I do have a problem: I cannot use anywhere
>>     near 128 processes on a node with 128 GB of RAM without getting
>>     an OOM error. (The code runs successfully on 32 processes but
>>     not on 64.)
>>
>>     Regards,
>>
>>     David
>>
>>     On 22/11/2024 16:53, Matthew Knepley wrote:
>>>     On Fri, Nov 22, 2024 at 11:36 AM David Scott
>>>     <d.scott at epcc.ed.ac.uk> wrote:
>>>
>>>         Hello,
>>>
>>>         I am using the options mechanism of PETSc to configure my
>>>         CFD code. I have introduced options describing the size of
>>>         the domain, etc. I have noticed that this consumes a lot of
>>>         memory, and that the amount of memory used scales linearly
>>>         with the number of MPI processes used. This restricts the
>>>         number of MPI processes that I can use.
>>>
>>>
>>>     There are two statements:
>>>
>>>     1) The memory scales linearly with P
>>>
>>>     2) This uses a lot of memory
>>>
>>>     Let's deal with 1) first. This seems to be trivially true. If I
>>>     want every process to have
>>>     access to a given option value, that option value must be in the
>>>     memory of every process.
>>>     The only alternative would be to communicate with some process in
>>>     order to get values.
>>>     Few codes seem to be willing to make this tradeoff, and we do not
>>>     offer it.
>>>
>>>     Now 2). Looking at the source, for each option we store
>>>     a PetscOptionItem, which I count as having size 37 bytes
>>>     (12 pointers/ints and a char). However, there is data behind
>>>     every pointer, like the name, help text, and (sometimes)
>>>     available values; I could see it being as large as 4 KB per
>>>     option. Suppose it is. If I had 256 options, that would be
>>>     1 MB. Is this a large amount of memory?
>>>
>>>     The way I read the SLURM output, 29 KB was malloc'ed. Is this
>>>     a large amount of memory?
>>>
>>>     I am trying to get an idea of the scale.
>>>
>>>       Thanks,
>>>
>>>           Matt
>>>
>>>         Is there anything that I can do about this or do I need to
>>>         configure my
>>>         code in a different way?
>>>
>>>         I have attached some code extracted from my application
>>>         which demonstrates this, along with the output from running
>>>         it on 2 MPI processes.
>>>
>>>         Best wishes,
>>>
>>>         David Scott
>>>
>>>
>>>
>>>     -- 
>>>     What most experimenters take for granted before they begin their
>>>     experiments is infinitely more interesting than any results to
>>>     which their experiments lead.
>>>     -- Norbert Wiener
>>>
>>>     https://www.cse.buffalo.edu/~knepley/
>>
>>
>>
>> -- 
>> What most experimenters take for granted before they begin their 
>> experiments is infinitely more interesting than any results to which 
>> their experiments lead.
>> -- Norbert Wiener
>>
>> https://www.cse.buffalo.edu/~knepley/

