[petsc-users] Memory Used When Reading petscrc
Fabian.Jakub
Fabian.Jakub at physik.uni-muenchen.de
Mon Nov 25 02:45:30 CST 2024
test_configuration_options.F90:l.55
max_msg_length is quite large.... I guess the pow() is a typo.
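
A minimal sketch of the failure mode I suspect (hypothetical C, not the
attached Fortran): a message-buffer length computed with pow() instead of a
small integer constant gets allocated on every rank outside PetscMalloc(), so
it inflates the process size while the PetscMalloc() numbers stay tiny.

  /* Hypothetical illustration only -- not the attached test code. */
  #include <math.h>
  #include <stdio.h>
  #include <stdlib.h>

  int main(void)
  {
    /* perhaps meant to be something small; pow() makes it ~1e9 */
    size_t max_msg_length = (size_t)pow(10, 9);
    char  *msg = malloc(max_msg_length); /* ~1 GB per rank, invisible to PetscMalloc() */
    printf("allocated %zu bytes\n", max_msg_length);
    free(msg);
    return 0;
  }
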
Cheers,
Fabian
On 11/25/24 09:32, David Scott wrote:
> I'll have a look at heaptrack.
>
> The code that I am looking at the moment does not create a mesh. All it
> does is read a petscrc file.
>
> Thanks,
>
> David
>
> On 25/11/2024 05:27, Jed Brown wrote:
>>
>> You're clearly doing almost all your allocation *not* using
>> PetscMalloc (so not in a Vec or Mat). If you're managing your own mesh
>> yourself, you might be allocating a global amount on each rank,
>> instead of strictly using scalable data structures (i.e., always
>> partitioned).
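>>
>> A minimal sketch of that distinction (an assumed example, not the code in
>> question): an unpartitioned array costs the global size on every rank,
>> while a partitioned PETSc Vec only holds the local share.
>>
>>   #include <stdlib.h>
>>   #include <petscvec.h>
>>
>>   int main(int argc, char **argv)
>>   {
>>     PetscInt Nglobal = 1000000;
>>     double  *u_global;
>>     Vec      v;
>>     PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
>>     /* unscalable: every rank allocates the full global problem, so total
>>        memory across the job grows linearly with the number of ranks */
>>     u_global = (double *)malloc((size_t)Nglobal * sizeof(double));
>>     /* scalable: each rank holds roughly Nglobal/P entries */
>>     PetscCall(VecCreateMPI(PETSC_COMM_WORLD, PETSC_DECIDE, Nglobal, &v));
>>     PetscCall(VecDestroy(&v));
>>     free(u_global);
>>     PetscCall(PetscFinalize());
>>     return 0;
>>   }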
>>
>> My favorite tool for understanding memory use is heaptrack.
>>
>> https://github.com/KDE/heaptrack
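>>
>> (If useful: heaptrack can usually be run per rank with something like
>> "mpirun -n 2 heaptrack ./your_app", giving one trace file per process that
>> can then be inspected with heaptrack_gui or "heaptrack --analyze <file>";
>> the exact invocation depends on your MPI launcher and heaptrack version.)
>>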
>> David Scott <d.scott at epcc.ed.ac.uk> writes:
>>
>>> OK.
>>>
>>> I had started to wonder if that was the case. I'll do some further
>>> investigation.
>>>
>>> Thanks,
>>>
>>> David
>>>
>>> On 22/11/2024 22:10, Matthew Knepley wrote:
>>>> On Fri, Nov 22, 2024 at 12:57 PM David Scott <d.scott at epcc.ed.ac.uk>
>>>> wrote:
>>>>
>>>> Matt,
>>>>
>>>> Thanks for the quick response.
>>>>
>>>> Yes 1) is trivially true.
>>>>
>>>> With regard to 2), from the SLURM output:
>>>> [0] Maximum memory PetscMalloc()ed 29552 maximum size of entire process 4312375296
>>>> [1] Maximum memory PetscMalloc()ed 29552 maximum size of entire process 4311990272
>>>> Yes, only about 29 KB was malloced, but the total figure was about 4 GB
>>>> per process.
>>>>
>>>> Looking at
>>>> mem0 = 16420864.000000000
>>>> mem0 = 16117760.000000000
>>>> mem1 = 4311490560.0000000
>>>> mem1 = 4311826432.0000000
>>>> mem2 = 4311490560.0000000
>>>> mem2 = 4311826432.0000000
>>>> mem0 is written after PetscInitialize.
>>>> mem1 is written roughly halfway through the options being read.
>>>> mem2 is written on completion of the options being read.
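>>>>
>>>> (For reference, a guess at how numbers like these are produced -- the
>>>> attached code presumably does something similar: PetscMemoryGetCurrentUsage()
>>>> reports the whole process size, while PetscMallocGetCurrentUsage() reports
>>>> only what went through PetscMalloc().)
>>>>
>>>>   #include <petscsys.h>
>>>>
>>>>   int main(int argc, char **argv)
>>>>   {
>>>>     PetscLogDouble rss, mal;
>>>>     PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
>>>>     PetscCall(PetscMemoryGetCurrentUsage(&rss)); /* whole-process size */
>>>>     PetscCall(PetscMallocGetCurrentUsage(&mal)); /* PetscMalloc()ed only */
>>>>     PetscCall(PetscPrintf(PETSC_COMM_SELF, "rss = %g  petscmalloc = %g\n",
>>>>                           (double)rss, (double)mal));
>>>>     PetscCall(PetscFinalize());
>>>>     return 0;
>>>>   }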
>>>>
>>>> The code does very little other than read configuration options.
>>>> Why is so much memory used?
>>>>
>>>>
>>>> This is not due to options processing, as that would fall under PetscMalloc()
>>>> allocations. I believe we are measuring this using RSS, which includes the
>>>> binary, all shared libraries that are paged in, and stack/heap allocations.
>>>> I think you are seeing the shared libraries come in. You might be able to
>>>> see all the libraries that come in using strace.
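>>>>
>>>> (If you want to confirm that: comparing the RSS against /proc/<pid>/smaps
>>>> on the compute node, or running the binary under "strace -e trace=openat",
>>>> shows which mappings and shared libraries are actually involved; both are
>>>> generic Linux tools, nothing PETSc-specific.)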
>>>>
>>>> Thanks,
>>>>
>>>> Matt
>>>>
>>>> I do not understand what is going on, and I may have expressed myself
>>>> badly, but I do have a problem: I certainly cannot use anywhere near 128
>>>> processes on a node with 128 GB of RAM before I get an OOM error. (The
>>>> code runs successfully on 32 processes but not on 64.)
>>>>
>>>> Regards,
>>>>
>>>> David
>>>>
>>>> On 22/11/2024 16:53, Matthew Knepley wrote:
>>>>> On Fri, Nov 22, 2024 at 11:36 AM David Scott
>>>>> <d.scott at epcc.ed.ac.uk> wrote:
>>>>>
>>>>> Hello,
>>>>>
>>>>> I am using the options mechanism of PETSc to configure my CFD code. I
>>>>> have introduced options describing the size of the domain etc. I have
>>>>> noticed that this consumes a lot of memory. I have found that the amount
>>>>> of memory used scales linearly with the number of MPI processes used.
>>>>> This restricts the number of MPI processes that I can use.
>>>>>
>>>>>
>>>>> There are two statements:
>>>>>
>>>>> 1) The memory scales linearly with P
>>>>>
>>>>> 2) This uses a lot of memory
>>>>>
>>>>> Let's deal with 1) first. This seems to be trivially true. If I want
>>>>> every process to have access to a given option value, that option value
>>>>> must be in the memory of every process. The only alternative would be to
>>>>> communicate with some process in order to get values. Few codes seem to
>>>>> be willing to make this tradeoff, and we do not offer it.
>>>>>
>>>>> Now 2). Looking at the source, for each option we store a PetscOptionItem,
>>>>> which I count as having size 37 bytes (12 pointers/ints and a char).
>>>>> However, there is data behind every pointer, like the name, help text, and
>>>>> available values (sometimes); I could see it being as large as 4K. Suppose
>>>>> it is. If I had 256 options, that would be 1M. Is this a large amount of
>>>>> memory?
>>>>>
>>>>> The way I read the SLURM output, 29K was malloced. Is this a large amount
>>>>> of memory?
>>>>>
>>>>> I am trying to get an idea of the scale.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Matt
>>>>>
>>>>> Is there anything that I can do about this or do I need to configure my
>>>>> code in a different way?
>>>>>
>>>>> I have attached some code extracted from my application which
>>>>> demonstrates this, along with the output from running it on 2 MPI
>>>>> processes.
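>>>>>
>>>>> (Roughly the pattern in use, sketched in C for brevity -- the attached
>>>>> code is Fortran and reads many more options, so this is only illustrative
>>>>> and the option names here are made up: the petscrc file is read once at
>>>>> PetscInitialize() and the values are then queried on every rank.)
>>>>>
>>>>>   #include <petscsys.h>
>>>>>
>>>>>   int main(int argc, char **argv)
>>>>>   {
>>>>>     PetscInt  nx = 0;
>>>>>     PetscReal lx = 0.0;
>>>>>     PetscBool set;
>>>>>     PetscCall(PetscInitialize(&argc, &argv, "petscrc", NULL)); /* e.g. -nx 128, -lx 1.0 */
>>>>>     PetscCall(PetscOptionsGetInt(NULL, NULL, "-nx", &nx, &set));
>>>>>     PetscCall(PetscOptionsGetReal(NULL, NULL, "-lx", &lx, &set));
>>>>>     PetscCall(PetscFinalize());
>>>>>     return 0;
>>>>>   }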
>>>>>
>>>>> Best wishes,
>>>>>
>>>>> David Scott
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> What most experimenters take for granted before they begin their
>>>>> experiments is infinitely more interesting than any results to
>>>>> which their experiments lead.
>>>>> -- Norbert Wiener
>>>>>
>>>>>
>>>>> https://www.cse.buffalo.edu/~knepley/
>>>>
>>>>
>>>> --
>>>> What most experimenters take for granted before they begin their
>>>> experiments is infinitely more interesting than any results to which
>>>> their experiments lead.
>>>> -- Norbert Wiener
>>>>
>>>> https://www.cse.buffalo.edu/~knepley/
>