[petsc-users] Memory Used When Reading petscrc

David Scott d.scott at epcc.ed.ac.uk
Mon Nov 25 02:32:19 CST 2024


I'll have a look at heaptrack.

The code that I am currently looking at does not create a mesh. All it
does is read a petscrc file.
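
For reference, here is a minimal sketch (in C) of the kind of program I mean.
The option name -domain_length is just a placeholder, and I am assuming
PetscMemoryGetCurrentUsage() as the source of the memory figures quoted
further down; the actual attached code may differ.

#include <petscsys.h>

int main(int argc, char **argv)
{
  PetscReal      length = 1.0;
  PetscLogDouble mem0, mem2;

  PetscCall(PetscInitialize(&argc, &argv, "petscrc", NULL)); /* read options from ./petscrc */
  PetscCall(PetscMemoryGetCurrentUsage(&mem0));              /* process size after PetscInitialize */

  PetscCall(PetscOptionsGetReal(NULL, NULL, "-domain_length", &length, NULL));
  /* ... further PetscOptionsGet* calls ... */

  PetscCall(PetscMemoryGetCurrentUsage(&mem2));              /* process size after reading options */
  PetscCall(PetscPrintf(PETSC_COMM_WORLD, "mem0 = %f  mem2 = %f\n", mem0, mem2));
  PetscCall(PetscFinalize());
  return 0;
}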

Thanks,

David

On 25/11/2024 05:27, Jed Brown wrote:
>
> You're clearly doing almost all of your allocation *not* using PetscMalloc (so not in a Vec or Mat). If you're managing your mesh yourself, you might be allocating a global amount of data on each rank instead of strictly using scalable data structures (i.e., always partitioned).
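>
> As a rough sketch of that distinction (hypothetical global size N, and plain malloc() since these allocations would be outside PetscMalloc()):
>
> #include <petscsys.h>
> #include <stdlib.h>
>
> int main(int argc, char **argv)
> {
>   PetscInt N = 100000000, nlocal = PETSC_DECIDE;
>   double  *not_scalable, *partitioned;
>
>   PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
>
>   /* Not scalable: every rank holds all N entries, so the memory used on a
>      node grows linearly with the number of ranks placed on that node. */
>   not_scalable = malloc((size_t)N * sizeof(double));
>
>   /* Scalable: each rank holds only its own share of the N entries. */
>   PetscCall(PetscSplitOwnership(PETSC_COMM_WORLD, &nlocal, &N));
>   partitioned = malloc((size_t)nlocal * sizeof(double));
>
>   free(partitioned);
>   free(not_scalable);
>   PetscCall(PetscFinalize());
>   return 0;
> }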
>
> My favorite tool for understanding memory use is heaptrack.
>
> https://github.com/KDE/heaptrack
>
> David Scott <d.scott at epcc.ed.ac.uk> writes:
>
>> OK.
>>
>> I had started to wonder if that was the case. I'll do some further
>> investigation.
>>
>> Thanks,
>>
>> David
>>
>> On 22/11/2024 22:10, Matthew Knepley wrote:
>>> On Fri, Nov 22, 2024 at 12:57 PM David Scott <d.scott at epcc.ed.ac.uk>
>>> wrote:
>>>
>>>      Matt,
>>>
>>>      Thanks for the quick response.
>>>
>>>      Yes, 1) is trivially true.
>>>
>>>      With regard to 2), from the SLURM output:
>>>      [0] Maximum memory PetscMalloc()ed 29552 maximum size of entire process 4312375296
>>>      [1] Maximum memory PetscMalloc()ed 29552 maximum size of entire process 4311990272
>>>      Yes, only about 29 KB was PetscMalloc()ed, but the total size was about 4 GB per process.
>>>
>>>      Looking at
>>>       mem0 =    16420864.000000000
>>>       mem0 =    16117760.000000000
>>>       mem1 =    4311490560.0000000
>>>       mem1 =    4311826432.0000000
>>>       mem2 =    4311490560.0000000
>>>       mem2 =    4311826432.0000000
>>>      mem0 is written after PetscInitialize.
>>>      mem1 is written roughly halfway through the options being read.
>>>      mem2 is written on completion of the options being read.
>>>
>>>      The code does very little other than read configuration options.
>>>      Why is so much memory used?
>>>
>>>
>>> This is not due to options processing, as that would fall under PetscMalloc()
>>> allocations. I believe we are measuring this using RSS, which includes the
>>> binary, all shared libraries that are paged in, and stack/heap allocations.
>>> I think you are seeing the shared libraries come in. You might be able to
>>> see all the libraries that are loaded using strace.
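>>>
>>> One way to see that gap is to print both numbers right after PetscInitialize();
>>> a minimal sketch (PetscMemoryGetCurrentUsage() reports the resident set size,
>>> PetscMallocGetCurrentUsage() only what was obtained through PetscMalloc()):
>>>
>>> #include <petscsys.h>
>>>
>>> int main(int argc, char **argv)
>>> {
>>>   PetscLogDouble rss, mall;
>>>
>>>   PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
>>>   PetscCall(PetscMemoryGetCurrentUsage(&rss));   /* whole process: binary, shared libraries, heap, stack */
>>>   PetscCall(PetscMallocGetCurrentUsage(&mall));  /* bytes currently obtained through PetscMalloc()       */
>>>   PetscCall(PetscPrintf(PETSC_COMM_WORLD, "RSS %.0f bytes, PetscMalloc()ed %.0f bytes\n", rss, mall));
>>>   PetscCall(PetscFinalize());
>>>   return 0;
>>> }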
>>>
>>>    Thanks,
>>>
>>>       Matt
>>>
>>>      I do not understand what is going on, and I may have expressed
>>>      myself badly, but I do have a problem: I certainly cannot use
>>>      anywhere near 128 processes on a node with 128 GB of RAM without
>>>      getting an OOM error. (The code runs successfully on 32 processes
>>>      but not on 64.)
>>>
>>>      Regards,
>>>
>>>      David
>>>
>>>      On 22/11/2024 16:53, Matthew Knepley wrote:
>>>>      On Fri, Nov 22, 2024 at 11:36 AM David Scott
>>>>      <d.scott at epcc.ed.ac.uk> wrote:
>>>>
>>>>          Hello,
>>>>
>>>>          I am using the options mechanism of PETSc to configure my CFD
>>>>          code. I
>>>>          have introduced options describing the size of the domain
>>>>          etc. I have
>>>>          noticed that this consumes a lot of memory. I have found that
>>>>          the amount
>>>>          of memory used scales linearly with the number of MPI
>>>>          processes used.
>>>>          This restricts the number of MPI processes that I can use.
>>>>
>>>>
>>>>      There are two statements:
>>>>
>>>>      1) The memory scales linearly with P
>>>>
>>>>      2) This uses a lot of memory
>>>>
>>>>      Let's deal with 1) first. This seems to be trivially true. If I
>>>>      want every process to have
>>>>      access to a given option value, that option value must be in the
>>>>      memory of every process.
>>>>      The only alternative would be to communicate with some process in
>>>>      order to get values.
>>>>      Few codes seem to be willing to make this tradeoff, and we do not
>>>>      offer it.
>>>>
>>>>      Now 2). Looking at the source, for each option we store a
>>>>      PetscOptionItem, which I count as having size 37 bytes (12
>>>>      pointers/ints and a char). However, there is data behind every
>>>>      pointer, like the name, help text, and available values (sometimes),
>>>>      so I could see it being as large as 4K. Suppose it is. If I had 256
>>>>      options, that would be 1M. Is this a large amount of memory?
>>>>
>>>>      The way I read the SLURM output, 29 KB was malloced. Is this a
>>>>      large amount of memory?
>>>>
>>>>      I am trying to get an idea of the scale.
>>>>
>>>>        Thanks,
>>>>
>>>>            Matt
>>>>
>>>>          Is there anything that I can do about this, or do I need to
>>>>          configure my code in a different way?
>>>>
>>>>          I have attached some code extracted from my application which
>>>>          demonstrates this, along with the output from running it on
>>>>          2 MPI processes.
>>>>
>>>>          Best wishes,
>>>>
>>>>          David Scott
>>>>
>>>>
>>>>
>>>>      --
>>>>      What most experimenters take for granted before they begin their
>>>>      experiments is infinitely more interesting than any results to
>>>>      which their experiments lead.
>>>>      -- Norbert Wiener
>>>>
>>>>      https://www.cse.buffalo.edu/~knepley/
>>>
>>>
>>> --
>>> What most experimenters take for granted before they begin their
>>> experiments is infinitely more interesting than any results to which
>>> their experiments lead.
>>> -- Norbert Wiener
>>>
>>> https://www.cse.buffalo.edu/~knepley/


