[petsc-users] Error - Out of memory. This could be due to allocating too large an object or bleeding by not properly ...

Barry Smith bsmith at mcs.anl.gov
Tue Mar 1 23:00:56 CST 2016


> On Mar 1, 2016, at 10:19 PM, TAY wee-beng <zonexo at gmail.com> wrote:
> 
> 
> On 26/2/2016 9:21 PM, Barry Smith wrote:
>>> On Feb 26, 2016, at 1:14 AM, TAY wee-beng <zonexo at gmail.com>
>>>  wrote:
>>> 
>>> 
>>> On 26/2/2016 1:56 AM, Barry Smith wrote:
>>> 
>>>>    Run a much smaller problem for a few time steps with the option -malloc_dump, making sure you free all the objects at the end. This will print all the memory that was not freed and hopefully help you track down which objects you forgot to free.
>>>> 
>>>>   Barry
>>>> 
>>> Hi,
>>> 
>>> I ran a smaller problem and lots of things are shown in the log. How can I know exactly which ones are not freed from memory?
>>> 
>>    Everything in the log represents unfreed memory. You need to hunt through all the objects you create and make sure you destroy all of them.
>> 
>>   Barry
>> 
> Hi,
> 
> I have some questions.
> 
> [0]Total space allocated 2274656 bytes
> [ 0]16 bytes PetscStrallocpy() line 188 in /home/wtay/Codes/petsc-3.6.3/src/sys/utils/str.c
> [ 0]624 bytes ISLocalToGlobalMappingCreate() line 270 in /home/wtay/Codes/petsc-3.6.3/src/vec/is/utils/isltog.c
> [ 0]16 bytes VecScatterCreateCommon_PtoS() line 2655 in /home/wtay/Codes/petsc-3.6.3/src/vec/vec/utils/vpscat.c
> [ 0]16 bytes VecScatterCreateCommon_PtoS() line 2654 in /home/wtay/Codes/petsc-3.6.3/src/vec/vec/utils/vpscat.c
> [ 0]1440 bytes VecScatterCreate_PtoS() line 2463 in /home/wtay/Codes/petsc-3.6.3/src/vec/vec/utils/vpscat.c
> [ 0]1440 bytes VecScatterCreate_PtoS() line 2462 in /home/wtay/Codes/
> 
> 1. What does the [0] mean? I get output from [0] to [23].

  It is the rank of the MPI process reporting the memory usage.
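
  With 24 processes the dump gets long, so it can help to group the entries before hunting for the missing destroy calls. What follows is a minimal sketch and not part of PETSc: it assumes every entry has the shape shown in the excerpt above ("[rank]N bytes Function() line L in path"), and the name summarize_dump is hypothetical.

```python
import re
from collections import Counter

# One dump entry, e.g.
# "[ 0]624 bytes ISLocalToGlobalMappingCreate() line 270 in /path/to/isltog.c"
# The trailing path is ignored, so entries with a truncated path still match.
ENTRY = re.compile(r"^\[\s*(\d+)\]\s*(\d+) bytes (\w+)\(\) line (\d+)")


def summarize_dump(lines):
    """Group unfreed allocations by MPI rank and by allocating call site."""
    bytes_per_rank = Counter()
    bytes_per_site = Counter()
    count_per_site = Counter()
    for raw in lines:
        line = raw.lstrip("> ").strip()  # tolerate mail quoting and padding
        match = ENTRY.match(line)
        if match is None:
            continue  # e.g. the "Total space allocated" header line
        rank = int(match.group(1))
        nbytes = int(match.group(2))
        site = (match.group(3), int(match.group(4)))  # (function, source line)
        bytes_per_rank[rank] += nbytes
        bytes_per_site[site] += nbytes
        count_per_site[site] += 1
    return bytes_per_rank, bytes_per_site, count_per_site


# Entries copied from the log excerpt quoted in this message.
SAMPLE = """\
[0]Total space allocated 2274656 bytes
[ 0]16 bytes PetscStrallocpy() line 188 in /home/wtay/Codes/petsc-3.6.3/src/sys/utils/str.c
[ 0]624 bytes ISLocalToGlobalMappingCreate() line 270 in /home/wtay/Codes/petsc-3.6.3/src/vec/is/utils/isltog.c
[ 0]16 bytes VecScatterCreateCommon_PtoS() line 2655 in /home/wtay/Codes/petsc-3.6.3/src/vec/vec/utils/vpscat.c
[ 0]16 bytes VecScatterCreateCommon_PtoS() line 2654 in /home/wtay/Codes/petsc-3.6.3/src/vec/vec/utils/vpscat.c
[ 0]1440 bytes VecScatterCreate_PtoS() line 2463 in /home/wtay/Codes/petsc-3.6.3/src/vec/vec/utils/vpscat.c
"""

bytes_per_rank, bytes_per_site, count_per_site = summarize_dump(SAMPLE.splitlines())

# Largest call sites first: these are the first places to look.
for (func, lineno), total in bytes_per_site.most_common():
    blocks = count_per_site[(func, lineno)]
    print(f"{total:>8} bytes in {blocks} block(s): {func}() line {lineno}")
```

  One way to use it: save the -malloc_dump output from a run of a few time steps and from a run of twice as many, then compare the block counts per call site. A call site whose count grows with the number of steps probably belongs to an object that is created every step and never destroyed.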
> 
> 2. I defined a variable globally:
> 
> DM  da_cu_types
> 
> Then I use at each time step:
> 
> call DMDACreate3d(MPI_COMM_WORLD,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DMDA_STENCIL_STAR,(IIB_I_end_domain(1) - IIB_I_sta_domain(1) + 1),(IIB_I_end_domain(2) - IIB_I_sta_domain(2) + 1),&
> (IIB_I_end_domain(3) - IIB_I_sta_domain(3) + 1),PETSC_DECIDE,PETSC_DECIDE,PETSC_DECIDE,1,stencil_width_IIB,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,da_cu_types,ierr)
> 
> call DMDAGetInfo(da_cu_types,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,num_procs_xyz_IIB(1),num_procs_xyz_IIB(2),num_procs_xyz_IIB(3),&
> PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,ierr)
> 
> call DMDAGetCorners(da_cu_types,start_ijk_IIB(1),start_ijk_IIB(2),start_ijk_IIB(3),width_ijk_IIB(1),width_ijk_IIB(2),width_ijk_IIB(3),ierr)
> 
> call DMDAGetGhostCorners(da_cu_types,start_ijk_ghost_IIB(1),start_ijk_ghost_IIB(2),start_ijk_ghost_IIB(3),width_ijk_ghost_IIB(1),width_ijk_ghost_IIB(2),width_ijk_ghost_IIB(3),ierr)
> 
> The purpose is just to get the starting and ending indices for each CPU partition. This is done every time step for a moving-body case, since IIB_I_sta_domain and IIB_I_end_domain change.
> 
> After getting all the info, must I call DMDestroy(da_cu_types,ierr)?

  Yes, otherwise you will get more and more DMs taking up memory.
> 
> Is it possible to update and use the new IIB_I_sta_domain and IIB_I_end_domain without the need to create and destroy the DM? I thought that may save some time since it's done at every time step.

  If the information you pass to DMDACreate3d() changes, then you need to create the DM again.

  Barry

> 
> Thanks
> 
>> 
>>> Is this info helpful? Or should I run in a single core?
>>> 
>>> Thanks
>>> 
>>>>> On Feb 25, 2016, at 12:33 AM, TAY wee-beng <zonexo at gmail.com>
>>>>>  wrote:
>>>>> 
>>>>> Hi,
>>>>> 
>>>>> I ran the code and it hangs again. However, adding -malloc_test doesn't seem to do anything. The output (attached) is the same without it.
>>>>> 
>>>>> Wonder if there's anything else I can do.
>>>>> Thank you
>>>>> 
>>>>> Yours sincerely,
>>>>> 
>>>>> TAY wee-beng
>>>>> 
>>>>> On 24/2/2016 11:33 PM, Matthew Knepley wrote:
>>>>> 
>>>>>> On Wed, Feb 24, 2016 at 9:28 AM, TAY wee-beng <zonexo at gmail.com>
>>>>>>  wrote:
>>>>>> 
>>>>>> On 24/2/2016 11:18 PM, Matthew Knepley wrote:
>>>>>> 
>>>>>>> On Wed, Feb 24, 2016 at 9:16 AM, TAY wee-beng <zonexo at gmail.com>
>>>>>>>  wrote:
>>>>>>> 
>>>>>>> On 24/2/2016 9:12 PM, Matthew Knepley wrote:
>>>>>>> 
>>>>>>>> On Wed, Feb 24, 2016 at 1:54 AM, TAY wee-beng <zonexo at gmail.com>
>>>>>>>>  wrote:
>>>>>>>> 
>>>>>>>> On 24/2/2016 10:28 AM, Matthew Knepley wrote:
>>>>>>>> 
>>>>>>>>> On Tue, Feb 23, 2016 at 7:50 PM, TAY wee-beng <zonexo at gmail.com>
>>>>>>>>>  wrote:
>>>>>>>>> Hi,
>>>>>>>>> 
>>>>>>>>> I got this error (also attached, full) when running my code. It happens after a few thousand time steps.
>>>>>>>>> 
>>>>>>>>> The strange thing is that for 2 different clusters, it stops at 2 different time steps.
>>>>>>>>> 
>>>>>>>>> I wonder if it's related to DM since this happens after I added DM into my code.
>>>>>>>>> 
>>>>>>>>> In this case, how can I find the error? I'm thinking valgrind may take very long and give too many false errors.
>>>>>>>>> 
>>>>>>>>> It is very easy to find leaks. You just run a few steps with -malloc_dump and see what is left over.
>>>>>>>>> 
>>>>>>>>>    Matt
>>>>>>>>> 
>>>>>>>> Hi Matt,
>>>>>>>> 
>>>>>>>> Do you mean running my a.out with -malloc_dump and stopping after a few time steps?
>>>>>>>> 
>>>>>>>> What and how should I "see" then?
>>>>>>>> 
>>>>>>>> -malloc_dump outputs all unfreed memory to the screen after PetscFinalize(), so you should see the leak.
>>>>>>>> I guess it might be possible to keep creating things that you free all at once at the end, but that is less likely.
>>>>>>>> 
>>>>>>>>    Matt
>>>>>>>>  
>>>>>>>> 
>>>>>>> Hi,
>>>>>>> 
>>>>>>> I got the output. I have zipped it since it's rather big. It seems to be from DM routines, but can you help me find where the error is from?
>>>>>>> 
>>>>>>> It's really hard to tell by looking at it. What I do is remove things until there is no leak, then progressively
>>>>>>> put things back in until I have the culprit. Then you can think about what is not destroyed.
>>>>>>> 
>>>>>>>   Matt
>>>>>>> 
>>>>>> Ok so let me get this clear. When it shows:
>>>>>> 
>>>>>> [21]Total space allocated 1728961264 bytes
>>>>>> [21]1861664 bytes MatCheckCompressedRow() line 60 in /home/wtay/Codes/petsc-3.6.3/src/mat/utils/compressedrow.c
>>>>>> [21]16 bytes PetscStrallocpy() line 188 in /home/wtay/Codes/petsc-3.6.3/src/sys/utils/str.c
>>>>>> [21]624 bytes ISLocalToGlobalMappingCreate() line 270 in /home/wtay/Codes
>>>>>> 
>>>>>> ....
>>>>>> 
>>>>>> Does it mean that it's simply allocating space, i.e. normal? Or does it show that there's a memory leak, i.e. an error?
>>>>>> 
>>>>>> I gave the wrong option. That dumps everything. Let's just look at the leaks with -malloc_test.
>>>>>> 
>>>>>>  Sorry about that,
>>>>>> 
>>>>>>     Matt
>>>>>>  If it's an error, should I zoom in and debug around this time at this region?
>>>>>> 
>>>>>> Thanks
>>>>>> 
>>>>>>>  Thanks.
>>>>>>> 
>>>>>>>>  
>>>>>>>> 
>>>>>>>>> -- 
>>>>>>>>> Thank you
>>>>>>>>> 
>>>>>>>>> Yours sincerely,
>>>>>>>>> 
>>>>>>>>> TAY wee-beng
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> -- 
>>>>>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
>>>>>>>>> -- Norbert Wiener
>>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>> 
>>>>> <ibm2d.err>
>>>>> 
>>> <log.7z>
>>> 
> 


