[petsc-users] Error - Out of memory. This could be due to allocating too large an object or bleeding by not properly ...
Barry Smith
bsmith at mcs.anl.gov
Tue Mar 1 23:00:56 CST 2016
> On Mar 1, 2016, at 10:19 PM, TAY wee-beng <zonexo at gmail.com> wrote:
>
>
> On 26/2/2016 9:21 PM, Barry Smith wrote:
>>> On Feb 26, 2016, at 1:14 AM, TAY wee-beng <zonexo at gmail.com>
>>> wrote:
>>>
>>>
>>> On 26/2/2016 1:56 AM, Barry Smith wrote:
>>>
>>>> Run a much smaller problem for a few time steps with the option -malloc_dump, making sure you free all the objects at the end. This will print all the memory that was not freed and hopefully help you track down which objects you forgot to free.
>>>>
>>>> Barry
>>>>
>>> Hi,
>>>
>>> I ran a smaller problem and lots of things are shown in the log. How can I know exactly which ones are not freed from memory?
>>>
>> Everything in the log represents unfreed memory. You need to hunt through all the objects you create and make sure you destroy all of them.
>>
>> Barry
>>
> Hi,
>
> I have some questions.
>
> [0]Total space allocated 2274656 bytes
> [ 0]16 bytes PetscStrallocpy() line 188 in /home/wtay/Codes/petsc-3.6.3/src/sys/utils/str.c
> [ 0]624 bytes ISLocalToGlobalMappingCreate() line 270 in /home/wtay/Codes/petsc-3.6.3/src/vec/is/utils/isltog.c
> [ 0]16 bytes VecScatterCreateCommon_PtoS() line 2655 in /home/wtay/Codes/petsc-3.6.3/src/vec/vec/utils/vpscat.c
> [ 0]16 bytes VecScatterCreateCommon_PtoS() line 2654 in /home/wtay/Codes/petsc-3.6.3/src/vec/vec/utils/vpscat.c
> [ 0]1440 bytes VecScatterCreate_PtoS() line 2463 in /home/wtay/Codes/petsc-3.6.3/src/vec/vec/utils/vpscat.c
> [ 0]1440 bytes VecScatterCreate_PtoS() line 2462 in /home/wtay/Codes/
>
> 1. What does the [0] mean? I get from [0] to [23].
It is the rank of the MPI process reporting the memory usage, so [0] to [23] means 24 processes reported.
>
> 2. I defined a variable globally:
>
> DM da_cu_types
>
> Then I use at each time step:
>
> call DMDACreate3d(MPI_COMM_WORLD,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DMDA_STENCIL_STAR,(IIB_I_end_domain(1) - IIB_I_sta_domain(1) + 1),(IIB_I_end_domain(2) - IIB_I_sta_domain(2) + 1),&
>
> (IIB_I_end_domain(3) - IIB_I_sta_domain(3) + 1),PETSC_DECIDE,PETSC_DECIDE,PETSC_DECIDE,1,stencil_width_IIB,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,da_cu_types,ierr)
>
> call DMDAGetInfo(da_cu_types,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,num_procs_xyz_IIB(1),num_procs_xyz_IIB(2),num_procs_xyz_IIB(3),&
>
> PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,ierr)
>
> call DMDAGetCorners(da_cu_types,start_ijk_IIB(1),start_ijk_IIB(2),start_ijk_IIB(3),width_ijk_IIB(1),width_ijk_IIB(2),width_ijk_IIB(3),ierr)
>
> call DMDAGetGhostCorners(da_cu_types,start_ijk_ghost_IIB(1),start_ijk_ghost_IIB(2),start_ijk_ghost_IIB(3),width_ijk_ghost_IIB(1),width_ijk_ghost_IIB(2),width_ijk_ghost_IIB(3),ierr)
>
> The purpose is just to get the starting and ending indices for each CPU partition. This is done every time step for a moving-body case, since IIB_I_sta_domain and IIB_I_end_domain change.
>
> After getting all the info, must I call DMDestroy(da_cu_types,ierr)?
Yes, otherwise every time step leaves another DM behind and they take up more and more memory.
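
One way to confirm that an object is leaked once per time step is to compare the -malloc_dump output of two runs of different length: allocation sites whose unfreed count grows with the number of steps are the per-step leaks. A rough sketch in Python follows, assuming the same log format as above. The helper names are hypothetical, and the two sample dumps are invented for illustration (real lines from your log, repeated to mimic a 2-step and a 4-step run); they are not actual output.

```python
import re
from collections import Counter

# Same line format as the dump quoted in this thread.
SITE = re.compile(r"^\[\s*(\d+)\]\s*\d+ bytes (\w+)\(\) line (\d+) in \S+")

def site_counts(lines, rank=0):
    """Count unfreed allocations per (function, source line) for one MPI rank."""
    counts = Counter()
    for line in lines:
        m = SITE.match(line.strip())
        if m and int(m.group(1)) == rank:
            counts[(m.group(2), int(m.group(3)))] += 1
    return counts

def growing_sites(short_run, long_run, rank=0):
    """Return {site: (count_short, count_long)} for sites that grew."""
    a = site_counts(short_run, rank)
    b = site_counts(long_run, rank)
    return {site: (a[site], n) for site, n in b.items() if n > a[site]}

# Hypothetical inputs: one allocation made once, one repeated every time step.
ONCE = "[ 0]16 bytes PetscStrallocpy() line 188 in /home/wtay/Codes/petsc-3.6.3/src/sys/utils/str.c"
STEP = "[ 0]624 bytes ISLocalToGlobalMappingCreate() line 270 in /home/wtay/Codes/petsc-3.6.3/src/vec/is/utils/isltog.c"
short_run = [ONCE] + [STEP] * 2   # pretend 2 time steps
long_run = [ONCE] + [STEP] * 4    # pretend 4 time steps

if __name__ == "__main__":
    for (func, line), (n_short, n_long) in growing_sites(short_run, long_run).items():
        print(f"{func}() line {line}: {n_short} -> {n_long} unfreed allocations")
```

Sites that stay constant between the two runs are objects created once and never destroyed; sites that grow are created inside the time loop, which is where a missing DMDestroy() would show up.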
>
> Is it possible to update and use the new IIB_I_sta_domain and IIB_I_end_domain without the need to create and destroy the DM? I thought that may save some time since it's done at every time step.
If the information you pass to DMDACreate3d() changes, then you need to create the DM again.
Barry
>
> Thanks
>
>>
>>> Is this info helpful? Or should I run in a single core?
>>>
>>> Thanks
>>>
>>>>> On Feb 25, 2016, at 12:33 AM, TAY wee-beng <zonexo at gmail.com>
>>>>> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> I ran the code and it hangs again. However, adding -malloc_test doesn't seem to do anything. The output (attached) is the same without it.
>>>>>
>>>>> Wonder if there's anything else I can do.
>>>>> Thank you
>>>>>
>>>>> Yours sincerely,
>>>>>
>>>>> TAY wee-beng
>>>>>
>>>>> On 24/2/2016 11:33 PM, Matthew Knepley wrote:
>>>>>
>>>>>> On Wed, Feb 24, 2016 at 9:28 AM, TAY wee-beng <zonexo at gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>> On 24/2/2016 11:18 PM, Matthew Knepley wrote:
>>>>>>
>>>>>>> On Wed, Feb 24, 2016 at 9:16 AM, TAY wee-beng <zonexo at gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>> On 24/2/2016 9:12 PM, Matthew Knepley wrote:
>>>>>>>
>>>>>>>> On Wed, Feb 24, 2016 at 1:54 AM, TAY wee-beng <zonexo at gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> On 24/2/2016 10:28 AM, Matthew Knepley wrote:
>>>>>>>>
>>>>>>>>> On Tue, Feb 23, 2016 at 7:50 PM, TAY wee-beng <zonexo at gmail.com>
>>>>>>>>> wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I got this error (also attached, full) when running my code. It happens after a few thousand time steps.
>>>>>>>>>
>>>>>>>>> The strange thing is that for 2 different clusters, it stops at 2 different time steps.
>>>>>>>>>
>>>>>>>>> I wonder if it's related to DM since this happens after I added DM into my code.
>>>>>>>>>
>>>>>>>>> In this case, how can I find the error? I'm thinking valgrind may take very long and give too many false errors.
>>>>>>>>>
>>>>>>>>> It is very easy to find leaks. You just run a few steps with -malloc_dump and see what is left over.
>>>>>>>>>
>>>>>>>>> Matt
>>>>>>>>>
>>>>>>>> Hi Matt,
>>>>>>>>
>>>>>>>> Do you mean running my a.out with -malloc_dump and stopping after a few time steps?
>>>>>>>>
>>>>>>>> What and how should I "see" then?
>>>>>>>>
>>>>>>>> -malloc_dump outputs all unfreed memory to the screen after PetscFinalize(), so you should see the leak.
>>>>>>>> I guess it might be possible to keep creating things that you freed all at once at the end, but that is less likely.
>>>>>>>>
>>>>>>>> Matt
>>>>>>>>
>>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I got the output. I have zipped it since it's rather big. It seems to be from the DM routines, but can you help me find where the error is from?
>>>>>>>
>>>>>>> It's really hard to tell by looking at it. What I do is remove things until there is no leak, then progressively
>>>>>>> put things back in until I have the culprit. Then you can think about what is not destroyed.
>>>>>>>
>>>>>>> Matt
>>>>>>>
>>>>>> Ok so let me get this clear. When it shows:
>>>>>>
>>>>>> [21]Total space allocated 1728961264 bytes
>>>>>> [21]1861664 bytes MatCheckCompressedRow() line 60 in /home/wtay/Codes/petsc-3.6.3/src/mat/utils/compressedrow.c
>>>>>> [21]16 bytes PetscStrallocpy() line 188 in /home/wtay/Codes/petsc-3.6.3/src/sys/utils/str.c
>>>>>> [21]624 bytes ISLocalToGlobalMappingCreate() line 270 in /home/wtay/Codes
>>>>>>
>>>>>> ....
>>>>>>
>>>>>> Does it mean that it's simply allocating space, i.e. normal? Or does it show that there's a memory leak, i.e. an error?
>>>>>>
>>>>>> I gave the wrong option. That dumps everything. Let's just look at the leaks with -malloc_test.
>>>>>>
>>>>>> Sorry about that,
>>>>>>
>>>>>> Matt
>>>>>> If it's an error, should I zoom in and debug around this time at this region?
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>>> Thanks.
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Thank you
>>>>>>>>>
>>>>>>>>> Yours sincerely,
>>>>>>>>>
>>>>>>>>> TAY wee-beng
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
>>>>>>>>> -- Norbert Wiener
>>>>>>>>>
>>>>> <ibm2d.err>
>>>>>
>>> <log.7z>
>>>
>