[petsc-users] memory use of a DMDA
Matthew Knepley
knepley at gmail.com
Mon Oct 21 15:46:09 CDT 2013
On Mon, Oct 21, 2013 at 3:23 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
>
> Matt,
>
> I think you are running on 1 process, where the DMDA doesn't have an
> optimized path. When I run on 2 processes, the numbers show nothing
> proportional to dof * number of local points.
>
Yes, I figured if it was not doing the right thing on 1, why go to more? :)
Matt
> dof = 12
> ~/Src/petsc/test master $ petscmpiexec -n 2 ./ex1 -malloc_log | grep VecScatter
> [0] 7 21344 VecScatterCreate()
> [0] 2 32 VecScatterCreateCommon_PtoS()
> [0] 39 182480 VecScatterCreate_PtoS()
>
> dof = 8
> ~/Src/petsc/test master $ petscmpiexec -n 2 ./ex1 -malloc_log | grep VecScatter
> [0] 7 21344 VecScatterCreate()
> [0] 2 32 VecScatterCreateCommon_PtoS()
> [0] 39 176080 VecScatterCreate_PtoS()
>
> dof = 4
>
> ~/Src/petsc/test master $ petscmpiexec -n 2 ./ex1 -malloc_log | grep VecScatter
> [0] 7 21344 VecScatterCreate()
> [0] 2 32 VecScatterCreateCommon_PtoS()
> [0] 39 169680 VecScatterCreate_PtoS()
>
> dof = 2
> ~/Src/petsc/test master $ petscmpiexec -n 2 ./ex1 -malloc_log | grep VecScatter
> [0] 7 21344 VecScatterCreate()
> [0] 2 32 VecScatterCreateCommon_PtoS()
> [0] 39 166480 VecScatterCreate_PtoS()
>
> dof = 2, grid is 50 by 50 instead of 100 by 100
>
> ~/Src/petsc/test master $ petscmpiexec -n 2 ./ex1 -malloc_log | grep VecScatter
> [0] 7 6352 VecScatterCreate()
> [0] 2 32 VecScatterCreateCommon_PtoS()
> [0] 39 43952 VecScatterCreate_PtoS()
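A quick sanity check on the VecScatterCreate_PtoS() numbers above, as a rough sketch only. It assumes the second number on each log line is the total bytes allocated by that function, and that a 100 by 100 grid on 2 processes gives about 5000 owned points per process. Neither is stated in the log itself; both are my reading of it.

```python
# Bytes logged for VecScatterCreate_PtoS() on rank 0, keyed by dof
# (copied from the -malloc_log output above: 100 by 100 grid, 2 processes).
ptos_bytes = {2: 166480, 4: 169680, 8: 176080, 12: 182480}

dofs = sorted(ptos_bytes)
# Growth in bytes per extra dof between consecutive runs.
slopes = [
    (ptos_bytes[b] - ptos_bytes[a]) // (b - a)
    for a, b in zip(dofs, dofs[1:])
]

# Assumption: roughly 100 * 100 / 2 owned grid points per process.
local_points = 100 * 100 // 2
bytes_per_dof_per_point = [s / float(local_points) for s in slopes]

print(slopes)
print(bytes_per_dof_per_point)
```

The growth per dof comes out constant and well under one byte per local point, which fits the reading that the scatter setup does not scale with dof * number of local points.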
>
> The IS creation in the DMDA is far more troubling
>
> /Src/petsc/test master $ petscmpiexec -n 2 ./ex1 -malloc_log | grep IS
>
> dof = 2
>
> [0] 1 20400 ISBlockSetIndices_Block()
> [0] 15 3760 ISCreate()
> [0] 4 128 ISCreate_Block()
> [0] 1 16 ISCreate_Stride()
> [0] 2 81600 ISGetIndices_Block()
> [0] 1 20400 ISLocalToGlobalMappingBlock()
> [0] 7 42016 ISLocalToGlobalMappingCreate()
>
> dof = 4
>
> ~/Src/petsc/test master $ petscmpiexec -n 2 ./ex1 -malloc_log | grep IS
> [0] 1 20400 ISBlockSetIndices_Block()
> [0] 15 3760 ISCreate()
> [0] 4 128 ISCreate_Block()
> [0] 1 16 ISCreate_Stride()
> [0] 2 163200 ISGetIndices_Block()
> [0] 1 20400 ISLocalToGlobalMappingBlock()
> [0] 7 82816 ISLocalToGlobalMappingCreate()
>
> dof = 8
>
> ~/Src/petsc/test master $ petscmpiexec -n 2 ./ex1 -malloc_log | grep IS
> [0] 1 20400 ISBlockSetIndices_Block()
> [0] 15 3760 ISCreate()
> [0] 4 128 ISCreate_Block()
> [0] 1 16 ISCreate_Stride()
> [0] 2 326400 ISGetIndices_Block()
> [0] 1 20400 ISLocalToGlobalMappingBlock()
> [0] 7 164416 ISLocalToGlobalMappingCreate()
>
> dof = 12
> ~/Src/petsc/test master $ petscmpiexec -n 2 ./ex1 -malloc_log | grep IS
> [0] 1 20400 ISBlockSetIndices_Block()
> [0] 15 3760 ISCreate()
> [0] 4 128 ISCreate_Block()
> [0] 1 16 ISCreate_Stride()
> [0] 2 489600 ISGetIndices_Block()
> [0] 1 20400 ISLocalToGlobalMappingBlock()
> [0] 7 246016 ISLocalToGlobalMappingCreate()
>
> Here the indices are accessed at the point level (as well as the block
> level), and hence memory usage is proportional to dof * local number of
> grid points. Of course, it is still only proportional to the vector size.
> There is some improvement we could make: with a lot of refactoring we can
> remove the dof * factor completely; with a little refactoring we can bring
> it down to a single dof * local number of grid points.
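The IS numbers quoted above can be checked against that claim with a small fit. This is only a sketch: it takes the byte counts from the log as given and assumes a 4-byte PetscInt, which the log does not confirm. The helper names are made up for illustration.

```python
# Rank-0 bytes from the -malloc_log output above, keyed by dof.
is_get_indices = {2: 81600, 4: 163200, 8: 326400, 12: 489600}
l2g_create = {2: 42016, 4: 82816, 8: 164416, 12: 246016}
block_indices = 20400  # ISBlockSetIndices_Block(), same for every dof


def linear_fit(table):
    """Return (bytes per dof, fixed bytes) from the smallest and largest dof."""
    lo, hi = min(table), max(table)
    slope = (table[hi] - table[lo]) // (hi - lo)
    return slope, table[lo] - slope * lo


def fits_all(table):
    """Check that every logged size lies exactly on the fitted line."""
    slope, fixed = linear_fit(table)
    return all(fixed + slope * dof == size for dof, size in table.items())


# Assumption: 4-byte PetscInt, so the dof-independent block array
# would hold block_indices / 4 local (owned plus ghost) points.
points = block_indices // 4

print(linear_fit(is_get_indices), fits_all(is_get_indices))
print(linear_fit(l2g_create), fits_all(l2g_create))
print(points)
```

Under those assumptions, ISLocalToGlobalMappingCreate() grows by exactly one block-sized array per dof, and ISGetIndices_Block() by two, which would match the "2" in its count column. Both are linear in dof with nothing left over.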
>
> I cannot understand why you are seeing memory usage 7 times more than a
> vector. That seems like a lot.
>
> Barry
>
>
>
> On Oct 21, 2013, at 11:32 AM, Barry Smith <bsmith at mcs.anl.gov> wrote:
>
> >
> > The PETSc DMDA object greedily allocates several arrays of data used
> > to set up the communication and other things like local to global mappings
> > even before you create any vectors. This is why you see this big bump in
> > memory usage.
> >
> > BUT I don't think it should be any worse in 3.4 than in 3.3 or
> > earlier; at least we did not intend to make it worse. Are you sure it is
> > using more memory than in 3.3?
> >
> > In order for us to decrease the memory usage of the DMDA setup, it
> > would be helpful if we knew which objects created within it used the most
> > memory. There is some sloppiness in that routine in not reusing memory as
> > well as it could; I am not sure how much difference that would make.
> >
> >
> > Barry
> >
> >
> >
> > On Oct 21, 2013, at 7:02 AM, Juha Jäykkä <juhaj at iki.fi> wrote:
> >
> >> Dear list members,
> >>
> >> I have noticed strange memory consumption after upgrading to the 3.4
> >> series. I never had time to investigate properly, but here is what
> >> happens [yes, this might be a petsc4py issue, but I doubt it]:
> >>
> >> # helpers contains the _ProcessMemoryInfoProc routine, which just digs the
> >> # memory usage data from /proc
> >> import helpers
> >> procdata=helpers._ProcessMemoryInfoProc()
> >> print procdata.rss/2**20, "MiB /", procdata.os_specific[3][1]
> >> from petsc4py import PETSc
> >> procdata=helpers._ProcessMemoryInfoProc()
> >> print procdata.rss/2**20, "MiB /", procdata.os_specific[3][1]
> >> da = PETSc.DA().create(sizes=[100,100,100],
> >>                        proc_sizes=[PETSc.DECIDE,PETSc.DECIDE,PETSc.DECIDE],
> >>                        boundary_type=[3,0,0],
> >>                        stencil_type=PETSc.DA.StencilType.BOX,
> >>                        dof=7, stencil_width=1, comm=PETSc.COMM_WORLD)
> >> procdata=helpers._ProcessMemoryInfoProc()
> >> print procdata.rss/2**20, "MiB /", procdata.os_specific[3][1]
> >> vec=da.createGlobalVec()
> >> procdata=helpers._ProcessMemoryInfoProc()
> >> print procdata.rss/2**20, "MiB /", procdata.os_specific[3][1]
> >>
> >> outputs
> >>
> >> 48 MiB / 49348 kB
> >> 48 MiB / 49360 kB
> >> 381 MiB / 446228 kB
> >> 435 MiB / 446228 kB
> >>
> >> Which is odd: the size of the actual data to be stored in the da is just
> >> about 56 megabytes, so why does creating the da consume 7 times that? And
> >> why does the DA reserve the memory in the first place? I thought memory
> >> only gets allocated once an associated vector is created, and it does look
> >> like the createGlobalVec call allocates the right amount of data. But what
> >> is that 330 MiB that DA().create() consumes? [It's actually the .setUp()
> >> method that does the consuming, but that's not of much use as it needs to
> >> be called before a vector can be created.]
> >>
> >> Cheers,
> >> Juha
> >>
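For reference, the arithmetic behind the figures in this report, as a sketch. It assumes 8-byte real scalars and that the RSS readings come from a single process holding the whole grid. The report does not state either, so treat the ratio as approximate.

```python
# Numbers from the report above: 100^3 grid, dof=7.
# Assumption: 8-byte real scalars.
points = 100 * 100 * 100
dof = 7
vec_bytes = points * dof * 8

MIB = 2.0 ** 20
vec_mib = vec_bytes / MIB

# RSS readings (MiB) printed before DA creation, after it,
# and after createGlobalVec().
rss_before_da, rss_after_da, rss_after_vec = 48, 381, 435
da_jump = rss_after_da - rss_before_da
vec_jump = rss_after_vec - rss_after_da

print(vec_bytes, round(vec_mib, 1))
print(da_jump, vec_jump, round(da_jump / vec_mib, 1))
```

With these assumptions one global vector is 56,000,000 bytes (about 53.4 MiB), which agrees with the 54 MiB step at createGlobalVec(), while the DA creation step is 333 MiB, a bit over 6 vectors' worth.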
> >
>
>
--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener