[petsc-dev] making DA more light weight
Barry Smith
bsmith at mcs.anl.gov
Thu May 15 16:52:30 CDT 2014
> Note that ISGetIndices is still called in the parallel case.
When bs is set > 1 it is not called in the VecScatterCreate() inside DMDACreate2d/3d! Only ISBlockGetIndices() is called. The memory usage of VecScatterCreate() is O(dof * number of ghost points) + O(vector length/dof) in this case.
It is called MULTIPLE times in ISLocalToGlobalMappingCreateIS() inside DMDACreate2d/3d, so currently this causes (small integer) * (vector length) memory usage. As I keep telling you, this is the problem area for dof > 1. For dof == 1 I can live with 2.5 * sizeof(vector).
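As a rough sketch of where those terms land for your dof = 10 run below (my assumptions, not measurements: 4-byte PetscInt, 8-byte PetscScalar, stencil width 1, and an even 1x1x2 split over the two ranks):

size, ndim, dof, nranks = 128, 3, 10, 2
sizeof_int, sizeof_scalar = 4, 8

local_points = size**ndim // nranks   # grid points owned by one rank
ghost_points = size * size            # one 128x128 interface layer for a width-1 stencil

vec_bytes      = local_points * dof * sizeof_scalar                  # one global Vec, per rank
scatter_bytes  = (dof * ghost_points + local_points) * sizeof_int    # blocked VecScatterCreate() estimate above
l2g_copy_bytes = (local_points + ghost_points) * dof * sizeof_int    # one unblocked index per component

MiB = 1024.0 ** 2
print("Vec                %6.1f MiB" % (vec_bytes / MiB))       # ~80 MiB, matches the profile below
print("blocked scatter    %6.1f MiB" % (scatter_bytes / MiB))   # a few MiB, not the problem
print("l2g map, one copy  %6.1f MiB" % (l2g_copy_bytes / MiB))  # ~40 MiB; a few copies give the ~2.5 Vecs

That is why the local-to-global map, not the scatter, is what has to shrink for dof > 1.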
You need a better test code to run with. Can you use ksp/ksp/examples/tests/ex42.c? Or snes/examples/tutorials/ex19.c in parallel?
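For example, something along these lines (the refinement level is just a placeholder for "big enough", massif on rank 0 only mirrors your runs below, and you need to build ex19 in its directory first):

$ PETSC_ARCH=mpich-opt mpirun.hydra -n 1 valgrind --tool=massif ./ex19 -da_refine 6 : -n 1 ./ex19 -da_refine 6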
ISGetIndices() is called in VecScatterCreate() inside MatSetUpMultiply_MPIAIJ() because the off-diagonal portion of the matrix is in general not blocked.
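If you want to see that contribution separately, a sketch along the lines of your script (petsc4py; I am assuming the DMCreateMatrix() wrapper is spelled createMat(), older petsc4py spells it createMatrix()) is to add the matrix to the profiled function and run it on two ranks as before:

from petsc4py import PETSc
from memory_profiler import profile

@profile
def foo(size=128, ndim=3, dof=10):
    da = PETSc.DA().create(sizes=[size]*ndim, dof=dof)
    q1 = da.createGlobalVec()
    # DMCreateMatrix(); assembling the MPIAIJ off-diagonal block is where the
    # unblocked ISGetIndices() path inside MatSetUpMultiply_MPIAIJ() runs
    A = da.createMat()
    return da, q1, A

if __name__ == "__main__":
    foo()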
Barry
On May 15, 2014, at 2:27 PM, Jed Brown <jed at jedbrown.org> wrote:
> Barry Smith <bsmith at mcs.anl.gov> writes:
>> Hmm, this is the sequential case where no optimization was done for
>> block indices (adding additional code to handle the blocks would
>> not be that difficult). In the parallel case if the indices are
>> block then ISGetIndices() is not supposed to ever be used (is it?)
>> instead only ISBlockGetIndices() is used.
>>
>> Can this plot be produced for the parallel case?
>
> $ PETSC_ARCH=mpich-opt mpirun.hydra -n 1 python2 -m memory_profiler ketch-dmda.py 128 3 1 : -n 1 python2 ketch-dmda.py 128 3 1
> Filename: ketch-dmda.py
>
> Line # Mem usage Increment Line Contents
> ================================================
> 11 23.336 MiB 0.000 MiB @profile
> 12 def foo(size=128,ndim=3,dof=1):
> 13 51.688 MiB 28.352 MiB da = PETSc.DA().create(sizes=[size]*ndim,dof=dof)
> 14 59.711 MiB 8.023 MiB q1 = da.createGlobalVec()
> 15 67.715 MiB 8.004 MiB q2 = da.createGlobalVec()
> 16 75.719 MiB 8.004 MiB q3 = da.createGlobalVec()
>
> $ PETSC_ARCH=mpich-opt mpirun.hydra -n 1 python2 -m memory_profiler ketch-dmda.py 128 3 10 : -n 1 python2 ketch-dmda.py 128 3 10
> Filename: ketch-dmda.py
>
> Line # Mem usage Increment Line Contents
> ================================================
> 11 23.336 MiB 0.000 MiB @profile
> 12 def foo(size=128,ndim=3,dof=1):
> 13 235.711 MiB 212.375 MiB da = PETSc.DA().create(sizes=[size]*ndim,dof=dof)
> 14 315.734 MiB 80.023 MiB q1 = da.createGlobalVec()
> 15 395.738 MiB 80.004 MiB q2 = da.createGlobalVec()
> 16 475.742 MiB 80.004 MiB q3 = da.createGlobalVec()
>
>
> So creating the DMDA still costs 2.5x as much as a Vec. See here for
> the massif-visualizer plot:
>
> http://59A2.org/files/dmda-memory-p2.png
>
> $ PETSC_ARCH=mpich-opt mpirun.hydra -n 1 valgrind --tool=massif python2 ketch-dmda.py 128 3 10 : -n 1 python2 ketch-dmda.py 128 3 10
> ==3243== Massif, a heap profiler
> ==3243== Copyright (C) 2003-2013, and GNU GPL'd, by Nicholas Nethercote
> ==3243== Using Valgrind-3.9.0 and LibVEX; rerun with -h for copyright info
> ==3243== Command: python2 ketch-dmda.py 128 3 10
> ==3243==
> ==3243==
>
> Note that ISGetIndices is still called in the parallel case.
>
> <ketch-dmda.py>