[petsc-dev] making DA more light weight
Barry Smith
bsmith at mcs.anl.gov
Thu May 15 17:22:34 CDT 2014
I suggest the following:
ISLocalToGlobalMappingCreate() takes a block size.
ISLocalToGlobalMappingApply() will then do the integer arithmetic if bs > 1.
ISLocalToGlobalMappingApplyBlock() will just use the given indices.
Then MatSetLocalToGlobalMapping()/VecSetLocalToGlobalMapping() will only need one mapping, and DMDACreate() will only create the block version.
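For concreteness, a minimal sketch of the intended usage (block indices {3,0,7} with bs = 2; the block-size argument to the create call is the new part, and the out values assume the arithmetic described above):

#include <petscis.h>

int main(int argc,char **argv)
{
  PetscErrorCode         ierr;
  ISLocalToGlobalMapping ltog;
  PetscInt               blkidx[3] = {3,0,7};   /* block indices, bs = 2 */
  PetscInt               in[2],out[2];

  ierr = PetscInitialize(&argc,&argv,NULL,NULL);if (ierr) return ierr;

  /* one creation call carries the block size (the new argument) */
  ierr = ISLocalToGlobalMappingCreate(PETSC_COMM_SELF,2,3,blkidx,PETSC_COPY_VALUES,&ltog);CHKERRQ(ierr);

  /* point path: computes out[i] = bs*blkidx[in[i]/bs] + in[i]%bs since bs > 1 */
  in[0] = 0; in[1] = 5;
  ierr = ISLocalToGlobalMappingApply(ltog,2,in,out);CHKERRQ(ierr);        /* out = {6,15} */

  /* block path: uses the stored block indices directly */
  in[0] = 0; in[1] = 2;
  ierr = ISLocalToGlobalMappingApplyBlock(ltog,2,in,out);CHKERRQ(ierr);   /* out = {3,7} */

  ierr = ISLocalToGlobalMappingDestroy(&ltog);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}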
If this sounds reasonable, I will make it so.
Barry
Note that DMDACreate() actually creates the block version internally; it only gets “stretched” to create the non-block ISLocalToGlobalMapping. So the problem was never with the DMDA creation, it was with the bad design and usage of the blocked and non-blocked ISLocalToGlobalMapping().
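The “stretch” is just this expansion (a sketch of the idea, with a made-up helper name, not the actual source): n block indices become bs*n point indices, so with dof = 10 the stretched mapping is ten times the size of the blocked one.

#include <petscsys.h>

/* Sketch: expand n block indices into bs*n point indices.
   This enlarged copy is what the proposal stops keeping around. */
static void StretchBlockIndices(PetscInt bs,PetscInt n,const PetscInt blk[],PetscInt idx[])
{
  PetscInt b,c;
  for (b=0; b<n; b++) {
    for (c=0; c<bs; c++) idx[b*bs+c] = bs*blk[b] + c;
  }
}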
On May 15, 2014, at 2:27 PM, Jed Brown <jed at jedbrown.org> wrote:
> Barry Smith <bsmith at mcs.anl.gov> writes:
>> Hmm, this is the sequential case where no optimization was done for
>> block indices (adding additional code to handle the blocks would
>> not be that difficult). In the parallel case if the indices are
>> block then ISGetIndices() is not supposed to ever be used (is it?)
>> instead only ISBlockGetIndices() is used.
>>
>> Can this plot be produced for the parallel case?
>
> $ PETSC_ARCH=mpich-opt mpirun.hydra -n 1 python2 -m memory_profiler ketch-dmda.py 128 3 1 : -n 1 python2 ketch-dmda.py 128 3 1
> Filename: ketch-dmda.py
>
> Line #    Mem usage    Increment   Line Contents
> ================================================
>     11   23.336 MiB    0.000 MiB   @profile
>     12                             def foo(size=128,ndim=3,dof=1):
>     13   51.688 MiB   28.352 MiB       da = PETSc.DA().create(sizes=[size]*ndim,dof=dof)
>     14   59.711 MiB    8.023 MiB       q1 = da.createGlobalVec()
>     15   67.715 MiB    8.004 MiB       q2 = da.createGlobalVec()
>     16   75.719 MiB    8.004 MiB       q3 = da.createGlobalVec()
>
> $ PETSC_ARCH=mpich-opt mpirun.hydra -n 1 python2 -m memory_profiler ketch-dmda.py 128 3 10 : -n 1 python2 ketch-dmda.py 128 3 10
> Filename: ketch-dmda.py
>
> Line #    Mem usage    Increment   Line Contents
> ================================================
>     11   23.336 MiB    0.000 MiB   @profile
>     12                             def foo(size=128,ndim=3,dof=1):
>     13  235.711 MiB  212.375 MiB       da = PETSc.DA().create(sizes=[size]*ndim,dof=dof)
>     14  315.734 MiB   80.023 MiB       q1 = da.createGlobalVec()
>     15  395.738 MiB   80.004 MiB       q2 = da.createGlobalVec()
>     16  475.742 MiB   80.004 MiB       q3 = da.createGlobalVec()
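(Sanity check on those increments: each global Vec holds 128^3 grid points * 10 dof * 8 bytes = 160 MiB total, split across the two ranks, which matches the 80 MiB per-process steps above.)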
>
>
> So creating the DMDA still costs about 2.5x as much memory as a Vec. See here for
> the massif-visualizer plot:
>
> http://59A2.org/files/dmda-memory-p2.png
>
> $ PETSC_ARCH=mpich-opt mpirun.hydra -n 1 valgrind --tool=massif python2 ketch-dmda.py 128 3 10 : -n 1 python2 ketch-dmda.py 128 3 10
> ==3243== Massif, a heap profiler
> ==3243== Copyright (C) 2003-2013, and GNU GPL'd, by Nicholas Nethercote
> ==3243== Using Valgrind-3.9.0 and LibVEX; rerun with -h for copyright info
> ==3243== Command: python2 ketch-dmda.py 128 3 10
> ==3243==
> ==3243==
>
> Note that ISGetIndices is still called in the parallel case.
>
> <ketch-dmda.py>
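On that last point: for a blocked IS, ISGetIndices() has to materialize the expanded bs*n point-index array, while ISBlockGetIndices() returns the n stored block indices without any expansion, so a code path that falls back to ISGetIndices() pays the dof-fold memory cost. A standalone illustration:

#include <petscis.h>

int main(int argc,char **argv)
{
  PetscErrorCode  ierr;
  IS              is;
  const PetscInt  blk[3] = {2,0,5};   /* 3 block indices, bs = 4 */
  const PetscInt *idx;

  ierr = PetscInitialize(&argc,&argv,NULL,NULL);if (ierr) return ierr;
  ierr = ISCreateBlock(PETSC_COMM_SELF,4,3,blk,PETSC_COPY_VALUES,&is);CHKERRQ(ierr);

  ierr = ISBlockGetIndices(is,&idx);CHKERRQ(ierr);       /* {2,0,5}: no expansion */
  ierr = ISBlockRestoreIndices(is,&idx);CHKERRQ(ierr);

  ierr = ISGetIndices(is,&idx);CHKERRQ(ierr);            /* {8..11,0..3,20..23}: bs*n ints */
  ierr = ISRestoreIndices(is,&idx);CHKERRQ(ierr);

  ierr = ISDestroy(&is);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}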