[petsc-dev] problem with your DMCountNonCyclicReferences code?

Matthew Knepley knepley at gmail.com
Wed Mar 16 07:05:49 CDT 2016


On Wed, Mar 16, 2016 at 1:21 AM, Tobin Isaac <tisaac at uchicago.edu> wrote:

> On Tue, Mar 15, 2016 at 11:47:53PM -0500, Barry Smith wrote:
> >
> >    This is a really nasty problem. The example as previously written was
> > completely reasonable, so your fix is a total hack :-).  All the circular
> > reference counting in PETSc is problematic because it is so dependent on
> > exactly the details of how each particular object and its relationships are
> > handled.
>
> I agree that the need to call VecSetDM() in that case is bad, and it
> stems from assuming that the recycled vectors reference the dm: if
> we're going to count circular references, we should actually count
> them instead of assuming they exist.
>
> Where I added DMDestroy() in the Coarsen() routine, however, was in
> line with the kind of code we typically expect from users.
>
> >
> >    Do we really need to even allow these nasty circular relationships to
> > exist? What would we lose if we, for example, removed the two-way
> > relationships between the DMs and the Vecs? Just a little efficiency in not
> > needing to create new Vecs because we can recycle them? But at the cost of
> > very difficult to debug code that "should just work"? Similarly, the nasty
> > circular dependencies with dm->coarseMesh are done for "efficiency"; is
> > there a way to keep the efficiency but not the tricky dependencies?
>
> I introduced dm->fineMesh, and I'll consider removing it, but having
> both dm->coarseMesh and dm->fineMesh references is about more than
> just efficiency.  Particularly with the inverted multigrid that
> everyone's working on, there are workflows where it is more natural
> for the user to just maintain a handle on the coarsest mesh, not the
> finest mesh.


I think all the references here are completely appropriate. I don't see another
way of making many things work than to have the DM know its pool of named
vectors. I think it may be that our simplistic reference counting scheme is at
fault.

However, in this case I think it's clear that your function violated the
implied contract for DMCreateGlobalVector()/DMCreateLocalVector(): the returned
vectors need to have their DM set to that DM. This should be put in the
documentation.
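
To make the counting issue concrete, here is a toy model in plain C. It is a
sketch only, not PETSc code: ToyDM, ToyVec, ToyVecSetDM, ToyDMPoolAdd,
ToyCountAssumed and ToyCountActual are hypothetical names made up for this
illustration, and the real DMCountNonCyclicReferences logic is more involved.
It compares a count that assumes every pooled vector references the DM with
one that checks each vector's back-reference.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for illustration only; these are NOT PETSc types. */
#define TOY_POOL_SIZE 4

typedef struct ToyDM ToyDM;

typedef struct ToyVec {
  ToyDM *dm; /* back-reference to a DM, or NULL if it was never set */
} ToyVec;

struct ToyDM {
  int     refct;               /* total references held on this DM */
  ToyVec *pool[TOY_POOL_SIZE]; /* recycled vectors kept by the DM */
};

/* Toy analogue of setting the DM on a vector: the vector takes a reference. */
void ToyVecSetDM(ToyVec *v, ToyDM *dm)
{
  assert(v != NULL && dm != NULL);
  v->dm = dm;
  dm->refct++;
}

/* Store a vector in the first free pool slot; returns 0 on success. */
int ToyDMPoolAdd(ToyDM *dm, ToyVec *v)
{
  assert(dm != NULL && v != NULL);
  for (size_t i = 0; i < TOY_POOL_SIZE; i++) {
    if (dm->pool[i] == NULL) {
      dm->pool[i] = v;
      return 0;
    }
  }
  return 1; /* pool is full */
}

/* Non-cyclic count that ASSUMES every pooled vector references dm. */
int ToyCountAssumed(const ToyDM *dm)
{
  int n = dm->refct;
  for (size_t i = 0; i < TOY_POOL_SIZE; i++) {
    if (dm->pool[i] != NULL) n--;
  }
  return n;
}

/* Non-cyclic count that subtracts only vectors that point back to dm. */
int ToyCountActual(const ToyDM *dm)
{
  int n = dm->refct;
  for (size_t i = 0; i < TOY_POOL_SIZE; i++) {
    if (dm->pool[i] != NULL && dm->pool[i]->dm == dm) n--;
  }
  return n;
}
```

In this toy, take a DM with one user reference and pool two vectors, only one
of which went through ToyVecSetDM(). Then refct is 2, ToyCountAssumed() gives 0
(the DM looks unreferenced while the user still holds it) and ToyCountActual()
gives 1. Either honoring the contract (setting the DM on every returned vector)
or counting the back-references that really exist keeps the two in agreement.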

  Matt


> >
> >   I accept your "fix", thanks for figuring it out so quickly! but don't
> > like it :-).
> >
> >    Barry
> >
> >
> >
> > > On Mar 15, 2016, at 11:30 PM, Tobin Isaac <tisaac at uchicago.edu> wrote:
> > >
> > >
> > > I pushed a fix.  There's a long explanation in the commit message:
> > > while this could be called user error, the cycle counting isn't very
> > > robust and should probably be changed.
> > >
> > >  Toby
> > >
> > > On Tue, Mar 15, 2016 at 09:54:53PM -0500, Barry Smith wrote:
> > >>
> > > >>  Dang, dang, dang, I can't believe I fell for that git trapdoor. Ok
> > > >> pushed now.
> > >>
> > >>  Barry
> > >>
> > > >>> On Mar 15, 2016, at 9:46 PM, Tobin Isaac <tisaac at uchicago.edu> wrote:
> > >>>
> > >>>
> > >>> Barry, please check in ex65.c
> > >>>
> > >>> On Sun, Mar 13, 2016 at 04:20:06PM -0500, Barry Smith wrote:
> > >>>>
> > >>>> Toby,
> > >>>>
> > > >>>>  I'm trying to put together a very simple but complete DMSHELL
> > > >>>> example for popov at uni-mainz.de and am having some trouble, which I
> > > >>>> think might point to a bug or logical error in the code you wrote for
> > > >>>> maintaining dm->coarseMesh and dm->fineMesh and stuff.
> > >>>>
> > >>>> $ petscmpiexec -valgrind -n 1 ./ex65 -pc_type mg -pc_mg_levels 2
> > >>>> ==80209== Invalid read of size 8
> > >>>> ==80209==    at 0x100A9E2D5: DMCountNonCyclicReferences (dm.c:500)
> > >>>> ==80209==    by 0x100A8F70A: DMDestroy (dm.c:573)
> > >>>> ==80209==    by 0x101221BBE: KSPDestroy (itfunc.c:985)
> > >>>> ==80209==    by 0x1010BCBFC: PCDestroy_MG (mg.c:302)
> > >>>> ==80209==    by 0x1010E23F7: PCDestroy (precon.c:122)
> > >>>> ==80209==    by 0x101221C3A: KSPDestroy (itfunc.c:986)
> > >>>> ==80209==    by 0x100001C4C: main (in ./ex65)
> > >>>> ==80209==  Address 0x10398fd68 is 5,864 bytes inside a block of
> size 6,196 free'd
> > >>>> ==80209==    at 0x10001595D: free (vg_replace_malloc.c:480)
> > >>>> ==80209==    by 0x1000FE393: PetscFreeAlign (mal.c:72)
> > >>>> ==80209==    by 0x100100D1E: PetscTrFreeDefault (mtr.c:315)
> > >>>> ==80209==    by 0x100A91C5A: DMDestroy (dm.c:716)
> > >>>> ==80209==    by 0x1010E2478: PCDestroy (precon.c:123)
> > >>>> ==80209==    by 0x101221C3A: KSPDestroy (itfunc.c:986)
> > >>>> ==80209==    by 0x1010BCBFC: PCDestroy_MG (mg.c:302)
> > >>>> ==80209==    by 0x1010E23F7: PCDestroy (precon.c:122)
> > >>>> ==80209==    by 0x101221C3A: KSPDestroy (itfunc.c:986)
> > >>>> ==80209==    by 0x100001C4C: main (in ./ex65)
> > >>>> ==80209==
> > >>>> ==80209== Invalid read of size 8
> > >>>> ==80209==    at 0x100A9E2D5: DMCountNonCyclicReferences (dm.c:500)
> > >>>> ==80209==    by 0x100A8F70A: DMDestroy (dm.c:573)
> > >>>> ==80209==    by 0x1010E2478: PCDestroy (precon.c:123)
> > >>>> ==80209==    by 0x101221C3A: KSPDestroy (itfunc.c:986)
> > >>>> ==80209==    by 0x1010BCBFC: PCDestroy_MG (mg.c:302)
> > >>>> ==80209==    by 0x1010E23F7: PCDestroy (precon.c:122)
> > >>>> ==80209==    by 0x101221C3A: KSPDestroy (itfunc.c:986)
> > >>>> ==80209==    by 0x100001C4C: main (in ./ex65)
> > >>>> ==80209==  Address 0x10398fd68 is 5,864 bytes inside a block of
> size 6,196 free'd
> > >>>> ==80209==    at 0x10001595D: free (vg_replace_malloc.c:480)
> > >>>> ==80209==    by 0x1000FE393: PetscFreeAlign (mal.c:72)
> > >>>> ==80209==    by 0x100100D1E: PetscTrFreeDefault (mtr.c:315)
> > >>>> ==80209==    by 0x100A91C5A: DMDestroy (dm.c:716)
> > >>>> ==80209==    by 0x1010E2478: PCDestroy (precon.c:123)
> > >>>> ==80209==    by 0x101221C3A: KSPDestroy (itfunc.c:986)
> > >>>> ==80209==    by 0x1010BCBFC: PCDestroy_MG (mg.c:302)
> > >>>> ==80209==    by 0x1010E23F7: PCDestroy (precon.c:122)
> > >>>> ==80209==    by 0x101221C3A: KSPDestroy (itfunc.c:986)
> > >>>> ==80209==    by 0x100001C4C: main (in ./ex65)
> > >>>> ==80209==
> > >>>> ==80209== Invalid read of size 8
> > >>>> ==80209==    at 0x100A9E2D5: DMCountNonCyclicReferences (dm.c:500)
> > >>>> ==80209==    by 0x100A8F70A: DMDestroy (dm.c:573)
> > >>>> ==80209==    by 0x100001CBC: main (in ./ex65)
> > >>>> ==80209==  Address 0x10398fd68 is 5,864 bytes inside a block of
> size 6,196 free'd
> > >>>> ==80209==    at 0x10001595D: free (vg_replace_malloc.c:480)
> > >>>> ==80209==    by 0x1000FE393: PetscFreeAlign (mal.c:72)
> > >>>> ==80209==    by 0x100100D1E: PetscTrFreeDefault (mtr.c:315)
> > >>>> ==80209==    by 0x100A91C5A: DMDestroy (dm.c:716)
> > >>>> ==80209==    by 0x1010E2478: PCDestroy (precon.c:123)
> > >>>> ==80209==    by 0x101221C3A: KSPDestroy (itfunc.c:986)
> > >>>> ==80209==    by 0x1010BCBFC: PCDestroy_MG (mg.c:302)
> > >>>> ==80209==    by 0x1010E23F7: PCDestroy (precon.c:122)
> > >>>> ==80209==    by 0x101221C3A: KSPDestroy (itfunc.c:986)
> > >>>> ==80209==    by 0x100001C4C: main (in ./ex65)
> > >>>> ==80209==
> > >>>> ==80209== Invalid read of size 8
> > >>>> ==80209==    at 0x100A914C4: DMDestroy (dm.c:696)
> > >>>> ==80209==    by 0x100001CBC: main (in ./ex65)
> > >>>> ==80209==  Address 0x10398fd68 is 5,864 bytes inside a block of
> size 6,196 free'd
> > >>>> ==80209==    at 0x10001595D: free (vg_replace_malloc.c:480)
> > >>>> ==80209==    by 0x1000FE393: PetscFreeAlign (mal.c:72)
> > >>>> ==80209==    by 0x100100D1E: PetscTrFreeDefault (mtr.c:315)
> > >>>> ==80209==    by 0x100A91C5A: DMDestroy (dm.c:716)
> > >>>> ==80209==    by 0x1010E2478: PCDestroy (precon.c:123)
> > >>>> ==80209==    by 0x101221C3A: KSPDestroy (itfunc.c:986)
> > >>>> ==80209==    by 0x1010BCBFC: PCDestroy_MG (mg.c:302)
> > >>>> ==80209==    by 0x1010E23F7: PCDestroy (precon.c:122)
> > >>>> ==80209==    by 0x101221C3A: KSPDestroy (itfunc.c:986)
> > >>>> ==80209==    by 0x100001C4C: main (in ./ex65)
> > >>>> ==80209==
> > >>>> ==80209== Invalid read of size 4
> > >>>> ==80209==    at 0x1002319B4: PetscCheckPointer (checkptr.c:106)
> > >>>> ==80209==    by 0x100A8F5C6: DMDestroy (dm.c:570)
> > >>>> ==80209==    by 0x100A9156F: DMDestroy (dm.c:699)
> > >>>> ==80209==    by 0x100001CBC: main (in ./ex65)
> > >>>> ==80209==  Address 0x10398ece0 is 1,632 bytes inside a block of
> size 6,196 free'd
> > >>>> ==80209==    at 0x10001595D: free (vg_replace_malloc.c:480)
> > >>>> ==80209==    by 0x1000FE393: PetscFreeAlign (mal.c:72)
> > >>>> ==80209==    by 0x100100D1E: PetscTrFreeDefault (mtr.c:315)
> > >>>> ==80209==    by 0x100A91C5A: DMDestroy (dm.c:716)
> > >>>> ==80209==    by 0x1010E2478: PCDestroy (precon.c:123)
> > >>>> ==80209==    by 0x101221C3A: KSPDestroy (itfunc.c:986)
> > >>>> ==80209==    by 0x1010BCBFC: PCDestroy_MG (mg.c:302)
> > >>>> ==80209==    by 0x1010E23F7: PCDestroy (precon.c:122)
> > >>>> ==80209==    by 0x101221C3A: KSPDestroy (itfunc.c:986)
> > >>>> ==80209==    by 0x100001C4C: main (in ./ex65)
> > >>>> ==80209==
> > >>>> ==80209== Invalid read of size 4
> > >>>> ==80209==    at 0x100A8F630: DMDestroy (dm.c:570)
> > >>>> ==80209==    by 0x100A9156F: DMDestroy (dm.c:699)
> > >>>> ==80209==    by 0x100001CBC: main (in ./ex65)
> > >>>> ==80209==  Address 0x10398ece0 is 1,632 bytes inside a block of
> size 6,196 free'd
> > >>>> ==80209==    at 0x10001595D: free (vg_replace_malloc.c:480)
> > >>>> ==80209==    by 0x1000FE393: PetscFreeAlign (mal.c:72)
> > >>>> ==80209==    by 0x100100D1E: PetscTrFreeDefault (mtr.c:315)
> > >>>> ==80209==    by 0x100A91C5A: DMDestroy (dm.c:716)
> > >>>> ==80209==    by 0x1010E2478: PCDestroy (precon.c:123)
> > >>>> ==80209==    by 0x101221C3A: KSPDestroy (itfunc.c:986)
> > >>>> ==80209==    by 0x1010BCBFC: PCDestroy_MG (mg.c:302)
> > >>>> ==80209==    by 0x1010E23F7: PCDestroy (precon.c:122)
> > >>>> ==80209==    by 0x101221C3A: KSPDestroy (itfunc.c:986)
> > >>>> ==80209==    by 0x100001C4C: main (in ./ex65)
> > >>>> ==80209==
> > >>>> ==80209== Invalid read of size 4
> > >>>> ==80209==    at 0x100A8F641: DMDestroy (dm.c:570)
> > >>>> ==80209==    by 0x100A9156F: DMDestroy (dm.c:699)
> > >>>> ==80209==    by 0x100001CBC: main (in ./ex65)
> > >>>> ==80209==  Address 0x10398ece0 is 1,632 bytes inside a block of
> size 6,196 free'd
> > >>>> ==80209==    at 0x10001595D: free (vg_replace_malloc.c:480)
> > >>>> ==80209==    by 0x1000FE393: PetscFreeAlign (mal.c:72)
> > >>>> ==80209==    by 0x100100D1E: PetscTrFreeDefault (mtr.c:315)
> > >>>> ==80209==    by 0x100A91C5A: DMDestroy (dm.c:716)
> > >>>> ==80209==    by 0x1010E2478: PCDestroy (precon.c:123)
> > >>>> ==80209==    by 0x101221C3A: KSPDestroy (itfunc.c:986)
> > >>>> ==80209==    by 0x1010BCBFC: PCDestroy_MG (mg.c:302)
> > >>>> ==80209==    by 0x1010E23F7: PCDestroy (precon.c:122)
> > >>>> ==80209==    by 0x101221C3A: KSPDestroy (itfunc.c:986)
> > >>>> ==80209==    by 0x100001C4C: main (in ./ex65)
> > >>>> ==80209==
> > >>>> [0]PETSC ERROR: --------------------- Error Message
> --------------------------------------------------------------
> > >>>> [0]PETSC ERROR: Invalid argument
> > >>>> [0]PETSC ERROR: Wrong type of object: Parameter # 1
> > >>>> [0]PETSC ERROR: See
> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> > >>>> [0]PETSC ERROR: Petsc Development GIT revision:
> pre-tsfc-829-g3974c78  GIT Date: 2016-03-11 17:51:48 -0600
> > >>>> [0]PETSC ERROR: ./ex65 on a arch-basic named
> Barrys-MacBook-Pro.local by barrysmith Sun Mar 13 16:13:10 2016
> > >>>> [0]PETSC ERROR: Configure options
> --with-mpi-dir=/Users/barrysmith/PetscLibraries PETSC_ARCH=arch-basic
> > >>>> [0]PETSC ERROR: #1 DMDestroy() line 570 in
> /Users/barrysmith/Src/petsc/src/dm/interface/dm.c
> > >>>> [0]PETSC ERROR: #2 DMDestroy() line 699 in
> /Users/barrysmith/Src/petsc/src/dm/interface/dm.c
> > >>>> [0]PETSC ERROR: #3 main() line 67 in
> /Users/barrysmith/Src/petsc/src/ksp/ksp/examples/tutorials/ex65.c
> > >>>> [0]PETSC ERROR: PETSc Option Table entries:
> > >>>> [0]PETSC ERROR: -malloc_test
> > >>>> [0]PETSC ERROR: -pc_mg_levels 2
> > >>>> [0]PETSC ERROR: -pc_type mg
> > >>>> [0]PETSC ERROR: ----------------End of Error Message -------send
> entire error message to petsc-maint at mcs.anl.gov----------
> > >>>>
> > > >>>>  The code is in the branch barry/add-dmshellcreaterestriction
> > > >>>> src/ksp/ksp/examples/tutorials/ex65.c  which creates a DMSHELL that just
> > > >>>> uses an inner DMDA1 to create the objects. The code is virtually identical
> > > >>>> to ex25.c which just uses the DMDA1d directly but does not crash. It seems
> > > >>>> to me that having the DM objects be shells instead of DMDA should make
> > > >>>> absolutely no difference in your logic for tracking dm->coarseMesh etc but
> > > >>>> somehow something is fishy!!!! I could have a mistake in my example code
> > > >>>> but I do not think so.
> > > >>>>
> > > >>>>   Could you please take a look at the problem, feel free to add
> > > >>>> fixes directly to the branch.
> > >>>>
> > >>>>  Thanks
> > >>>>
> > >>>>   Barry
> > >>>>
> > >>>>
> > >>
> >
>



-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener