[petsc-users] Valgrind Issue With Ghosted Vectors

Balay, Satish balay at mcs.anl.gov
Mon Mar 25 15:59:26 CDT 2019


I see you are using "0e667e8fea4aa from December 23rd", which is an old
snapshot of petsc 'master'.

1. After your fix for 'bad input file' - do you still get these
valgrind messages?

2. You should be able to easily apply Stefano's potential fix to your
snapshot [without upgrading to latest petsc].

git cherry-pick 0b85991cae8259fd283ce3f99b399b38f1dcd7b4

Then rebuild petsc, rerun with valgrind, and see if the messages
persist.

Satish


On Mon, 25 Mar 2019, Derek Gaston via petsc-users wrote:

> Stefano: the stupidity was all mine and had nothing to do with PETSc.
> Valgrind helped me track down a memory corruption issue that ultimately was
> just about a bad input file to my code (and obviously not enough error
> checking for input files!).
> 
> The issue is fixed.
> 
> Now - I'd like to understand a bit more about what happened here on the
> PETSc side.  Was this valgrind issue something that was known and you
> already had a fix for it - but it wasn't on maint yet?  Or was it just that
> I was using too old of a version of PETSc so I didn't have the fix?
> 
> Derek
> 
> On Fri, Mar 22, 2019 at 4:29 AM Stefano Zampini <stefano.zampini at gmail.com>
> wrote:
> 
> >
> >
> > On Mar 21, 2019, at 7:59 PM, Derek Gaston <friedmud at gmail.com> wrote:
> >
> > It sounds like you already tracked this down... but for completeness here
> > is what track-origins gives:
> >
> > ==262923== Conditional jump or move depends on uninitialised value(s)
> > ==262923==    at 0x73C6548: VecScatterMemcpyPlanCreate_Index (vscat.c:294)
> > ==262923==    by 0x73DBD97: VecScatterMemcpyPlanCreate_PtoP
> > (vpscat_mpi1.c:312)
> > ==262923==    by 0x73DE6AE: VecScatterCreateCommon_PtoS_MPI1
> > (vpscat_mpi1.c:2328)
> > ==262923==    by 0x73DFFEA: VecScatterCreateLocal_PtoS_MPI1
> > (vpscat_mpi1.c:2202)
> > ==262923==    by 0x73C7A51: VecScatterCreate_PtoS (vscat.c:608)
> > ==262923==    by 0x73C9E8A: VecScatterSetUp_vectype_private (vscat.c:857)
> > ==262923==    by 0x73CBE5D: VecScatterSetUp_MPI1 (vpscat_mpi1.c:2543)
> > ==262923==    by 0x7413D39: VecScatterSetUp (vscatfce.c:212)
> > ==262923==    by 0x7412D73: VecScatterCreateWithData (vscreate.c:333)
> > ==262923==    by 0x747A232: VecCreateGhostWithArray (pbvec.c:685)
> > ==262923==    by 0x747A90D: VecCreateGhost (pbvec.c:741)
> > ==262923==    by 0x5C7FFD6: libMesh::PetscVector<double>::init(unsigned
> > long, unsigned long, std::vector<unsigned long, std::allocator<unsigned
> > long> > const&, bool, libMesh::ParallelType) (petsc_vector.h:752)
> > ==262923==  Uninitialised value was created by a heap allocation
> > ==262923==    at 0x402DDC6: memalign (vg_replace_malloc.c:899)
> > ==262923==    by 0x7359702: PetscMallocAlign (mal.c:41)
> > ==262923==    by 0x7359C70: PetscMallocA (mal.c:390)
> > ==262923==    by 0x73DECF0: VecScatterCreateLocal_PtoS_MPI1
> > (vpscat_mpi1.c:2061)
> > ==262923==    by 0x73C7A51: VecScatterCreate_PtoS (vscat.c:608)
> > ==262923==    by 0x73C9E8A: VecScatterSetUp_vectype_private (vscat.c:857)
> > ==262923==    by 0x73CBE5D: VecScatterSetUp_MPI1 (vpscat_mpi1.c:2543)
> > ==262923==    by 0x7413D39: VecScatterSetUp (vscatfce.c:212)
> > ==262923==    by 0x7412D73: VecScatterCreateWithData (vscreate.c:333)
> > ==262923==    by 0x747A232: VecCreateGhostWithArray (pbvec.c:685)
> > ==262923==    by 0x747A90D: VecCreateGhost (pbvec.c:741)
> > ==262923==    by 0x5C7FFD6: libMesh::PetscVector<double>::init(unsigned
> > long, unsigned long, std::vector<unsigned long, std::allocator<unsigned
> > long> > const&, bool, libMesh::ParallelType) (petsc_vector.h:752)
> >
> >
> > BTW: This turned out not to be my actual problem.  My actual problem was
> > just some stupidity on my part... just a simple input parameter issue to my
> > code (should have had better error checking!).
> >
> > But: It sounds like my digging may have uncovered something real here...
> > so it wasn't completely useless :-)
> >
> >
> > Derek,
> >
> > I don’t understand. Is your problem fixed or not? Would be nice to
> > understand what was the “some stupidity on your part”, and if it was still
> > leading to valid PETSc code or just to a broken setup.
> > In the first case, then we should investigate the valgrind error you
> > reported.
> > In the second case, this is not a PETSc issue.
> >
> >
> > Thanks for your help everyone!
> >
> > Derek
> >
> >
> >
> > On Thu, Mar 21, 2019 at 10:38 AM Stefano Zampini <
> > stefano.zampini at gmail.com> wrote:
> >
> >>
> >>
> > >> On Wed, Mar 20, 2019 at 23:40, Derek Gaston via petsc-users <
> > >> petsc-users at mcs.anl.gov> wrote:
> >>
> >>> Trying to track down some memory corruption I'm seeing on larger scale
> >>> runs (3.5B+ unknowns).
> >>>
> >>
> >> Uhm.... are you using 32bit indices? is it possible there's integer
> >> overflow somewhere?
> >>
> >>
> >>
> >>> Was able to run Valgrind on it... and I'm seeing quite a lot of
> >>> uninitialized value errors coming from ghost updating.  Here are some of
> >>> the traces:
> >>>
> >>> ==87695== Conditional jump or move depends on uninitialised value(s)
> >>> ==87695==    at 0x73236D3: PetscMallocAlign (mal.c:28)
> >>> ==87695==    by 0x7323C70: PetscMallocA (mal.c:390)
> >>> ==87695==    by 0x739048E: VecScatterMemcpyPlanCreate_Index (vscat.c:284)
> >>> ==87695==    by 0x73A5D97: VecScatterMemcpyPlanCreate_PtoP
> >>> (vpscat_mpi1.c:312)
> >>> ==64730==    by 0x7393E8A: VecScatterSetUp_vectype_private (vscat.c:857)
> >>> ==64730==    by 0x7395E5D: VecScatterSetUp_MPI1 (vpscat_mpi1.c:2543)
> >>> ==64730==    by 0x73DDD39: VecScatterSetUp (vscatfce.c:212)
> >>> ==64730==    by 0x73DCD73: VecScatterCreateWithData (vscreate.c:333)
> >>> ==64730==    by 0x7444232: VecCreateGhostWithArray (pbvec.c:685)
> >>> ==64730==    by 0x744490D: VecCreateGhost (pbvec.c:741)
> >>>
> >>> ==133582== Conditional jump or move depends on uninitialised value(s)
> >>> ==133582==    at 0x4030384: memcpy@@GLIBC_2.14
> >>> (vg_replace_strmem.c:1034)
> >>> ==133582==    by 0x739E4F9: PetscMemcpy (petscsys.h:1649)
> >>> ==133582==    by 0x739E4F9: VecScatterMemcpyPlanExecute_Pack
> >>> (vecscatterimpl.h:150)
> >>> ==133582==    by 0x739E4F9: VecScatterBeginMPI1_1 (vpscat_mpi1.h:69)
> >>> ==133582==    by 0x73DD964: VecScatterBegin (vscatfce.c:110)
> >>> ==133582==    by 0x744E195: VecGhostUpdateBegin (commonmpvec.c:225)
> >>>
> >>> This is from a Git checkout of PETSc... the hash I branched from is:
> >>> 0e667e8fea4aa from December 23rd (updating would be really hard at this
> >>> point as I've completed 90% of my dissertation with this version... and
> >>> changing PETSc now would be pretty painful!).
> >>>
> >>> Any ideas?  Is it possible it's in my code?  Is it possible that there
> >>> are later PETSc commits that already fix this?
> >>>
> >>> Thanks for any help,
> >>> Derek
> >>>
> >>>
> >>
> >> --
> >> Stefano
> >>
> >
> >
> 

