[petsc-users] Valgrind Issue With Ghosted Vectors
Zhang, Junchao
jczhang at mcs.anl.gov
Thu Mar 21 16:01:44 CDT 2019
On Thu, Mar 21, 2019 at 1:57 PM Derek Gaston via petsc-users <petsc-users at mcs.anl.gov> wrote:
It sounds like you already tracked this down... but for completeness here is what track-origins gives:
==262923== Conditional jump or move depends on uninitialised value(s)
==262923== at 0x73C6548: VecScatterMemcpyPlanCreate_Index (vscat.c:294)
==262923== by 0x73DBD97: VecScatterMemcpyPlanCreate_PtoP (vpscat_mpi1.c:312)
==262923== by 0x73DE6AE: VecScatterCreateCommon_PtoS_MPI1 (vpscat_mpi1.c:2328)
==262923== by 0x73DFFEA: VecScatterCreateLocal_PtoS_MPI1 (vpscat_mpi1.c:2202)
==262923== by 0x73C7A51: VecScatterCreate_PtoS (vscat.c:608)
==262923== by 0x73C9E8A: VecScatterSetUp_vectype_private (vscat.c:857)
==262923== by 0x73CBE5D: VecScatterSetUp_MPI1 (vpscat_mpi1.c:2543)
==262923== by 0x7413D39: VecScatterSetUp (vscatfce.c:212)
==262923== by 0x7412D73: VecScatterCreateWithData (vscreate.c:333)
==262923== by 0x747A232: VecCreateGhostWithArray (pbvec.c:685)
==262923== by 0x747A90D: VecCreateGhost (pbvec.c:741)
==262923== by 0x5C7FFD6: libMesh::PetscVector<double>::init(unsigned long, unsigned long, std::vector<unsigned long, std::allocator<unsigned long> > const&, bool, libMesh::ParallelType) (petsc_vector.h:752)
==262923== Uninitialised value was created by a heap allocation
I checked the code but could not figure out what was wrong. Perhaps you should use 64-bit integers and see whether the warning still exists. Please remember to incorporate Stefano's bug fix.
==262923== at 0x402DDC6: memalign (vg_replace_malloc.c:899)
==262923== by 0x7359702: PetscMallocAlign (mal.c:41)
==262923== by 0x7359C70: PetscMallocA (mal.c:390)
==262923== by 0x73DECF0: VecScatterCreateLocal_PtoS_MPI1 (vpscat_mpi1.c:2061)
==262923== by 0x73C7A51: VecScatterCreate_PtoS (vscat.c:608)
==262923== by 0x73C9E8A: VecScatterSetUp_vectype_private (vscat.c:857)
==262923== by 0x73CBE5D: VecScatterSetUp_MPI1 (vpscat_mpi1.c:2543)
==262923== by 0x7413D39: VecScatterSetUp (vscatfce.c:212)
==262923== by 0x7412D73: VecScatterCreateWithData (vscreate.c:333)
==262923== by 0x747A232: VecCreateGhostWithArray (pbvec.c:685)
==262923== by 0x747A90D: VecCreateGhost (pbvec.c:741)
==262923== by 0x5C7FFD6: libMesh::PetscVector<double>::init(unsigned long, unsigned long, std::vector<unsigned long, std::allocator<unsigned long> > const&, bool, libMesh::ParallelType) (petsc_vector.h:752)
BTW: This turned out not to be my actual problem. My actual problem was just some stupidity on my part... just a simple input parameter issue to my code (should have had better error checking!).
But: It sounds like my digging may have uncovered something real here... so it wasn't completely useless :-)
Thanks for your help everyone!
Derek
On Thu, Mar 21, 2019 at 10:38 AM Stefano Zampini <stefano.zampini at gmail.com> wrote:
On Wed, Mar 20, 2019 at 11:40 PM Derek Gaston via petsc-users <petsc-users at mcs.anl.gov> wrote:
Trying to track down some memory corruption I'm seeing on larger scale runs (3.5B+ unknowns).
Uhm... are you using 32-bit indices? Is it possible there's an integer overflow somewhere?
Was able to run Valgrind on it... and I'm seeing quite a lot of uninitialized value errors coming from ghost updating. Here are some of the traces:
==87695== Conditional jump or move depends on uninitialised value(s)
==87695== at 0x73236D3: PetscMallocAlign (mal.c:28)
==87695== by 0x7323C70: PetscMallocA (mal.c:390)
==87695== by 0x739048E: VecScatterMemcpyPlanCreate_Index (vscat.c:284)
==87695== by 0x73A5D97: VecScatterMemcpyPlanCreate_PtoP (vpscat_mpi1.c:312)
==64730== by 0x7393E8A: VecScatterSetUp_vectype_private (vscat.c:857)
==64730== by 0x7395E5D: VecScatterSetUp_MPI1 (vpscat_mpi1.c:2543)
==64730== by 0x73DDD39: VecScatterSetUp (vscatfce.c:212)
==64730== by 0x73DCD73: VecScatterCreateWithData (vscreate.c:333)
==64730== by 0x7444232: VecCreateGhostWithArray (pbvec.c:685)
==64730== by 0x744490D: VecCreateGhost (pbvec.c:741)
==133582== Conditional jump or move depends on uninitialised value(s)
==133582== at 0x4030384: memcpy@@GLIBC_2.14 (vg_replace_strmem.c:1034)
==133582== by 0x739E4F9: PetscMemcpy (petscsys.h:1649)
==133582== by 0x739E4F9: VecScatterMemcpyPlanExecute_Pack (vecscatterimpl.h:150)
==133582== by 0x739E4F9: VecScatterBeginMPI1_1 (vpscat_mpi1.h:69)
==133582== by 0x73DD964: VecScatterBegin (vscatfce.c:110)
==133582== by 0x744E195: VecGhostUpdateBegin (commonmpvec.c:225)
This is from a Git checkout of PETSc... the hash I branched from is: 0e667e8fea4aa from December 23rd (updating would be really hard at this point as I've completed 90% of my dissertation with this version... and changing PETSc now would be pretty painful!).
Any ideas? Is it possible it's in my code? Is it possible that there are later PETSc commits that already fix this?
Thanks for any help,
Derek
--
Stefano