[petsc-users] Valgrind Issue With Ghosted Vectors

Stefano Zampini stefano.zampini at gmail.com
Thu Mar 21 03:00:55 CDT 2019


Derek

I fixed the optimized plan a few weeks ago:

https://bitbucket.org/petsc/petsc/commits/c3caad8634d376283f7053f3b388606b45b3122c

Maybe this will fix your problem too?

Stefano


On Thu, Mar 21, 2019, 04:21 Zhang, Junchao via petsc-users <
petsc-users at mcs.anl.gov> wrote:

> Hi, Derek,
>   Try applying this tiny (but dirty) patch to your version of PETSc. It
> disables the VecScatterMemcpyPlan optimization, so you can see if that helps.
>   Thanks.
> --Junchao Zhang
>
> On Wed, Mar 20, 2019 at 6:33 PM Junchao Zhang <jczhang at mcs.anl.gov> wrote:
>
>> Did you see the warning in small-scale runs?  Is it possible to provide
>> a test code?
>> You mentioned "changing PETSc now would be pretty painful". Is it because
>> it will affect your performance (but not your code)?  If yes, could you try
>> PETSc master and run your code with or without -vecscatter_type sf.  I want
>> to isolate the problem and see if it is due to possible bugs in VecScatter.
>> If the above suggestion is not feasible, I will disable VecScatterMemcpy.
>> It is an optimization I added. Sorry I did not add an option to turn it
>> off because I thought it was always useful :)  I will provide you a patch
>> later to disable it. With that you can run again to isolate possible bugs
>> in VecScatterMemcpy.
>> Thanks.
>> --Junchao Zhang
>>
>>
>> On Wed, Mar 20, 2019 at 5:40 PM Derek Gaston via petsc-users <
>> petsc-users at mcs.anl.gov> wrote:
>>
>>> Trying to track down some memory corruption I'm seeing on larger scale
>>> runs (3.5B+ unknowns).  Was able to run Valgrind on it... and I'm seeing
>>> quite a lot of uninitialized value errors coming from ghost updating.  Here
>>> are some of the traces:
>>>
>>> ==87695== Conditional jump or move depends on uninitialised value(s)
>>> ==87695==    at 0x73236D3: PetscMallocAlign (mal.c:28)
>>> ==87695==    by 0x7323C70: PetscMallocA (mal.c:390)
>>> ==87695==    by 0x739048E: VecScatterMemcpyPlanCreate_Index (vscat.c:284)
>>> ==87695==    by 0x73A5D97: VecScatterMemcpyPlanCreate_PtoP (vpscat_mpi1.c:312)
>>> ==64730==    by 0x7393E8A: VecScatterSetUp_vectype_private (vscat.c:857)
>>> ==64730==    by 0x7395E5D: VecScatterSetUp_MPI1 (vpscat_mpi1.c:2543)
>>> ==64730==    by 0x73DDD39: VecScatterSetUp (vscatfce.c:212)
>>> ==64730==    by 0x73DCD73: VecScatterCreateWithData (vscreate.c:333)
>>> ==64730==    by 0x7444232: VecCreateGhostWithArray (pbvec.c:685)
>>> ==64730==    by 0x744490D: VecCreateGhost (pbvec.c:741)
>>>
>>> ==133582== Conditional jump or move depends on uninitialised value(s)
>>> ==133582==    at 0x4030384: memcpy@@GLIBC_2.14 (vg_replace_strmem.c:1034)
>>> ==133582==    by 0x739E4F9: PetscMemcpy (petscsys.h:1649)
>>> ==133582==    by 0x739E4F9: VecScatterMemcpyPlanExecute_Pack (vecscatterimpl.h:150)
>>> ==133582==    by 0x739E4F9: VecScatterBeginMPI1_1 (vpscat_mpi1.h:69)
>>> ==133582==    by 0x73DD964: VecScatterBegin (vscatfce.c:110)
>>> ==133582==    by 0x744E195: VecGhostUpdateBegin (commonmpvec.c:225)
>>>
>>> This is from a Git checkout of PETSc... the hash I branched from is:
>>> 0e667e8fea4aa from December 23rd (updating would be really hard at this
>>> point as I've completed 90% of my dissertation with this version... and
>>> changing PETSc now would be pretty painful!).
>>>
>>> Any ideas?  Is it possible it's in my code?  Is it possible that there
>>> are later PETSc commits that already fix this?
>>>
>>> Thanks for any help,
>>> Derek
>>>
>>>
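
For readers unfamiliar with the optimization discussed in this thread, here is
a minimal, self-contained sketch of the general idea behind a "memcpy plan":
scan a list of indices once, record runs of consecutive indices, and later
pack each run with a single memcpy instead of copying entry by entry. This is
an illustration only, not PETSc's implementation. The names CopyPlan,
copy_plan_create, copy_plan_pack and copy_plan_destroy are hypothetical, and
the assumption that the plan is built from contiguous index runs is inferred
from the function names in the traces above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical plan: the contiguous runs found in an index array. */
typedef struct {
  size_t  nruns;   /* number of contiguous runs */
  size_t *start;   /* start[r]: first source index of run r */
  size_t *length;  /* length[r]: number of entries in run r */
} CopyPlan;

/* Build a plan from idx[0..n-1]. Returns 0 on success, -1 if allocation fails. */
int copy_plan_create(const size_t *idx, size_t n, CopyPlan *plan)
{
  assert(plan != NULL);
  plan->nruns  = 0;
  plan->start  = NULL;
  plan->length = NULL;
  if (n == 0) return 0;

  /* calloc zero-fills both arrays, so every entry has a defined value. */
  plan->start  = calloc(n, sizeof *plan->start);
  plan->length = calloc(n, sizeof *plan->length);
  if (!plan->start || !plan->length) {
    free(plan->start);
    free(plan->length);
    plan->start = plan->length = NULL;
    return -1;
  }

  for (size_t i = 0; i < n; i++) {
    size_t r = plan->nruns;
    if (r > 0 && idx[i] == plan->start[r - 1] + plan->length[r - 1]) {
      plan->length[r - 1]++;          /* idx[i] extends the current run */
    } else {
      plan->start[r]  = idx[i];       /* idx[i] opens a new run */
      plan->length[r] = 1;
      plan->nruns++;
    }
  }
  return 0;
}

/* Pack the planned entries of src into buf, one memcpy per run. */
void copy_plan_pack(const CopyPlan *plan, const double *src, double *buf)
{
  size_t off = 0;
  for (size_t r = 0; r < plan->nruns; r++) {
    memcpy(buf + off, src + plan->start[r], plan->length[r] * sizeof *src);
    off += plan->length[r];
  }
}

/* Release the plan's arrays and reset it to the empty state. */
void copy_plan_destroy(CopyPlan *plan)
{
  free(plan->start);
  free(plan->length);
  plan->start = plan->length = NULL;
  plan->nruns = 0;
}
```

In this sketch the run starts and lengths feed memcpy directly, so they must
all hold defined values before the plan is executed. Whether anything similar
is behind the warnings reported above is not established in this thread.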