[petsc-dev] use of uninitialized values in PetscSF
Barry Smith
bsmith at mcs.anl.gov
Fri Sep 18 23:40:18 CDT 2015
This should be reported to MPICH as a bug in their "valgrind clean" version of MPICH; since they describe it as valgrind clean, they should fix it.
Barry
> On Sep 18, 2015, at 11:24 PM, Satish Balay <balay at mcs.anl.gov> wrote:
>
> Added a suppression entry for valgrind:
> https://bitbucket.org/petsc/petsc/commits/e0e2be56e148ea4b94b878d4467bec9beeac7a78
>
> Satish
>
> On Fri, 18 Sep 2015, Satish Balay wrote:
>
>> Two options I could check:
>>
>> - switch to openmpi and see if that's valgrind clean
>> - create a suppression file for mpich for this error
>> [bin/maint/petsc-val.supp has an entry for a gfortran issue;
>> don't know if that issue still exists]
>>
>> Satish
>>
>> On Fri, 18 Sep 2015, Satish Balay wrote:
>>
>>> I just tried a build with
>>> http://www.mpich.org/static/downloads/3.2b4/mpich-3.2b4.tar.gz
>>> and still see the errors.
>>>
>>> Satish
>>>
>>> ----------
>>> balay at es^/scratch/balay/petsc/src/vec/is/sf/examples/tutorials(master=) $ /scratch/balay/petsc/arch-linux2-c-debug/bin/mpiexec -n 2 valgrind --tool=memcheck -q ./ex2 -sf_type window
>>> PetscSF Object: 2 MPI processes
>>> type: window
>>> synchronization=FENCE sort=rank-order
>>> [0] Number of roots=1, leaves=2, remote ranks=2
>>> [0] 0 <- (0,0)
>>> [0] 1 <- (1,0)
>>> [1] Number of roots=1, leaves=2, remote ranks=2
>>> [1] 0 <- (1,0)
>>> [1] 1 <- (0,0)
>>> ==13170== Syscall param writev(vector[...]) points to uninitialised byte(s)
>>> ==13170== at 0xA2EFD4B: writev (writev.c:51)
>>> ==13170== by 0x9C7D432: MPL_large_writev (mplsock.c:32)
>>> ==13170== by 0x9C6CD24: MPIDU_Sock_writev (sock_immed.i:610)
>>> ==13170== by 0x9C26793: MPIDI_CH3_iStartMsgv (ch3_istartmsgv.c:110)
>>> ==13170== by 0x9BB0184: issue_get_op (mpid_rma_issue.h:991)
>>> ==13170== by 0x9BB0923: issue_rma_op (mpid_rma_issue.h:1221)
>>> ==13170== by 0x9BB273F: issue_ops_target (ch3u_rma_progress.c:392)
>>> ==13170== by 0x9BB2DFB: issue_ops_win (ch3u_rma_progress.c:544)
>>> ==13170== by 0x9BB3FA4: MPIDI_CH3I_RMA_Make_progress_global (ch3u_rma_progress.c:972)
>>> ==13170== by 0x9C27A16: MPIDI_CH3i_Progress_wait (ch3_progress.c:192)
>>> ==13170== by 0x9C292B6: MPIDI_CH3I_Progress (ch3_progress.c:948)
>>> ==13170== by 0x9B34599: MPIC_Wait (helper_fns.c:225)
>>> ==13170== Address 0xc147574 is 4 bytes inside a block of size 168 alloc'd
>>> ==13170== at 0x4C2B6CD: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
>>> ==13170== by 0x9BAFFD5: issue_get_op (mpid_rma_issue.h:973)
>>> ==13170== by 0x9BB0923: issue_rma_op (mpid_rma_issue.h:1221)
>>> ==13170== by 0x9BB273F: issue_ops_target (ch3u_rma_progress.c:392)
>>> ==13170== by 0x9BB2DFB: issue_ops_win (ch3u_rma_progress.c:544)
>>> ==13170== by 0x9BB3FA4: MPIDI_CH3I_RMA_Make_progress_global (ch3u_rma_progress.c:972)
>>> ==13170== by 0x9C27A16: MPIDI_CH3i_Progress_wait (ch3_progress.c:192)
>>> ==13170== by 0x9C292B6: MPIDI_CH3I_Progress (ch3_progress.c:948)
>>> ==13170== by 0x9B34599: MPIC_Wait (helper_fns.c:225)
>>> ==13170== by 0x9B3491B: MPIC_Recv (helper_fns.c:355)
>>> ==13170== by 0x99AE746: MPIR_Bcast_binomial (bcast.c:234)
>>> ==13170== by 0x99B1309: MPIR_Bcast_intra (bcast.c:1283)
>>> ==13170==
>>> ==13169== Syscall param writev(vector[...]) points to uninitialised byte(s)
>>> ==13169== at 0xA2EFD4B: writev (writev.c:51)
>>> ==13169== by 0x9C7D432: MPL_large_writev (mplsock.c:32)
>>> ==13169== by 0x9C6CD24: MPIDU_Sock_writev (sock_immed.i:610)
>>> ==13169== by 0x9C26793: MPIDI_CH3_iStartMsgv (ch3_istartmsgv.c:110)
>>> ==13169== by 0x9BB0184: issue_get_op (mpid_rma_issue.h:991)
>>> ==13169== by 0x9BB0923: issue_rma_op (mpid_rma_issue.h:1221)
>>> ==13169== by 0x9BB273F: issue_ops_target (ch3u_rma_progress.c:392)
>>> ==13169== by 0x9BB2DFB: issue_ops_win (ch3u_rma_progress.c:544)
>>> ==13169== by 0x9BB3FA4: MPIDI_CH3I_RMA_Make_progress_global (ch3u_rma_progress.c:972)
>>> ==13169== by 0x9C27A16: MPIDI_CH3i_Progress_wait (ch3_progress.c:192)
>>> ==13169== by 0x9C292B6: MPIDI_CH3I_Progress (ch3_progress.c:948)
>>> ==13169== by 0x9B34599: MPIC_Wait (helper_fns.c:225)
>>> ==13169== Address 0xbcf90e4 is 4 bytes inside a block of size 168 alloc'd
>>> ==13169== at 0x4C2B6CD: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
>>> ==13169== by 0x9BAFFD5: issue_get_op (mpid_rma_issue.h:973)
>>> ==13169== by 0x9BB0923: issue_rma_op (mpid_rma_issue.h:1221)
>>> ==13169== by 0x9BB273F: issue_ops_target (ch3u_rma_progress.c:392)
>>> ==13169== by 0x9BB2DFB: issue_ops_win (ch3u_rma_progress.c:544)
>>> ==13169== by 0x9BB3FA4: MPIDI_CH3I_RMA_Make_progress_global (ch3u_rma_progress.c:972)
>>> ==13169== by 0x9C27A16: MPIDI_CH3i_Progress_wait (ch3_progress.c:192)
>>> ==13169== by 0x9C292B6: MPIDI_CH3I_Progress (ch3_progress.c:948)
>>> ==13169== by 0x9B34599: MPIC_Wait (helper_fns.c:225)
>>> ==13169== by 0x9B3491B: MPIC_Recv (helper_fns.c:355)
>>> ==13169== by 0x99B8B44: MPIR_Reduce_binomial (reduce.c:181)
>>> ==13169== by 0x99BAC95: MPIR_Reduce_intra (reduce.c:874)
>>> ==13169==
>>> Vec Object: 2 MPI processes
>>> type: mpi
>>> Process [0]
>>> 0
>>> 1
>>> Process [1]
>>> 1
>>> 0
>>> Vec Object: 2 MPI processes
>>> type: mpi
>>> Process [0]
>>> 10
>>> 11
>>> Process [1]
>>> 11
>>> 10
>>> balay at es^/scratch/balay/petsc/src/vec/is/sf/examples/tutorials(master=) $
>>>
>>>
>>> On Fri, 18 Sep 2015, Jed Brown wrote:
>>>
>>>> Barry Smith <bsmith at mcs.anl.gov> writes:
>>>>
>>>>> Is there any way to fix this use of uninitialized values?
>>>>
>>>> Hmm, I don't think it's something we have any control over. I could try
>>>> making a reduced test case to put in MPICH's test suite. Is this with
>>>> the latest version of MPICH?
>>>>
>>>>> 9a10,67
>>>>>> ==31141== Syscall param writev(vector[...]) points to uninitialised byte(s)
>>>>>> ==31141== at 0xD164CDB: writev (writev.c:51)
>>>>>> ==31141== by 0xCB13596: MPL_large_writev (mplsock.c:32)
>>>>>> ==31141== by 0xCB02DC9: MPIDU_Sock_writev (sock_immed.i:610)
>>>>>> ==31141== by 0xCAC00DD: MPIDI_CH3_iStartMsgv (ch3_istartmsgv.c:110)
>>>>>> ==31141== by 0xCA7DC91: recv_rma_msg (ch3u_rma_sync.c:2198)
>>>>>> ==31141== by 0xCA79CB6: MPIDI_Win_fence (ch3u_rma_sync.c:1295)
>>>>>> ==31141== by 0xC9AD840: PMPI_Win_fence (win_fence.c:111)
>>>>>> ==31141== by 0x51DA7E1: PetscSFRestoreWindow (sfwindow.c:348)
>>>>>> ==31141== by 0x51DD0C9: PetscSFBcastEnd_Window (sfwindow.c:510)
>>>>>> ==31141== by 0x51D0B66: PetscSFBcastEnd (sf.c:1001)
>>>>>> ==31141== by 0x401EB7: main (ex2.c:81)
>>>>>> ==31141== Address 0xe50d12c is 108 bytes inside a block of size 208 alloc'd
>>>>>> ==31141== at 0x4C2B6CD: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
>>>>>> ==31141== by 0xCA7328D: MPIDI_CH3I_RMA_Ops_alloc_tail (mpidrma.h:191)
>>>>>> ==31141== by 0xCA75A49: MPIDI_Get (ch3u_rma_ops.c:290)
>>>>>> ==31141== by 0xC99EC9C: PMPI_Get (get.c:142)
>>>>>> ==31141== by 0x51DCC62: PetscSFBcastBegin_Window (sfwindow.c:495)
>>>>>> ==31141== by 0x51D03C4: PetscSFBcastBegin (sf.c:968)
>>>>>> ==31141== by 0x401DDB: main (ex2.c:79)
>>>>>> ==31141== Uninitialised value was created by a heap allocation
>>>>>> ==31141== at 0x4C2B6CD: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
>>>>>> ==31141== by 0xCA7328D: MPIDI_CH3I_RMA_Ops_alloc_tail (mpidrma.h:191)
>>>>>> ==31141== by 0xCA75A49: MPIDI_Get (ch3u_rma_ops.c:290)
>>>>>> ==31141== by 0xC99EC9C: PMPI_Get (get.c:142)
>>>>>> ==31141== by 0x51DCC62: PetscSFBcastBegin_Window (sfwindow.c:495)
>>>>>> ==31141== by 0x51D03C4: PetscSFBcastBegin (sf.c:968)
>>>>>> ==31141== by 0x401DDB: main (ex2.c:79)
>>>>>> ==31141==
>>>>>> ==31142== Syscall param writev(vector[...]) points to uninitialised byte(s)
>>>>>> ==31142== at 0xD164CDB: writev (writev.c:51)
>>>>>> ==31142== by 0xCB13596: MPL_large_writev (mplsock.c:32)
>>>>>> ==31142== by 0xCB02DC9: MPIDU_Sock_writev (sock_immed.i:610)
>>>>>> ==31142== by 0xCAC00DD: MPIDI_CH3_iStartMsgv (ch3_istartmsgv.c:110)
>>>>>> ==31142== by 0xCA7DC91: recv_rma_msg (ch3u_rma_sync.c:2198)
>>>>>> ==31142== by 0xCA79CB6: MPIDI_Win_fence (ch3u_rma_sync.c:1295)
>>>>>> ==31142== by 0xC9AD840: PMPI_Win_fence (win_fence.c:111)
>>>>>> ==31142== by 0x51DA7E1: PetscSFRestoreWindow (sfwindow.c:348)
>>>>>> ==31142== by 0x51DD0C9: PetscSFBcastEnd_Window (sfwindow.c:510)
>>>>>> ==31142== by 0x51D0B66: PetscSFBcastEnd (sf.c:1001)
>>>>>> ==31142== by 0x401EB7: main (ex2.c:81)
>>>>>> ==31142== Address 0xe49d88c is 108 bytes inside a block of size 208 alloc'd
>>>>>> ==31142== at 0x4C2B6CD: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
>>>>>> ==31142== by 0xCA7328D: MPIDI_CH3I_RMA_Ops_alloc_tail (mpidrma.h:191)
>>>>>> ==31142== by 0xCA75A49: MPIDI_Get (ch3u_rma_ops.c:290)
>>>>>> ==31142== by 0xC99EC9C: PMPI_Get (get.c:142)
>>>>>> ==31142== by 0x51DCC62: PetscSFBcastBegin_Window (sfwindow.c:495)
>>>>>> ==31142== by 0x51D03C4: PetscSFBcastBegin (sf.c:968)
>>>>>> ==31142== by 0x401DDB: main (ex2.c:79)
>>>>>> ==31142== Uninitialised value was created by a heap allocation
>>>>>> ==31142== at 0x4C2B6CD: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
>>>>>> ==31142== by 0xCA7328D: MPIDI_CH3I_RMA_Ops_alloc_tail (mpidrma.h:191)
>>>>>> ==31142== by 0xCA75A49: MPIDI_Get (ch3u_rma_ops.c:290)
>>>>>> ==31142== by 0xC99EC9C: PMPI_Get (get.c:142)
>>>>>> ==31142== by 0x51DCC62: PetscSFBcastBegin_Window (sfwindow.c:495)
>>>>>> ==31142== by 0x51D03C4: PetscSFBcastBegin (sf.c:968)
>>>>>> ==31142== by 0x401DDB: main (ex2.c:79)
>>>>>> ==31142==
>>>>> /sandbox/petsc/petsc.clone/src/vec/is/sf/examples/tutorials
>>>>> Possible problem with ex2_window, diffs above
>>>>
>>>
>>>
>>
>>
>
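
For reference, a Memcheck suppression for the writev report quoted above could look roughly like the sketch below. It is an illustration assembled from the top four frames that all of the traces in this thread share, not a copy of the entry in the commit linked above, and the entry name mpich_writev_uninit_sketch is made up.

```
# Hypothetical entry name; frames taken from the traces quoted in this thread.
{
   mpich_writev_uninit_sketch
   Memcheck:Param
   writev(vector[...])
   fun:writev
   fun:MPL_large_writev
   fun:MPIDU_Sock_writev
   fun:MPIDI_CH3_iStartMsgv
}
```

Valgrind reads such an entry when the file containing it is passed with --suppressions=<file> on the valgrind command line. Because a suppression matches on the innermost frames it lists, this one should cover both the issue_get_op and the recv_rma_msg call paths. It would also hide any other uninitialised-byte report that reaches writev through MPIDI_CH3_iStartMsgv, so it is a workaround rather than a fix.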