[petsc-dev] use of uninitialized values in PetscSF

Satish Balay balay at mcs.anl.gov
Fri Sep 18 22:54:54 CDT 2015


Two options I could check:

- switch to openmpi - and see if that's valgrind clean
- create a suppression file for mpich - for this error.
[bin/maint/petsc-val.supp has an entry for a gfortran issue.
I don't know if that issue still exists.]
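
For the second option, a suppression entry might look roughly like the sketch below. The frame list is copied from the top of the traces quoted further down. The entry name and the file name mpich-writev.supp are made up for illustration, and whether it is safe to silence every writev report coming through MPIDI_CH3_iStartMsgv is an assumption that would need checking.

```
# Hypothetical entry, e.g. saved as mpich-writev.supp (name is an assumption).
# Matches the "Syscall param writev(vector[...])" report whose innermost
# frames are the four MPICH socket-channel functions seen in both traces.
{
   mpich_ch3_sock_writev_uninit
   Memcheck:Param
   writev(vector[...])
   fun:writev
   fun:MPL_large_writev
   fun:MPIDU_Sock_writev
   fun:MPIDI_CH3_iStartMsgv
}
```

The file would be passed to valgrind with --suppressions=mpich-writev.supp in addition to the options used in the run below. Only the innermost frames are listed, so the same entry covers both the issue_get_op and the recv_rma_msg call paths.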

Satish

On Fri, 18 Sep 2015, Satish Balay wrote:

> I just tried a build with
> http://www.mpich.org/static/downloads/3.2b4/mpich-3.2b4.tar.gz
> and still see the errors.
> 
> Satish
> 
> ----------
> balay at es^/scratch/balay/petsc/src/vec/is/sf/examples/tutorials(master=) $ /scratch/balay/petsc/arch-linux2-c-debug/bin/mpiexec -n 2 valgrind --tool=memcheck -q ./ex2 -sf_type window
> PetscSF Object: 2 MPI processes
>   type: window
>     synchronization=FENCE sort=rank-order
>   [0] Number of roots=1, leaves=2, remote ranks=2
>   [0] 0 <- (0,0)
>   [0] 1 <- (1,0)
>   [1] Number of roots=1, leaves=2, remote ranks=2
>   [1] 0 <- (1,0)
>   [1] 1 <- (0,0)
> ==13170== Syscall param writev(vector[...]) points to uninitialised byte(s)
> ==13170==    at 0xA2EFD4B: writev (writev.c:51)
> ==13170==    by 0x9C7D432: MPL_large_writev (mplsock.c:32)
> ==13170==    by 0x9C6CD24: MPIDU_Sock_writev (sock_immed.i:610)
> ==13170==    by 0x9C26793: MPIDI_CH3_iStartMsgv (ch3_istartmsgv.c:110)
> ==13170==    by 0x9BB0184: issue_get_op (mpid_rma_issue.h:991)
> ==13170==    by 0x9BB0923: issue_rma_op (mpid_rma_issue.h:1221)
> ==13170==    by 0x9BB273F: issue_ops_target (ch3u_rma_progress.c:392)
> ==13170==    by 0x9BB2DFB: issue_ops_win (ch3u_rma_progress.c:544)
> ==13170==    by 0x9BB3FA4: MPIDI_CH3I_RMA_Make_progress_global (ch3u_rma_progress.c:972)
> ==13170==    by 0x9C27A16: MPIDI_CH3i_Progress_wait (ch3_progress.c:192)
> ==13170==    by 0x9C292B6: MPIDI_CH3I_Progress (ch3_progress.c:948)
> ==13170==    by 0x9B34599: MPIC_Wait (helper_fns.c:225)
> ==13170==  Address 0xc147574 is 4 bytes inside a block of size 168 alloc'd
> ==13170==    at 0x4C2B6CD: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==13170==    by 0x9BAFFD5: issue_get_op (mpid_rma_issue.h:973)
> ==13170==    by 0x9BB0923: issue_rma_op (mpid_rma_issue.h:1221)
> ==13170==    by 0x9BB273F: issue_ops_target (ch3u_rma_progress.c:392)
> ==13170==    by 0x9BB2DFB: issue_ops_win (ch3u_rma_progress.c:544)
> ==13170==    by 0x9BB3FA4: MPIDI_CH3I_RMA_Make_progress_global (ch3u_rma_progress.c:972)
> ==13170==    by 0x9C27A16: MPIDI_CH3i_Progress_wait (ch3_progress.c:192)
> ==13170==    by 0x9C292B6: MPIDI_CH3I_Progress (ch3_progress.c:948)
> ==13170==    by 0x9B34599: MPIC_Wait (helper_fns.c:225)
> ==13170==    by 0x9B3491B: MPIC_Recv (helper_fns.c:355)
> ==13170==    by 0x99AE746: MPIR_Bcast_binomial (bcast.c:234)
> ==13170==    by 0x99B1309: MPIR_Bcast_intra (bcast.c:1283)
> ==13170== 
> ==13169== Syscall param writev(vector[...]) points to uninitialised byte(s)
> ==13169==    at 0xA2EFD4B: writev (writev.c:51)
> ==13169==    by 0x9C7D432: MPL_large_writev (mplsock.c:32)
> ==13169==    by 0x9C6CD24: MPIDU_Sock_writev (sock_immed.i:610)
> ==13169==    by 0x9C26793: MPIDI_CH3_iStartMsgv (ch3_istartmsgv.c:110)
> ==13169==    by 0x9BB0184: issue_get_op (mpid_rma_issue.h:991)
> ==13169==    by 0x9BB0923: issue_rma_op (mpid_rma_issue.h:1221)
> ==13169==    by 0x9BB273F: issue_ops_target (ch3u_rma_progress.c:392)
> ==13169==    by 0x9BB2DFB: issue_ops_win (ch3u_rma_progress.c:544)
> ==13169==    by 0x9BB3FA4: MPIDI_CH3I_RMA_Make_progress_global (ch3u_rma_progress.c:972)
> ==13169==    by 0x9C27A16: MPIDI_CH3i_Progress_wait (ch3_progress.c:192)
> ==13169==    by 0x9C292B6: MPIDI_CH3I_Progress (ch3_progress.c:948)
> ==13169==    by 0x9B34599: MPIC_Wait (helper_fns.c:225)
> ==13169==  Address 0xbcf90e4 is 4 bytes inside a block of size 168 alloc'd
> ==13169==    at 0x4C2B6CD: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==13169==    by 0x9BAFFD5: issue_get_op (mpid_rma_issue.h:973)
> ==13169==    by 0x9BB0923: issue_rma_op (mpid_rma_issue.h:1221)
> ==13169==    by 0x9BB273F: issue_ops_target (ch3u_rma_progress.c:392)
> ==13169==    by 0x9BB2DFB: issue_ops_win (ch3u_rma_progress.c:544)
> ==13169==    by 0x9BB3FA4: MPIDI_CH3I_RMA_Make_progress_global (ch3u_rma_progress.c:972)
> ==13169==    by 0x9C27A16: MPIDI_CH3i_Progress_wait (ch3_progress.c:192)
> ==13169==    by 0x9C292B6: MPIDI_CH3I_Progress (ch3_progress.c:948)
> ==13169==    by 0x9B34599: MPIC_Wait (helper_fns.c:225)
> ==13169==    by 0x9B3491B: MPIC_Recv (helper_fns.c:355)
> ==13169==    by 0x99B8B44: MPIR_Reduce_binomial (reduce.c:181)
> ==13169==    by 0x99BAC95: MPIR_Reduce_intra (reduce.c:874)
> ==13169== 
> Vec Object: 2 MPI processes
>   type: mpi
> Process [0]
> 0
> 1
> Process [1]
> 1
> 0
> Vec Object: 2 MPI processes
>   type: mpi
> Process [0]
> 10
> 11
> Process [1]
> 11
> 10
> balay at es^/scratch/balay/petsc/src/vec/is/sf/examples/tutorials(master=) $ 
> 
> 
> On Fri, 18 Sep 2015, Jed Brown wrote:
> 
> > Barry Smith <bsmith at mcs.anl.gov> writes:
> > 
> > >   Is there any way to fix this use of uninitialized values?
> > 
> > Hmm, I don't think it's something we have any control over.  I could try
> > making a reduced test case to put in MPICH's test suite.  Is this with
> > the latest version of MPICH?
> > 
> > > 9a10,67
> > >> ==31141== Syscall param writev(vector[...]) points to uninitialised byte(s)
> > >> ==31141==    at 0xD164CDB: writev (writev.c:51)
> > >> ==31141==    by 0xCB13596: MPL_large_writev (mplsock.c:32)
> > >> ==31141==    by 0xCB02DC9: MPIDU_Sock_writev (sock_immed.i:610)
> > >> ==31141==    by 0xCAC00DD: MPIDI_CH3_iStartMsgv (ch3_istartmsgv.c:110)
> > >> ==31141==    by 0xCA7DC91: recv_rma_msg (ch3u_rma_sync.c:2198)
> > >> ==31141==    by 0xCA79CB6: MPIDI_Win_fence (ch3u_rma_sync.c:1295)
> > >> ==31141==    by 0xC9AD840: PMPI_Win_fence (win_fence.c:111)
> > >> ==31141==    by 0x51DA7E1: PetscSFRestoreWindow (sfwindow.c:348)
> > >> ==31141==    by 0x51DD0C9: PetscSFBcastEnd_Window (sfwindow.c:510)
> > >> ==31141==    by 0x51D0B66: PetscSFBcastEnd (sf.c:1001)
> > >> ==31141==    by 0x401EB7: main (ex2.c:81)
> > >> ==31141==  Address 0xe50d12c is 108 bytes inside a block of size 208 alloc'd
> > >> ==31141==    at 0x4C2B6CD: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> > >> ==31141==    by 0xCA7328D: MPIDI_CH3I_RMA_Ops_alloc_tail (mpidrma.h:191)
> > >> ==31141==    by 0xCA75A49: MPIDI_Get (ch3u_rma_ops.c:290)
> > >> ==31141==    by 0xC99EC9C: PMPI_Get (get.c:142)
> > >> ==31141==    by 0x51DCC62: PetscSFBcastBegin_Window (sfwindow.c:495)
> > >> ==31141==    by 0x51D03C4: PetscSFBcastBegin (sf.c:968)
> > >> ==31141==    by 0x401DDB: main (ex2.c:79)
> > >> ==31141==  Uninitialised value was created by a heap allocation
> > >> ==31141==    at 0x4C2B6CD: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> > >> ==31141==    by 0xCA7328D: MPIDI_CH3I_RMA_Ops_alloc_tail (mpidrma.h:191)
> > >> ==31141==    by 0xCA75A49: MPIDI_Get (ch3u_rma_ops.c:290)
> > >> ==31141==    by 0xC99EC9C: PMPI_Get (get.c:142)
> > >> ==31141==    by 0x51DCC62: PetscSFBcastBegin_Window (sfwindow.c:495)
> > >> ==31141==    by 0x51D03C4: PetscSFBcastBegin (sf.c:968)
> > >> ==31141==    by 0x401DDB: main (ex2.c:79)
> > >> ==31141== 
> > >> ==31142== Syscall param writev(vector[...]) points to uninitialised byte(s)
> > >> ==31142==    at 0xD164CDB: writev (writev.c:51)
> > >> ==31142==    by 0xCB13596: MPL_large_writev (mplsock.c:32)
> > >> ==31142==    by 0xCB02DC9: MPIDU_Sock_writev (sock_immed.i:610)
> > >> ==31142==    by 0xCAC00DD: MPIDI_CH3_iStartMsgv (ch3_istartmsgv.c:110)
> > >> ==31142==    by 0xCA7DC91: recv_rma_msg (ch3u_rma_sync.c:2198)
> > >> ==31142==    by 0xCA79CB6: MPIDI_Win_fence (ch3u_rma_sync.c:1295)
> > >> ==31142==    by 0xC9AD840: PMPI_Win_fence (win_fence.c:111)
> > >> ==31142==    by 0x51DA7E1: PetscSFRestoreWindow (sfwindow.c:348)
> > >> ==31142==    by 0x51DD0C9: PetscSFBcastEnd_Window (sfwindow.c:510)
> > >> ==31142==    by 0x51D0B66: PetscSFBcastEnd (sf.c:1001)
> > >> ==31142==    by 0x401EB7: main (ex2.c:81)
> > >> ==31142==  Address 0xe49d88c is 108 bytes inside a block of size 208 alloc'd
> > >> ==31142==    at 0x4C2B6CD: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> > >> ==31142==    by 0xCA7328D: MPIDI_CH3I_RMA_Ops_alloc_tail (mpidrma.h:191)
> > >> ==31142==    by 0xCA75A49: MPIDI_Get (ch3u_rma_ops.c:290)
> > >> ==31142==    by 0xC99EC9C: PMPI_Get (get.c:142)
> > >> ==31142==    by 0x51DCC62: PetscSFBcastBegin_Window (sfwindow.c:495)
> > >> ==31142==    by 0x51D03C4: PetscSFBcastBegin (sf.c:968)
> > >> ==31142==    by 0x401DDB: main (ex2.c:79)
> > >> ==31142==  Uninitialised value was created by a heap allocation
> > >> ==31142==    at 0x4C2B6CD: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> > >> ==31142==    by 0xCA7328D: MPIDI_CH3I_RMA_Ops_alloc_tail (mpidrma.h:191)
> > >> ==31142==    by 0xCA75A49: MPIDI_Get (ch3u_rma_ops.c:290)
> > >> ==31142==    by 0xC99EC9C: PMPI_Get (get.c:142)
> > >> ==31142==    by 0x51DCC62: PetscSFBcastBegin_Window (sfwindow.c:495)
> > >> ==31142==    by 0x51D03C4: PetscSFBcastBegin (sf.c:968)
> > >> ==31142==    by 0x401DDB: main (ex2.c:79)
> > >> ==31142== 
> > > /sandbox/petsc/petsc.clone/src/vec/is/sf/examples/tutorials
> > > Possible problem with ex2_window, diffs above
> > 
> 
> 



