[petsc-dev] Valgrind MPI-Related Errors

Jacob Faibussowitsch jacob.fai at gmail.com
Mon Jun 1 19:43:30 CDT 2020


Hello All,

TL;DR: valgrind always complains about "Syscall param write(buf) points to uninitialised byte(s)" for a LOT of MPI operations in PETSc code, which makes debugging with valgrind fairly annoying since I have to sort through a ton of unrelated output. I have tried building valgrind from source, apt install valgrind, and apt install valgrind-mpi, all to no avail.

I am running valgrind inside Docker; the Dockerfile is attached below. I have been trying, unsuccessfully, to resolve these valgrind errors locally, but I am running out of ideas. Googling the issue has not turned up any solutions that really apply. Here is an example of the error:

$ make -f gmakefile test VALGRIND=1
...
#	==54610== Syscall param write(buf) points to uninitialised byte(s)
#	==54610==    at 0x6F63317: write (write.c:26)
#	==54610==    by 0x9056AC9: MPIDI_CH3I_Sock_write (in /usr/local/lib/libmpi.so.12.1.8)
#	==54610==    by 0x9059FCD: MPIDI_CH3_iStartMsg (in /usr/local/lib/libmpi.so.12.1.8)
#	==54610==    by 0x903F298: MPIDI_CH3_EagerContigShortSend (in /usr/local/lib/libmpi.so.12.1.8)
#	==54610==    by 0x9049479: MPID_Send (in /usr/local/lib/libmpi.so.12.1.8)
#	==54610==    by 0x8FC9B2A: MPIC_Send (in /usr/local/lib/libmpi.so.12.1.8)
#	==54610==    by 0x8F86F2E: MPIR_Bcast_intra_binomial (in /usr/local/lib/libmpi.so.12.1.8)
#	==54610==    by 0x8EE204E: MPIR_Bcast_intra_auto (in /usr/local/lib/libmpi.so.12.1.8)
#	==54610==    by 0x8EE21F4: MPIR_Bcast_impl (in /usr/local/lib/libmpi.so.12.1.8)
#	==54610==    by 0x8F887FB: MPIR_Bcast_intra_smp (in /usr/local/lib/libmpi.so.12.1.8)
#	==54610==    by 0x8EE206E: MPIR_Bcast_intra_auto (in /usr/local/lib/libmpi.so.12.1.8)
#	==54610==    by 0x8EE21F4: MPIR_Bcast_impl (in /usr/local/lib/libmpi.so.12.1.8)
#	==54610==    by 0x8EE2A6F: PMPI_Bcast (in /usr/local/lib/libmpi.so.12.1.8)
#	==54610==    by 0x4B377B8: PetscOptionsInsertFile (options.c:525)
#	==54610==    by 0x4B39291: PetscOptionsInsert (options.c:672)
#	==54610==    by 0x4B5B1EF: PetscInitialize (pinit.c:996)
#	==54610==    by 0x10A6BA: main (ex9.c:75)
#	==54610==  Address 0x1ffeffa944 is on thread 1's stack
#	==54610==  in frame #3, created by MPIDI_CH3_EagerContigShortSend (???:)
#	==54610==  Uninitialised value was created by a stack allocation
#	==54610==    at 0x903F200: MPIDI_CH3_EagerContigShortSend (in /usr/local/lib/libmpi.so.12.1.8)

There are probably 20 such errors on every single run, regardless of what code is being run. I have tried apt install valgrind, apt install valgrind-mpi, and building valgrind from source with the following Dockerfile steps:

# VALGRIND
WORKDIR /
RUN git clone git://sourceware.org/git/valgrind.git
WORKDIR /valgrind
RUN git pull
RUN ./autogen.sh
RUN ./configure --with-mpicc=/usr/local/bin/mpicc
RUN make -j 5
RUN make install 

None of those approaches made these errors disappear. Perhaps I am missing some funky MPI args?
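
As a stopgap, a valgrind suppression along the following lines might hide this particular report. This is only a sketch: the file name mpich-write.supp and the suppression name are hypothetical, the two frames are copied from the top of the trace above, and ./ex9 merely stands in for whichever test binary is being run. It masks the symptom rather than explaining it, and it would need adjusting if the MPI library reaches write() through a different path.

```shell
# Write a Memcheck suppression for the "write(buf)" report seen in the trace.
# File name and suppression name are placeholders chosen for illustration.
cat > mpich-write.supp <<'EOF'
{
   mpich_ch3_sock_write_uninit
   Memcheck:Param
   write(buf)
   fun:write
   fun:MPIDI_CH3I_Sock_write
}
EOF

# The file can then be passed to a single valgrind run, for example:
#   valgrind --suppressions=mpich-write.supp --track-origins=yes ./ex9
```

The suppression matches only reports whose innermost frames are write called from MPIDI_CH3I_Sock_write, so uninitialised writes originating elsewhere should still be reported.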

Best regards,

Jacob Faibussowitsch
(Jacob Fai - booss - oh - vitch)
Cell: (312) 694-3391

-------------- next part --------------
A non-text attachment was scrubbed...
Name: Dockerfile
Type: application/octet-stream
Size: 1622 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/petsc-dev/attachments/20200601/a17c478b/attachment-0001.obj>