[petsc-dev] valgrind question

Satish Balay balay at mcs.anl.gov
Wed Mar 2 16:22:44 CST 2016


Also suggest using mpich-3.1.3 - as latest mpich is not valgrind clean.

Satish

On Wed, 2 Mar 2016, Barry Smith wrote:

> 
>    When you configured MPICH did you use the flag --enable-g=meminit so it would not generate its own valgrind errors?
> 
>   Barry
> 
> > On Mar 2, 2016, at 4:11 PM, Xiaoye S. Li <xsli at lbl.gov> wrote:
> > 
> > I checked that file; it also shows "not stripped". Not sure why it doesn't work. For now I am using a static library build to run valgrind, which works fine.
> > 
> > Now on to the valgrind output: quite a few of the warnings look unnecessary to me. For example,
> > 
> > ==13292== Conditional jump or move depends on uninitialised value(s)
> > ==13292==    at 0x5452D86: MPIC_Waitall (in /home/xiaoye/mpich-install/lib/libmpi.so.12.1.0)
> > ==13292==    by 0x53AB23F: MPIR_Alltoall_intra (in /home/xiaoye/mpich-install/lib/libmpi.so.12.1.0)
> > ==13292==    by 0x53ABFD4: MPIR_Alltoall (in /home/xiaoye/mpich-install/lib/libmpi.so.12.1.0)
> > ==13292==    by 0x53AC08D: MPIR_Alltoall_impl (in /home/xiaoye/mpich-install/lib/libmpi.so.12.1.0)
> > ==13292==    by 0x53AC896: PMPI_Alltoall (in /home/xiaoye/mpich-install/lib/libmpi.so.12.1.0)
> > ==13292==    by 0x418161: dReDistribute_A (pddistribute.c:108)
> > ==13292==    by 0x41950B: pddistribute (pddistribute.c:450)
> > ==13292==    by 0x407D6A: pdgssvx (pdgssvx.c:1080)
> > ==13292==    by 0x4027E5: main (pddrive.c:171)
> > 
> > Line 108 of pddistribute.c is this:
> > 
> >     MPI_Alltoall( nnzToSend, 1, mpi_int_t, nnzToRecv, 1, mpi_int_t,
> >           grid->comm);
> > 
> > For both buffers, nnzToSend and nnzToRecv, I use a "calloc"-style routine to allocate memory, i.e., malloc first, followed by zeroing the buffer.
> > mpi_int_t is defined as MPI_INT.
> > Why does it complain about uninitialized values?
> > 
> > 
> > Sherry
> > 
> > 
> > 
> > 
> > On Tue, Mar 1, 2016 at 8:27 PM, Satish Balay <balay at mcs.anl.gov> wrote:
> > sometimes 'cmake' does a 'strip' during install of the library [which
> > can delete the debug symbols]. We had to track this down for one of
> > the cmake packages. I don't remember what we did to work around it.
> > 
> > >>
> > petsc at es:/scratch/petsc/petsc/arch-linux-pkgs-valgrind/lib$ file libsuperlu_dist.so.5.0.0
> > libsuperlu_dist.so.5.0.0: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, not stripped
> > <<
> > 
> > looks like superlu_dist installed by petsc is not stripped. Perhaps
> > you can try:
> > 
> > file /home/xiaoye/Dropbox/Codes/SuperLU/superlu_dist.git/lib/libsuperlu_dist.so.5.0.0
> > 
> > Satish
> > 
> > On Tue, 1 Mar 2016, Barry Smith wrote:
> > 
> > >
> > >   Satish will know far better than me. I only use Linux when my Mac OS fails me :-(
> > >
> > >
> > > > On Mar 1, 2016, at 8:41 PM, Xiaoye S. Li <xsli at lbl.gov> wrote:
> > > >
> > > > This is on Linux (Ubuntu). I did compile with -g, but only the example driver (which is outside the library) shows line numbers; the routines in the *.so do not, see this:
> > > >
> > > > ==31609== Conditional jump or move depends on uninitialised value(s)
> > > > ==31609==    at 0x51EED86: MPIC_Waitall (in /home/xiaoye/mpich-install/lib/libmpi.so.12.1.0)
> > > > ==31609==    by 0x5148F99: MPIR_Alltoallv_intra (in /home/xiaoye/mpich-install/lib/libmpi.so.12.1.0)
> > > > ==31609==    by 0x5149916: MPIR_Alltoallv (in /home/xiaoye/mpich-install/lib/libmpi.so.12.1.0)
> > > > ==31609==    by 0x51499F6: MPIR_Alltoallv_impl (in /home/xiaoye/mpich-install/lib/libmpi.so.12.1.0)
> > > > ==31609==    by 0x514A0C7: PMPI_Alltoallv (in /home/xiaoye/mpich-install/lib/libmpi.so.12.1.0)
> > > > ==31609==    by 0x4E7C56A: pdCompRow_loc_to_CompCol_global (in /home/xiaoye/Dropbox/Codes/SuperLU/superlu_dist.git/lib/libsuperlu_dist.so.5.0.0)
> > > > ==31609==    by 0x4E71761: pdgssvx (in /home/xiaoye/Dropbox/Codes/SuperLU/superlu_dist.git/lib/libsuperlu_dist.so.5.0.0)
> > > > ==31609==    by 0x401400: main (pddrive.c:171)
> > > >
> > > >
> > > > Here are the flags:
> > > >
> > > > C_FLAGS =  -DUSE_VENDOR_BLAS -DAdd_ -DDEBUGlevel=0 -DPRNTlevel=0 -std=c99 -g -fPIC -I/home/xiaoye/Dropbox/Codes/SuperLU/superlu_dist.git/SRC -I/home/xiaoye/lib/parmetis-4.0.3/include -I/home/xiaoye/lib/parmetis-4.0.3/metis/include -I/home/xiaoye/mpich-install/include
> > > >
> > > >
> > > > Any idea?
> > > > Sherry
> > > >
> > > >
> > > > On Tue, Mar 1, 2016 at 6:00 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
> > > >
> > > > > On Mar 1, 2016, at 7:41 PM, Xiaoye S. Li <xsli at lbl.gov> wrote:
> > > > >
> > > > > Barry,
> > > > >
> > > > > I am cleaning up the valgrind errors. I did a build with shared library option, but valgrind doesn't give me the source code line number.  Is it true that I need to build as static library?
> > > >
> > > >   No, but if you are running on an Apple you may need the additional valgrind option --dsymutil=yes  (yes, it is totally goofy that it doesn't just do this automatically). Also, of course, the source code needs to be compiled with the -g option.
> > > >
> > > >   Barry
> > > >
> > > > >
> > > > > Sherry
> > > > >
> > > >
> > > >
> > >
> > >
> > 
> > 
> 
> 
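
For the MPI_Alltoall question above, one way to rule out the caller's side is to confirm that the count buffers really are zero-filled before the call. The following is only a minimal sketch: the helper names alloc_zeroed_counts and counts_all_zero are hypothetical and are not SuperLU_DIST's own allocation routines. If buffers obtained this way still trigger the report inside MPIC_Waitall, the uninitialised value is more likely internal to the MPI library, which is the situation the --enable-g=meminit suggestion is aimed at.

```c
#include <stdlib.h>

/* Hypothetical helper, not SuperLU_DIST's allocator: returns a buffer of
   n ints with every entry set to zero, or NULL if allocation fails.
   calloc allocates and zero-fills in one step. */
int *alloc_zeroed_counts(size_t n)
{
    return calloc(n, sizeof(int));
}

/* Hypothetical check: returns 1 if buf[0..n-1] are all zero, 0 otherwise.
   Useful as a sanity check right before handing the buffer to MPI. */
int counts_all_zero(const int *buf, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        if (buf[i] != 0)
            return 0;
    }
    return 1;
}
```

In a driver, nnzToSend and nnzToRecv would each be obtained from alloc_zeroed_counts with one entry per process in the communicator, and released with free once the exchange is done.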



