[mpich-discuss] Problem with gdb

Darius Buntinas buntinas at mcs.anl.gov
Tue Jul 29 10:23:28 CDT 2008


I'm not sure about the -gdb issues, but the valgrind messages about 
writev and write pointing to uninitialized data are (most likely) 
harmless.  If you configure with --enable-g=meminit, you should see the 
messages go away, but note that this option will affect performance.

For those interested in details:   The MPICH2 packet header is a union 
of headers for various packet types.  Depending on which packet type is 
being sent, some parts of the header will not be used.  MPICH2 
deliberately leaves the unused parts of the header uninitialized for 
performance reasons.  When valgrind sees that writev is called to send a 
buffer where some parts have not been initialized, it gives a warning. 
However, once the receiver reads the packet type, it knows which parts 
of the rest of the header are valid, so it will never read the 
uninitialized parts.
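
To make the idea concrete, here is a minimal sketch of a header laid out 
as a union, with a sender that fills in only the fields its packet type 
needs and a receiver that reads the type first.  The packet types, field 
names and functions (PktType, EagerPkt, ClosePkt, make_close, make_eager, 
handle) are made up for illustration and are not the actual MPICH2 
definitions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical packet types and layouts for illustration only;
// these are NOT the actual MPICH2 definitions.
enum PktType : std::uint32_t { PKT_EAGER = 1, PKT_CLOSE = 2 };

struct EagerPkt {
    PktType       type;
    std::int32_t  tag;
    std::int32_t  rank;
    std::uint64_t size;
};

struct ClosePkt {
    PktType      type;
    std::int32_t ack;
};

// One header type big enough for any packet; every member starts with `type`.
union PktHeader {
    EagerPkt eager;
    ClosePkt close;
};

// A close packet is smaller than the full header, so its tail goes unused.
static_assert(sizeof(ClosePkt) < sizeof(PktHeader), "close leaves unused bytes");

// Sender: sets only the fields of a close packet. The remaining bytes of the
// header are left uninitialized, which is what valgrind flags if the whole
// header is later handed to write/writev.
PktHeader make_close(std::int32_t ack) {
    PktHeader h;  // deliberately not zeroed
    h.close.type = PKT_CLOSE;
    h.close.ack  = ack;
    return h;
}

// Sender: an eager packet uses more of the header.
PktHeader make_eager(std::int32_t tag, std::int32_t rank, std::uint64_t size) {
    PktHeader h;  // deliberately not zeroed
    h.eager.type = PKT_EAGER;
    h.eager.tag  = tag;
    h.eager.rank = rank;
    h.eager.size = size;
    return h;
}

// Receiver: reads the type first, then only the fields valid for that type,
// so the uninitialized bytes are never read.
std::int64_t handle(const PktHeader& h) {
    switch (h.close.type) {  // `type` is in the structs' common initial sequence
        case PKT_EAGER: return static_cast<std::int64_t>(h.eager.size);
        case PKT_CLOSE: return h.close.ack;
    }
    assert(false && "unknown packet type");
    return -1;
}
```

In this sketch, handle(make_close(7)) only ever touches the type and ack 
fields, even though the full sizeof(PktHeader) bytes would go out on the 
wire.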

With the --enable-g=meminit option enabled, whenever a buffer is 
allocated, MPICH2 will initialize the entire buffer (to some non-zero 
value).  This is used for debugging purposes to help catch cases where 
we're reading an uninitialized field, and to squash valgrind messages 
like the ones you mentioned.
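
Roughly, the effect is that of an allocation wrapper like the sketch 
below.  This is only an illustration of the technique, not the MPICH2 
code: alloc_buffer, is_untouched and the fill byte 0xfc are hypothetical 
names and values chosen for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical fill byte; all that is assumed here is that it is non-zero.
constexpr unsigned char kFillByte = 0xfc;

// Allocation wrapper: when meminit is on, every byte of the new buffer is
// set to the fill pattern. This costs a memset per allocation, which is
// why such an option affects performance.
void* alloc_buffer(std::size_t n, bool meminit) {
    void* p = std::malloc(n);
    if (p != nullptr && meminit) {
        std::memset(p, kFillByte, n);
    }
    return p;
}

// Debugging aid: returns true if every byte still holds the fill pattern,
// i.e. nothing has written to the buffer since it was allocated.
bool is_untouched(const void* p, std::size_t n) {
    assert(p != nullptr);
    const unsigned char* b = static_cast<const unsigned char*>(p);
    for (std::size_t i = 0; i < n; ++i) {
        if (b[i] != kFillByte) return false;
    }
    return true;
}
```

A non-zero pattern makes a stray read of a never-written field stand out, 
where a zero fill could pass for a valid value.  Because every byte is 
defined after the memset, valgrind no longer has anything to report when 
the buffer is passed to write or writev.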

-d

On 07/29/2008 08:44 AM, Burlen Loring wrote:
> Hi all,
> 
> Marcelo Tomim wrote:
>> I have been trying to debug a program, but it freezes every time I try 
>> to run it in debug mode.
>> This is what I type:
>>  mpiexec -gdb -n 3 ./my_program
>>  
>> Then it does not return the command line and I have to kill it.
>>   
> I can confirm this. The same thing recently started happening to me. I 
> guessed it was related to recent kernel/system library upgrades.
> 
> Linux tycho 2.6.24-19-generic #1 SMP Fri Jul 11 23:41:49 UTC 2008 i686 
> GNU/Linux
> 
> Also seemingly coincident with recent system upgrades, when I run codes 
> through valgrind I see a lot of complaints about using uninitialized 
> memory. Below is a simple program and its valgrind output. Notice that 
> there are a lot of lines like:
> 
> Syscall param writev(vector[...]) points to uninitialised byte(s)
> 
> These seemed to show up about the same time that the -gdb option stopped 
> working. I think it has to do with a system library change and not mpich, 
> but I can't be sure. I am using my own source install of mpich 2 and I 
> have rebuilt since the upgrades, but it didn't help.
> 
> I'd certainly like to get the -gdb option back, and the memory issues 
> cleaned up as well.
> 
> Thanks
> Burlen
> 
> // begin of mpivg.cpp
> #include <mpi.h>
> #include <iostream>
> 
> int main(int argc, char **argv)
> {
>  MPI_Init(&argc,&argv);
>  MPI_Barrier(MPI_COMM_WORLD);
>  MPI_Finalize();
>  return 0;
> }
> // end of mpivg.cpp
> 
> tycho:~/ext/kitware_cvs/Work/Burlen/testing$mpiexec -n 2 valgrind ./mpivg
> ==2031== Memcheck, a memory error detector.
> ==2031== Copyright (C) 2002-2007, and GNU GPL'd, by Julian Seward et al.
> ==2031== Using LibVEX rev 1804, a library for dynamic binary translation.
> ==2031== Copyright (C) 2004-2007, and GNU GPL'd, by OpenWorks LLP.
> ==2031== Using valgrind-3.3.0-Debian, a dynamic binary instrumentation 
> framework.
> ==2031== Copyright (C) 2000-2007, and GNU GPL'd, by Julian Seward et al.
> ==2031== For more details, rerun with: -v
> ==2030== Memcheck, a memory error detector.
> ==2030== Copyright (C) 2002-2007, and GNU GPL'd, by Julian Seward et al.
> ==2030== Using LibVEX rev 1804, a library for dynamic binary translation.
> ==2030== Copyright (C) 2004-2007, and GNU GPL'd, by OpenWorks LLP.
> ==2030== Using valgrind-3.3.0-Debian, a dynamic binary instrumentation 
> framework.
> ==2030== Copyright (C) 2000-2007, and GNU GPL'd, by Julian Seward et al.
> ==2030== For more details, rerun with: -v
> ==2030==
> ==2031==
> ==2030== Syscall param writev(vector[...]) points to uninitialised byte(s)
> ==2030==    at 0x40007F2: (within /lib/ld-2.7.so)
> ==2030==    by 0x4130FBC: MPIDU_Sock_wait (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2030==    by 0x40ABE2B: MPIDI_CH3I_Progress (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2030==    by 0x40EA978: MPIC_Wait (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2030==    by 0x40EAD71: MPIC_Sendrecv (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2030==    by 0x40A089D: MPIR_Barrier (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2030==    by 0x40A0E4A: PMPI_Barrier (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2030==    by 0x804891B: main (in 
> /home/burlen/ext/kitware_cvs/Work/Burlen/testing/mpivg)
> ==2030==  Address 0x4457fa8 is 32 bytes inside a block of size 72 alloc'd
> ==2030==    at 0x4022AB8: malloc (vg_replace_malloc.c:207)
> ==2030==    by 0x40AE6F5: MPIDI_CH3I_Connection_alloc (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2030==    by 0x40AE97E: MPIDI_CH3I_Sock_connect (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2030==    by 0x40AEBB7: MPIDI_CH3I_VC_post_sockconnect (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2030==    by 0x40AAB0C: MPIDI_CH3I_VC_post_connect (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2030==    by 0x40A9472: MPIDI_CH3_iSend (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2030==    by 0x40F9CC5: MPID_Isend (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2030==    by 0x40EAD39: MPIC_Sendrecv (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2030==    by 0x40A089D: MPIR_Barrier (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2030==    by 0x40A0E4A: PMPI_Barrier (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2030==    by 0x804891B: main (in 
> /home/burlen/ext/kitware_cvs/Work/Burlen/testing/mpivg)
> ==2031== Syscall param writev(vector[...]) points to uninitialised byte(s)
> ==2031==    at 0x40007F2: (within /lib/ld-2.7.so)
> ==2031==    by 0x4130FBC: MPIDU_Sock_wait (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2031==    by 0x40ABE2B: MPIDI_CH3I_Progress (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2031==    by 0x40EA978: MPIC_Wait (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2031==    by 0x40EAD71: MPIC_Sendrecv (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2031==    by 0x40A089D: MPIR_Barrier (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2031==    by 0x40A0E4A: PMPI_Barrier (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2031==    by 0x804891B: main (in 
> /home/burlen/ext/kitware_cvs/Work/Burlen/testing/mpivg)
> ==2031==  Address 0x4457fa8 is 32 bytes inside a block of size 72 alloc'd
> ==2031==    at 0x4022AB8: malloc (vg_replace_malloc.c:207)
> ==2031==    by 0x40AE6F5: MPIDI_CH3I_Connection_alloc (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2031==    by 0x40AE97E: MPIDI_CH3I_Sock_connect (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2031==    by 0x40AEBB7: MPIDI_CH3I_VC_post_sockconnect (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2031==    by 0x40AAB0C: MPIDI_CH3I_VC_post_connect (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2031==    by 0x40A9472: MPIDI_CH3_iSend (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2031==    by 0x40F9CC5: MPID_Isend (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2031==    by 0x40EAD39: MPIC_Sendrecv (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2031==    by 0x40A089D: MPIR_Barrier (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2031==    by 0x40A0E4A: PMPI_Barrier (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2031==    by 0x804891B: main (in 
> /home/burlen/ext/kitware_cvs/Work/Burlen/testing/mpivg)
> ==2030==
> ==2030== Syscall param write(buf) points to uninitialised byte(s)
> ==2030==    at 0x40007F2: (within /lib/ld-2.7.so)
> ==2030==    by 0x40A950D: MPIDI_CH3_iSend (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2030==    by 0x40F9CC5: MPID_Isend (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2030==    by 0x40EAD39: MPIC_Sendrecv (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2030==    by 0x40A089D: MPIR_Barrier (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2030==    by 0x40A0E4A: PMPI_Barrier (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2030==    by 0x40F82DF: MPID_Finalize (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2030==    by 0x40D8FD4: PMPI_Finalize (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2030==    by 0x8048920: main (in 
> /home/burlen/ext/kitware_cvs/Work/Burlen/testing/mpivg)
> ==2030==  Address 0xbec94fdc is on thread 1's stack
> ==2031==
> ==2031== Syscall param write(buf) points to uninitialised byte(s)
> ==2031==    at 0x40007F2: (within /lib/ld-2.7.so)
> ==2031==    by 0x40A950D: MPIDI_CH3_iSend (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2031==    by 0x40F9CC5: MPID_Isend (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2031==    by 0x40EAD39: MPIC_Sendrecv (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2031==    by 0x40A089D: MPIR_Barrier (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2031==    by 0x40A0E4A: PMPI_Barrier (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2031==    by 0x40F82DF: MPID_Finalize (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2031==    by 0x40D8FD4: PMPI_Finalize (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2031==    by 0x8048920: main (in 
> /home/burlen/ext/kitware_cvs/Work/Burlen/testing/mpivg)
> ==2031==  Address 0xbeb6afdc is on thread 1's stack
> ==2030==
> ==2030== Syscall param write(buf) points to uninitialised byte(s)
> ==2030==    at 0x40007F2: (within /lib/ld-2.7.so)
> ==2030==    by 0x40AA244: MPIDI_CH3_iStartMsg (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2030==    by 0x40B0F86: MPIDI_CH3U_VC_SendClose (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2030==    by 0x4101B7F: MPIDI_PG_Close_VCs (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2030==    by 0x40F8451: MPID_Finalize (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2030==    by 0x40D8FD4: PMPI_Finalize (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2030==    by 0x8048920: main (in 
> /home/burlen/ext/kitware_cvs/Work/Burlen/testing/mpivg)
> ==2030==  Address 0xbec95084 is on thread 1's stack
> ==2031==
> ==2031== Syscall param write(buf) points to uninitialised byte(s)
> ==2031==    at 0x40007F2: (within /lib/ld-2.7.so)
> ==2031==    by 0x40AA244: MPIDI_CH3_iStartMsg (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2031==    by 0x40B0F86: MPIDI_CH3U_VC_SendClose (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2031==    by 0x4101B7F: MPIDI_PG_Close_VCs (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2031==    by 0x40F8451: MPID_Finalize (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2031==    by 0x40D8FD4: PMPI_Finalize (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2031==    by 0x8048920: main (in 
> /home/burlen/ext/kitware_cvs/Work/Burlen/testing/mpivg)
> ==2031==  Address 0xbeb6b084 is on thread 1's stack
> ==2030==
> ==2030== Syscall param write(buf) points to uninitialised byte(s)
> ==2030==    at 0x40007F2: (within /lib/ld-2.7.so)
> ==2030==    by 0x40AA244: MPIDI_CH3_iStartMsg (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2030==    by 0x40B1181: MPIDI_CH3_PktHandler_Close (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2030==    by 0x40AB8E0: MPIDI_CH3I_Progress_handle_sock_event (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2030==    by 0x40ABDE5: MPIDI_CH3I_Progress (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2030==    by 0x40B0E1C: MPIDI_CH3U_VC_WaitForClose (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2030==    by 0x40F8456: MPID_Finalize (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2030==    by 0x40D8FD4: PMPI_Finalize (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2030==    by 0x8048920: main (in 
> /home/burlen/ext/kitware_cvs/Work/Burlen/testing/mpivg)
> ==2030==  Address 0xbec94fa4 is on thread 1's stack
> ==2031==
> ==2031== Syscall param write(buf) points to uninitialised byte(s)
> ==2031==    at 0x40007F2: (within /lib/ld-2.7.so)
> ==2031==    by 0x40AA244: MPIDI_CH3_iStartMsg (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2031==    by 0x40B1181: MPIDI_CH3_PktHandler_Close (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2031==    by 0x40AB8E0: MPIDI_CH3I_Progress_handle_sock_event (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2031==    by 0x40ABDE5: MPIDI_CH3I_Progress (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2031==    by 0x40B0E1C: MPIDI_CH3U_VC_WaitForClose (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2031==    by 0x40F8456: MPID_Finalize (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2031==    by 0x40D8FD4: PMPI_Finalize (in 
> /opt/mpich2-1.0.7/lib/libmpich.so.1.1)
> ==2031==    by 0x8048920: main (in 
> /home/burlen/ext/kitware_cvs/Work/Burlen/testing/mpivg)
> ==2031==  Address 0xbeb6afa4 is on thread 1's stack
> ==2031==
> ==2031== ERROR SUMMARY: 5 errors from 4 contexts (suppressed: 27 from 1)
> ==2031== malloc/free: in use at exit: 0 bytes in 0 blocks.
> ==2031== malloc/free: 55 allocs, 55 frees, 35,225 bytes allocated.
> ==2031== For counts of detected errors, rerun with: -v
> ==2031== All heap blocks were freed -- no leaks are possible.
> ==2030==
> ==2030== ERROR SUMMARY: 5 errors from 4 contexts (suppressed: 27 from 1)
> ==2030== malloc/free: in use at exit: 0 bytes in 0 blocks.
> ==2030== malloc/free: 55 allocs, 55 frees, 35,225 bytes allocated.
> ==2030== For counts of detected errors, rerun with: -v
> ==2030== All heap blocks were freed -- no leaks are possible.
> 
> 



