[petsc-dev] simple way to use --db-attach with valgrind in parallel with mpiexec

Barry Smith bsmith at mcs.anl.gov
Sat Jun 30 19:33:03 CDT 2012


   Has anyone documented a simple way to run a parallel program under mpiexec and valgrind with the --db-attach=yes option, so that one can drop into the debugger on a given process and see what is going on?

   My simple-minded attempt (below) is useless: valgrind does ask whether to attach the debugger, but the prompt seems to be stuck in tty hell.

    Thanks

    Barry


barry-smiths-macbook-pro:LuLu barrysmith$ ~/Src/petsc-dev/arch-gnu/bin/mpiexec -n 2 valgrind --tool=memcheck --db-attach=yes --dsymutil=yes --num-callers=20  -q  ./ex26 -ts_type beuler
--55788-- run: /usr/bin/dsymutil "./ex26"
--55789-- run: /usr/bin/dsymutil "./ex26"
error: No such file or directory - old dSYM file cannot be overwritten:		old: './ex26.dSYM/Contents/Resources/DWARF/ex26'
	new:'./ex26.dSYM/Contents/Resources/DWARF/ex26.x86_64'.
--55789-- run: /usr/bin/dsymutil "/Users/barrysmith/Src/petsc-dev/arch-gnu/lib/libpetsc.dylib"
--55788-- run: /usr/bin/dsymutil "/Users/barrysmith/Src/petsc-dev/arch-gnu/lib/libpetsc.dylib"
error: No such file or directory - old dSYM file cannot be overwritten:		old: '/Users/barrysmith/Src/petsc-dev/arch-gnu/lib/libpetsc.dylib.dSYM/Contents/Resources/DWARF/libpetsc.dylib'
	new:'/Users/barrysmith/Src/petsc-dev/arch-gnu/lib/libpetsc.dylib.dSYM/Contents/Resources/DWARF/libpetsc.dylib.x86_64'.
--55788-- run: /usr/bin/dsymutil "/Users/barrysmith/Src/petsc-dev/arch-gnu/lib/libparmetis.dylib"
--55789-- run: /usr/bin/dsymutil "/Users/barrysmith/Src/petsc-dev/arch-gnu/lib/libparmetis.dylib"
--55788-- run: /usr/bin/dsymutil "/Users/barrysmith/Src/petsc-dev/arch-gnu/lib/libmetis.dylib"
--55789-- run: /usr/bin/dsymutil "/Users/barrysmith/Src/petsc-dev/arch-gnu/lib/libmetis.dylib"

warning: (x86_64) /Users/barrysmith/Src/petsc-dev/externalpackages/mpich2-1.4.1p1/lib/.tmp/wtimef.o unable to open object file
warning: no debug symbols in executable (-arch x86_64)
4x4 grid, water viscosity = 1, oil vescosity # = 1,
 connate water saturation # = 0 irreducible oil saturation # =0
timestep 0 time 0
==55789== Use of uninitialised value of size 8
==55789==    at 0x100002B3C: Mobility (ex26.c:116)
==55789==    by 0x10000637F: FormIFunctionLocal (ex26.c:579)
==55789==    by 0x100007791: FormIFunction (ex26.c:680)
==55789==    by 0x100DACF67: TSComputeIFunction (ts.c:374)
==55789==    by 0x100DA3778: SNESTSFormFunction_Theta (theta.c:204)
==55789==    by 0x100DC2D27: SNESTSFormFunction (ts.c:2560)
==55789==    by 0x100CACE87: SNESComputeFunction (snes.c:1901)
==55789==    by 0x100CF7651: SNESSolve_LS (ls.c:161)
==55789==    by 0x100CBF982: SNESSolve (snes.c:3549)
==55789==    by 0x100DA223F: TSStep_Theta (theta.c:114)
==55789==    by 0x100DBC4B9: TSStep (ts.c:1827)
==55789==    by 0x100DBDA12: TSSolve (ts.c:1951)
==55789==    by 0x1000047A8: main (ex26.c:327)
==55789== 
==55789== 
==55789== ---- Attach to debugger ? --- [Return/N/n/Y/y/C/c] ---- ==55789== Invalid read of size 8
==55789==    at 0x100002B3C: Mobility (ex26.c:116)
==55789==    by 0x10000637F: FormIFunctionLocal (ex26.c:579)
==55789==    by 0x100007791: FormIFunction (ex26.c:680)
==55789==    by 0x100DACF67: TSComputeIFunction (ts.c:374)
==55789==    by 0x100DA3778: SNESTSFormFunction_Theta (theta.c:204)
==55789==    by 0x100DC2D27: SNESTSFormFunction (ts.c:2560)
==55789==    by 0x100CACE87: SNESComputeFunction (snes.c:1901)
==55789==    by 0x100CF7651: SNESSolve_LS (ls.c:161)
==55789==    by 0x100CBF982: SNESSolve (snes.c:3549)
==55789==    by 0x100DA223F: TSStep_Theta (theta.c:114)
==55789==    by 0x100DBC4B9: TSStep (ts.c:1827)
==55789==    by 0x100DBDA12: TSSolve (ts.c:1951)
==55789==    by 0x1000047A8: main (ex26.c:327)
==55789==  Address 0x8 is not stack'd, malloc'd or (recently) free'd
==55789== 
==55789== Conditional jump or move depends on uninitialised value(s)
==55789==    at 0x102738F7B: signal__ (in /usr/lib/libSystem.B.dylib)
==55789==    by 0x100181B19: PetscDefaultSignalHandler (signal.c:143)
==55789==    by 0x1001818AE: PetscSignalHandler_Private (signal.c:53)
==55789==    by 0x138048EF3: ???
==55789==    by 0x10000637F: FormIFunctionLocal (ex26.c:579)
==55789==    by 0x100007791: FormIFunction (ex26.c:680)
==55789==    by 0x100DACF67: TSComputeIFunction (ts.c:374)
==55789==    by 0x100DA3778: SNESTSFormFunction_Theta (theta.c:204)
==55789==    by 0x100DC2D27: SNESTSFormFunction (ts.c:2560)
==55789==    by 0x100CACE87: SNESComputeFunction (snes.c:1901)
==55789==    by 0x100CF7651: SNESSolve_LS (ls.c:161)
==55789==    by 0x100CBF982: SNESSolve (snes.c:3549)
==55789==    by 0x100DA223F: TSStep_Theta (theta.c:114)
==55789==    by 0x100DBC4B9: TSStep (ts.c:1827)
==55789==    by 0x100DBDA12: TSSolve (ts.c:1951)
==55789==    by 0x1000047A8: main (ex26.c:327)
==55789== 
==55789== Conditional jump or move depends on uninitialised value(s)
==55789==    at 0x102738FD7: sigaction (in /usr/lib/libSystem.B.dylib)
==55789==    by 0x102738FA6: signal__ (in /usr/lib/libSystem.B.dylib)
==55789==    by 0x100181B19: PetscDefaultSignalHandler (signal.c:143)
==55789==    by 0x1001818AE: PetscSignalHandler_Private (signal.c:53)
==55789==    by 0x138048EF3: ???
==55789==    by 0x10000637F: FormIFunctionLocal (ex26.c:579)
==55789==    by 0x100007791: FormIFunction (ex26.c:680)
==55789==    by 0x100DACF67: TSComputeIFunction (ts.c:374)
==55789==    by 0x100DA3778: SNESTSFormFunction_Theta (theta.c:204)
==55789==    by 0x100DC2D27: SNESTSFormFunction (ts.c:2560)
==55789==    by 0x100CACE87: SNESComputeFunction (snes.c:1901)
==55789==    by 0x100CF7651: SNESSolve_LS (ls.c:161)
==55789==    by 0x100CBF982: SNESSolve (snes.c:3549)
==55789==    by 0x100DA223F: TSStep_Theta (theta.c:114)
==55789==    by 0x100DBC4B9: TSStep (ts.c:1827)
==55789==    by 0x100DBDA12: TSSolve (ts.c:1951)
==55789==    by 0x1000047A8: main (ex26.c:327)
==55789== 
==55789== Conditional jump or move depends on uninitialised value(s)
==55789==    at 0x102738FDC: sigaction (in /usr/lib/libSystem.B.dylib)
==55789==    by 0x102738FA6: signal__ (in /usr/lib/libSystem.B.dylib)
==55789==    by 0x100181B19: PetscDefaultSignalHandler (signal.c:143)
==55789==    by 0x1001818AE: PetscSignalHandler_Private (signal.c:53)
==55789==    by 0x138048EF3: ???
==55789==    by 0x10000637F: FormIFunctionLocal (ex26.c:579)
==55789==    by 0x100007791: FormIFunction (ex26.c:680)
==55789==    by 0x100DACF67: TSComputeIFunction (ts.c:374)
==55789==    by 0x100DA3778: SNESTSFormFunction_Theta (theta.c:204)
==55789==    by 0x100DC2D27: SNESTSFormFunction (ts.c:2560)
==55789==    by 0x100CACE87: SNESComputeFunction (snes.c:1901)
==55789==    by 0x100CF7651: SNESSolve_LS (ls.c:161)
==55789==    by 0x100CBF982: SNESSolve (snes.c:3549)
==55789==    by 0x100DA223F: TSStep_Theta (theta.c:114)
==55789==    by 0x100DBC4B9: TSStep (ts.c:1827)
==55789==    by 0x100DBDA12: TSSolve (ts.c:1951)
==55789==    by 0x1000047A8: main (ex26.c:327)
==55789== 
==55789== Conditional jump or move depends on uninitialised value(s)
==55789==    at 0x102738FE1: sigaction (in /usr/lib/libSystem.B.dylib)
==55789==    by 0x102738FA6: signal__ (in /usr/lib/libSystem.B.dylib)
==55789==    by 0x100181B19: PetscDefaultSignalHandler (signal.c:143)
==55789==    by 0x1001818AE: PetscSignalHandler_Private (signal.c:53)
==55789==    by 0x138048EF3: ???
==55789==    by 0x10000637F: FormIFunctionLocal (ex26.c:579)
==55789==    by 0x100007791: FormIFunction (ex26.c:680)
==55789==    by 0x100DACF67: TSComputeIFunction (ts.c:374)
==55789==    by 0x100DA3778: SNESTSFormFunction_Theta (theta.c:204)
==55789==    by 0x100DC2D27: SNESTSFormFunction (ts.c:2560)
==55789==    by 0x100CACE87: SNESComputeFunction (snes.c:1901)
==55789==    by 0x100CF7651: SNESSolve_LS (ls.c:161)
==55789==    by 0x100CBF982: SNESSolve (snes.c:3549)
==55789==    by 0x100DA223F: TSStep_Theta (theta.c:114)
==55789==    by 0x100DBC4B9: TSStep (ts.c:1827)
==55789==    by 0x100DBDA12: TSSolve (ts.c:1951)
==55789==    by 0x1000047A8: main (ex26.c:327)
==55789== 
==55789== Syscall param sigaction(signum) contains uninitialised byte(s)
==55789==    at 0x102739032: __sigaction (in /usr/lib/libSystem.B.dylib)
==55789==    by 0x102738FA6: signal__ (in /usr/lib/libSystem.B.dylib)
==55789==    by 0x100181B19: PetscDefaultSignalHandler (signal.c:143)
==55789==    by 0x1001818AE: PetscSignalHandler_Private (signal.c:53)
==55789==    by 0x138048EF3: ???
==55789==    by 0x10000637F: FormIFunctionLocal (ex26.c:579)
==55789==    by 0x100007791: FormIFunction (ex26.c:680)
==55789==    by 0x100DACF67: TSComputeIFunction (ts.c:374)
==55789==    by 0x100DA3778: SNESTSFormFunction_Theta (theta.c:204)
==55789==    by 0x100DC2D27: SNESTSFormFunction (ts.c:2560)
==55789==    by 0x100CACE87: SNESComputeFunction (snes.c:1901)
==55789==    by 0x100CF7651: SNESSolve_LS (ls.c:161)
==55789==    by 0x100CBF982: SNESSolve (snes.c:3549)
==55789==    by 0x100DA223F: TSStep_Theta (theta.c:114)
==55789==    by 0x100DBC4B9: TSStep (ts.c:1827)
==55789==    by 0x100DBDA12: TSSolve (ts.c:1951)
==55789==    by 0x1000047A8: main (ex26.c:327)
==55789== 
==55789== 
==55789== Process terminating with default action of signal 11 (SIGSEGV)
==55789==  General Protection Fault
==55789==    at 0x1027F0FC1: dyld_stub_binder (in /usr/lib/libSystem.B.dylib)
==55789==    by 0x101C4549F: ??? (in /Users/barrysmith/Src/petsc-dev/arch-gnu/lib/libpetsc.dylib)
==55789==    by 0x100181B31: PetscDefaultSignalHandler (signal.c:144)
==55789==    by 0x1001818AE: PetscSignalHandler_Private (signal.c:53)
==55789==    by 0x138048EF3: ???
==55789==    by 0x10000637F: FormIFunctionLocal (ex26.c:579)
==55789==    by 0x100007791: FormIFunction (ex26.c:680)
==55789==    by 0x100DACF67: TSComputeIFunction (ts.c:374)
==55789==    by 0x100DA3778: SNESTSFormFunction_Theta (theta.c:204)
==55789==    by 0x100DC2D27: SNESTSFormFunction (ts.c:2560)
==55789==    by 0x100CACE87: SNESComputeFunction (snes.c:1901)
==55789==    by 0x100CF7651: SNESSolve_LS (ls.c:161)
==55789==    by 0x100CBF982: SNESSolve (snes.c:3549)
==55789==    by 0x100DA223F: TSStep_Theta (theta.c:114)
==55789==    by 0x100DBC4B9: TSStep (ts.c:1827)
==55789==    by 0x100DBDA12: TSSolve (ts.c:1951)
==55789==    by 0x1000047A8: main (ex26.c:327)
==55788== Invalid read of size 8
==55788==    at 0x100006443: FormIFunctionLocal (ex26.c:581)
==55788==    by 0x100007791: FormIFunction (ex26.c:680)
==55788==    by 0x100DACF67: TSComputeIFunction (ts.c:374)
==55788==    by 0x100DA3778: SNESTSFormFunction_Theta (theta.c:204)
==55788==    by 0x100DC2D27: SNESTSFormFunction (ts.c:2560)
==55788==    by 0x100CACE87: SNESComputeFunction (snes.c:1901)
==55788==    by 0x100CF7651: SNESSolve_LS (ls.c:161)
==55788==    by 0x100CBF982: SNESSolve (snes.c:3549)
==55788==    by 0x100DA223F: TSStep_Theta (theta.c:114)
==55788==    by 0x100DBC4B9: TSStep (ts.c:1827)
==55788==    by 0x100DBDA12: TSSolve (ts.c:1951)
==55788==    by 0x1000047A8: main (ex26.c:327)
==55788==  Address 0x1045cf920 is 1,936 bytes inside a block of size 1,940 alloc'd
==55788==    at 0x1000251EF: malloc (vg_replace_malloc.c:236)
==55788==    by 0x1000CC71B: PetscMallocAlign (mal.c:37)
==55788==    by 0x1000CDF89: PetscTrMallocDefault (mtr.c:190)
==55788==    by 0x1002B338B: VecGetArray2d (rvector.c:1759)
==55788==    by 0x1009430B0: DMDAVecGetArray (dagetarray.c:72)
==55788==    by 0x100007536: FormIFunction (ex26.c:673)
==55788==    by 0x100DACF67: TSComputeIFunction (ts.c:374)
==55788==    by 0x100DA3778: SNESTSFormFunction_Theta (theta.c:204)
==55788==    by 0x100DC2D27: SNESTSFormFunction (ts.c:2560)
==55788==    by 0x100CACE87: SNESComputeFunction (snes.c:1901)
==55788==    by 0x100CF7651: SNESSolve_LS (ls.c:161)
==55788==    by 0x100CBF982: SNESSolve (snes.c:3549)
==55788==    by 0x100DA223F: TSStep_Theta (theta.c:114)
==55788==    by 0x100DBC4B9: TSStep (ts.c:1827)
==55788==    by 0x100DBDA12: TSSolve (ts.c:1951)
==55788==    by 0x1000047A8: main (ex26.c:327)
==55788== 
==55788== 
==55788== ---- Attach to debugger ? --- [Return/N/n/Y/y/C/c] ---- 
=====================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   EXIT CODE: 11
=   CLEANING UP REMAINING PROCESSES
=   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
=================================================================================
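
A possible workaround for the interleaved "Attach to debugger ?" prompts in the log above is to give each rank its own terminal, so that each valgrind prompt has a private tty to read from. The following is only a minimal sketch and not something from the original message. It assumes an X11 display and an installed xterm, and DRY_RUN is a hypothetical switch of this script, not an mpiexec or valgrind option. The valgrind flags are copied unchanged from the command in the log.

```shell
#!/bin/sh
# Sketch: start every MPI rank inside its own xterm so that each
# valgrind "Attach to debugger ?" prompt gets a separate terminal.
# Assumptions: X11 is available and xterm is installed.

NP=2                # number of ranks, as in the original run
MPIEXEC=mpiexec     # adjust to the mpiexec of your PETSc arch

# Build the launch line: "xterm -e" is placed between mpiexec and valgrind,
# everything after it is the command from the original post.
build_cmd() {
  printf '%s -n %s xterm -e valgrind --tool=memcheck --db-attach=yes --dsymutil=yes --num-callers=20 -q ./ex26 -ts_type beuler\n' \
    "$MPIEXEC" "$NP"
}

CMD=$(build_cmd)

# DRY_RUN (hypothetical, default 1) only prints the command.
# Set DRY_RUN=0 to actually launch it.
if [ "${DRY_RUN:-1}" = 1 ]; then
  echo "$CMD"
else
  exec $CMD
fi
```

With DRY_RUN left at its default the script only prints the command line, which makes it easy to check before anything is launched. Note also that later Valgrind releases dropped --db-attach in favour of the gdbserver options (--vgdb-error), so this sketch applies to a Valgrind of the vintage used in this post.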



