[petsc-dev] simple way to use --db-attach with valgrind in parallel with mpiexec
Jed Brown
jedbrown at mcs.anl.gov
Sun Jul 1 15:34:03 CDT 2012
Seems that way; it works for me on Linux.
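A note for anyone reproducing this: as far as I understand it, the xterm wrapper helps because each rank gets its own terminal, so valgrind's "Attach to debugger ?" prompt has a tty to read the answer from, whereas under plain mpiexec the ranks share the launcher's standard input and output. Below is a minimal dry-run sketch. The helper name vg_xterm_cmd is hypothetical (it is not part of PETSc, MPICH, or Valgrind); it only prints the command quoted below for a given process count so it can be checked before launching.

```shell
#!/bin/sh
# Hypothetical helper (an assumption for illustration, not an existing tool):
# print the mpiexec command that opens one xterm per rank, each running
# valgrind with the debugger-attach prompt enabled.
vg_xterm_cmd() {
    nprocs=$1
    shift
    # "$*" joins the program name and its arguments with single spaces.
    printf '%s\n' "mpiexec -n $nprocs xterm -e valgrind --db-attach=yes $*"
}

# Dry run: show the command for the two-process example from this thread.
vg_xterm_cmd 2 ./ex -args
```

To actually launch, run the printed command in a shell with an X display available, for example with eval "$(vg_xterm_cmd 2 ./ex -args)". Arguments containing spaces would need extra quoting, which this sketch does not handle.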
On Sat, Jun 30, 2012 at 7:40 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
> Both windows just disappear and the program ends if I hit y or Y. Maybe a
> Mac thing.
> Barry
> On Jun 30, 2012, at 7:51 PM, Jed Brown wrote:
> > Did you try this?
> >
> > mpiexec -n 2 xterm -e valgrind --db-attach=yes ./ex -args
> >
> > On Sat, Jun 30, 2012 at 4:33 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
> > Has anyone documented a simple way to run parallel programs with
> > mpiexec and valgrind and use the --db-attach=yes option to use the debugger
> > on a process to see what is going on?
> >
> > My simple-minded attempt leads to uselessness; it tries to ask me
> > about using the debugger but seems to be in tty hell.
> >
> > Thanks
> >
> > Barry
> > barry-smiths-macbook-pro:LuLu barrysmith$ ~/Src/petsc-dev/arch-gnu/bin/mpiexec -n 2 valgrind --tool=memcheck --db-attach=yes --dsymutil=yes --num-callers=20 -q ./ex26 -ts_type beuler
> > --55788-- run: /usr/bin/dsymutil "./ex26"
> > --55789-- run: /usr/bin/dsymutil "./ex26"
> > error: No such file or directory - old dSYM file cannot be overwritten: old: './ex26.dSYM/Contents/Resources/DWARF/ex26'
> > new:'./ex26.dSYM/Contents/Resources/DWARF/ex26.x86_64'.
> > --55789-- run: /usr/bin/dsymutil "/Users/barrysmith/Src/petsc-dev/arch-gnu/lib/libpetsc.dylib"
> > --55788-- run: /usr/bin/dsymutil "/Users/barrysmith/Src/petsc-dev/arch-gnu/lib/libpetsc.dylib"
> > error: No such file or directory - old dSYM file cannot be overwritten: old: '/Users/barrysmith/Src/petsc-dev/arch-gnu/lib/libpetsc.dylib.dSYM/Contents/Resources/DWARF/libpetsc.dylib'
> > new:'/Users/barrysmith/Src/petsc-dev/arch-gnu/lib/libpetsc.dylib.dSYM/Contents/Resources/DWARF/libpetsc.dylib.x86_64'.
> > --55788-- run: /usr/bin/dsymutil "/Users/barrysmith/Src/petsc-dev/arch-gnu/lib/libparmetis.dylib"
> > --55789-- run: /usr/bin/dsymutil "/Users/barrysmith/Src/petsc-dev/arch-gnu/lib/libparmetis.dylib"
> > --55788-- run: /usr/bin/dsymutil "/Users/barrysmith/Src/petsc-dev/arch-gnu/lib/libmetis.dylib"
> > --55789-- run: /usr/bin/dsymutil "/Users/barrysmith/Src/petsc-dev/arch-gnu/lib/libmetis.dylib"
> > warning: (x86_64) /Users/barrysmith/Src/petsc-dev/externalpackages/mpich2-1.4.1p1/lib/.tmp/wtimef.o unable to open object file
> > warning: no debug symbols in executable (-arch x86_64)
> > 4x4 grid, water viscosity = 1, oil vescosity # = 1,
> > connate water saturation # = 0 irreducible oil saturation # =0
> > timestep 0 time 0
> > ==55789== Use of uninitialised value of size 8
> > ==55789== at 0x100002B3C: Mobility (ex26.c:116)
> > ==55789== by 0x10000637F: FormIFunctionLocal (ex26.c:579)
> > ==55789== by 0x100007791: FormIFunction (ex26.c:680)
> > ==55789== by 0x100DACF67: TSComputeIFunction (ts.c:374)
> > ==55789== by 0x100DA3778: SNESTSFormFunction_Theta (theta.c:204)
> > ==55789== by 0x100DC2D27: SNESTSFormFunction (ts.c:2560)
> > ==55789== by 0x100CACE87: SNESComputeFunction (snes.c:1901)
> > ==55789== by 0x100CF7651: SNESSolve_LS (ls.c:161)
> > ==55789== by 0x100CBF982: SNESSolve (snes.c:3549)
> > ==55789== by 0x100DA223F: TSStep_Theta (theta.c:114)
> > ==55789== by 0x100DBC4B9: TSStep (ts.c:1827)
> > ==55789== by 0x100DBDA12: TSSolve (ts.c:1951)
> > ==55789== by 0x1000047A8: main (ex26.c:327)
> > ==55789== ---- Attach to debugger ? --- [Return/N/n/Y/y/C/c] ----
> > ==55789== Invalid read of size 8
> > ==55789== at 0x100002B3C: Mobility (ex26.c:116)
> > ==55789== by 0x10000637F: FormIFunctionLocal (ex26.c:579)
> > ==55789== by 0x100007791: FormIFunction (ex26.c:680)
> > ==55789== by 0x100DACF67: TSComputeIFunction (ts.c:374)
> > ==55789== by 0x100DA3778: SNESTSFormFunction_Theta (theta.c:204)
> > ==55789== by 0x100DC2D27: SNESTSFormFunction (ts.c:2560)
> > ==55789== by 0x100CACE87: SNESComputeFunction (snes.c:1901)
> > ==55789== by 0x100CF7651: SNESSolve_LS (ls.c:161)
> > ==55789== by 0x100CBF982: SNESSolve (snes.c:3549)
> > ==55789== by 0x100DA223F: TSStep_Theta (theta.c:114)
> > ==55789== by 0x100DBC4B9: TSStep (ts.c:1827)
> > ==55789== by 0x100DBDA12: TSSolve (ts.c:1951)
> > ==55789== by 0x1000047A8: main (ex26.c:327)
> > ==55789== Address 0x8 is not stack'd, malloc'd or (recently) free'd
> > ==55789== Conditional jump or move depends on uninitialised value(s)
> > ==55789== at 0x102738F7B: signal__ (in /usr/lib/libSystem.B.dylib)
> > ==55789== by 0x100181B19: PetscDefaultSignalHandler (signal.c:143)
> > ==55789== by 0x1001818AE: PetscSignalHandler_Private (signal.c:53)
> > ==55789== by 0x138048EF3: ???
> > ==55789== by 0x10000637F: FormIFunctionLocal (ex26.c:579)
> > ==55789== by 0x100007791: FormIFunction (ex26.c:680)
> > ==55789== by 0x100DACF67: TSComputeIFunction (ts.c:374)
> > ==55789== by 0x100DA3778: SNESTSFormFunction_Theta (theta.c:204)
> > ==55789== by 0x100DC2D27: SNESTSFormFunction (ts.c:2560)
> > ==55789== by 0x100CACE87: SNESComputeFunction (snes.c:1901)
> > ==55789== by 0x100CF7651: SNESSolve_LS (ls.c:161)
> > ==55789== by 0x100CBF982: SNESSolve (snes.c:3549)
> > ==55789== by 0x100DA223F: TSStep_Theta (theta.c:114)
> > ==55789== by 0x100DBC4B9: TSStep (ts.c:1827)
> > ==55789== by 0x100DBDA12: TSSolve (ts.c:1951)
> > ==55789== by 0x1000047A8: main (ex26.c:327)
> > ==55789== Conditional jump or move depends on uninitialised value(s)
> > ==55789== at 0x102738FD7: sigaction (in /usr/lib/libSystem.B.dylib)
> > ==55789== by 0x102738FA6: signal__ (in /usr/lib/libSystem.B.dylib)
> > ==55789== by 0x100181B19: PetscDefaultSignalHandler (signal.c:143)
> > ==55789== by 0x1001818AE: PetscSignalHandler_Private (signal.c:53)
> > ==55789== by 0x138048EF3: ???
> > ==55789== by 0x10000637F: FormIFunctionLocal (ex26.c:579)
> > ==55789== by 0x100007791: FormIFunction (ex26.c:680)
> > ==55789== by 0x100DACF67: TSComputeIFunction (ts.c:374)
> > ==55789== by 0x100DA3778: SNESTSFormFunction_Theta (theta.c:204)
> > ==55789== by 0x100DC2D27: SNESTSFormFunction (ts.c:2560)
> > ==55789== by 0x100CACE87: SNESComputeFunction (snes.c:1901)
> > ==55789== by 0x100CF7651: SNESSolve_LS (ls.c:161)
> > ==55789== by 0x100CBF982: SNESSolve (snes.c:3549)
> > ==55789== by 0x100DA223F: TSStep_Theta (theta.c:114)
> > ==55789== by 0x100DBC4B9: TSStep (ts.c:1827)
> > ==55789== by 0x100DBDA12: TSSolve (ts.c:1951)
> > ==55789== by 0x1000047A8: main (ex26.c:327)
> > ==55789== Conditional jump or move depends on uninitialised value(s)
> > ==55789== at 0x102738FDC: sigaction (in /usr/lib/libSystem.B.dylib)
> > ==55789== by 0x102738FA6: signal__ (in /usr/lib/libSystem.B.dylib)
> > ==55789== by 0x100181B19: PetscDefaultSignalHandler (signal.c:143)
> > ==55789== by 0x1001818AE: PetscSignalHandler_Private (signal.c:53)
> > ==55789== by 0x138048EF3: ???
> > ==55789== by 0x10000637F: FormIFunctionLocal (ex26.c:579)
> > ==55789== by 0x100007791: FormIFunction (ex26.c:680)
> > ==55789== by 0x100DACF67: TSComputeIFunction (ts.c:374)
> > ==55789== by 0x100DA3778: SNESTSFormFunction_Theta (theta.c:204)
> > ==55789== by 0x100DC2D27: SNESTSFormFunction (ts.c:2560)
> > ==55789== by 0x100CACE87: SNESComputeFunction (snes.c:1901)
> > ==55789== by 0x100CF7651: SNESSolve_LS (ls.c:161)
> > ==55789== by 0x100CBF982: SNESSolve (snes.c:3549)
> > ==55789== by 0x100DA223F: TSStep_Theta (theta.c:114)
> > ==55789== by 0x100DBC4B9: TSStep (ts.c:1827)
> > ==55789== by 0x100DBDA12: TSSolve (ts.c:1951)
> > ==55789== by 0x1000047A8: main (ex26.c:327)
> > ==55789== Conditional jump or move depends on uninitialised value(s)
> > ==55789== at 0x102738FE1: sigaction (in /usr/lib/libSystem.B.dylib)
> > ==55789== by 0x102738FA6: signal__ (in /usr/lib/libSystem.B.dylib)
> > ==55789== by 0x100181B19: PetscDefaultSignalHandler (signal.c:143)
> > ==55789== by 0x1001818AE: PetscSignalHandler_Private (signal.c:53)
> > ==55789== by 0x138048EF3: ???
> > ==55789== by 0x10000637F: FormIFunctionLocal (ex26.c:579)
> > ==55789== by 0x100007791: FormIFunction (ex26.c:680)
> > ==55789== by 0x100DACF67: TSComputeIFunction (ts.c:374)
> > ==55789== by 0x100DA3778: SNESTSFormFunction_Theta (theta.c:204)
> > ==55789== by 0x100DC2D27: SNESTSFormFunction (ts.c:2560)
> > ==55789== by 0x100CACE87: SNESComputeFunction (snes.c:1901)
> > ==55789== by 0x100CF7651: SNESSolve_LS (ls.c:161)
> > ==55789== by 0x100CBF982: SNESSolve (snes.c:3549)
> > ==55789== by 0x100DA223F: TSStep_Theta (theta.c:114)
> > ==55789== by 0x100DBC4B9: TSStep (ts.c:1827)
> > ==55789== by 0x100DBDA12: TSSolve (ts.c:1951)
> > ==55789== by 0x1000047A8: main (ex26.c:327)
> > ==55789== Syscall param sigaction(signum) contains uninitialised byte(s)
> > ==55789== at 0x102739032: __sigaction (in /usr/lib/libSystem.B.dylib)
> > ==55789== by 0x102738FA6: signal__ (in /usr/lib/libSystem.B.dylib)
> > ==55789== by 0x100181B19: PetscDefaultSignalHandler (signal.c:143)
> > ==55789== by 0x1001818AE: PetscSignalHandler_Private (signal.c:53)
> > ==55789== by 0x138048EF3: ???
> > ==55789== by 0x10000637F: FormIFunctionLocal (ex26.c:579)
> > ==55789== by 0x100007791: FormIFunction (ex26.c:680)
> > ==55789== by 0x100DACF67: TSComputeIFunction (ts.c:374)
> > ==55789== by 0x100DA3778: SNESTSFormFunction_Theta (theta.c:204)
> > ==55789== by 0x100DC2D27: SNESTSFormFunction (ts.c:2560)
> > ==55789== by 0x100CACE87: SNESComputeFunction (snes.c:1901)
> > ==55789== by 0x100CF7651: SNESSolve_LS (ls.c:161)
> > ==55789== by 0x100CBF982: SNESSolve (snes.c:3549)
> > ==55789== by 0x100DA223F: TSStep_Theta (theta.c:114)
> > ==55789== by 0x100DBC4B9: TSStep (ts.c:1827)
> > ==55789== by 0x100DBDA12: TSSolve (ts.c:1951)
> > ==55789== by 0x1000047A8: main (ex26.c:327)
> > ==55789== Process terminating with default action of signal 11 (SIGSEGV)
> > ==55789== General Protection Fault
> > ==55789== at 0x1027F0FC1: dyld_stub_binder (in /usr/lib/libSystem.B.dylib)
> > ==55789== by 0x101C4549F: ??? (in /Users/barrysmith/Src/petsc-dev/arch-gnu/lib/libpetsc.dylib)
> > ==55789== by 0x100181B31: PetscDefaultSignalHandler (signal.c:144)
> > ==55789== by 0x1001818AE: PetscSignalHandler_Private (signal.c:53)
> > ==55789== by 0x138048EF3: ???
> > ==55789== by 0x10000637F: FormIFunctionLocal (ex26.c:579)
> > ==55789== by 0x100007791: FormIFunction (ex26.c:680)
> > ==55789== by 0x100DACF67: TSComputeIFunction (ts.c:374)
> > ==55789== by 0x100DA3778: SNESTSFormFunction_Theta (theta.c:204)
> > ==55789== by 0x100DC2D27: SNESTSFormFunction (ts.c:2560)
> > ==55789== by 0x100CACE87: SNESComputeFunction (snes.c:1901)
> > ==55789== by 0x100CF7651: SNESSolve_LS (ls.c:161)
> > ==55789== by 0x100CBF982: SNESSolve (snes.c:3549)
> > ==55789== by 0x100DA223F: TSStep_Theta (theta.c:114)
> > ==55789== by 0x100DBC4B9: TSStep (ts.c:1827)
> > ==55789== by 0x100DBDA12: TSSolve (ts.c:1951)
> > ==55789== by 0x1000047A8: main (ex26.c:327)
> > ==55788== Invalid read of size 8
> > ==55788== at 0x100006443: FormIFunctionLocal (ex26.c:581)
> > ==55788== by 0x100007791: FormIFunction (ex26.c:680)
> > ==55788== by 0x100DACF67: TSComputeIFunction (ts.c:374)
> > ==55788== by 0x100DA3778: SNESTSFormFunction_Theta (theta.c:204)
> > ==55788== by 0x100DC2D27: SNESTSFormFunction (ts.c:2560)
> > ==55788== by 0x100CACE87: SNESComputeFunction (snes.c:1901)
> > ==55788== by 0x100CF7651: SNESSolve_LS (ls.c:161)
> > ==55788== by 0x100CBF982: SNESSolve (snes.c:3549)
> > ==55788== by 0x100DA223F: TSStep_Theta (theta.c:114)
> > ==55788== by 0x100DBC4B9: TSStep (ts.c:1827)
> > ==55788== by 0x100DBDA12: TSSolve (ts.c:1951)
> > ==55788== by 0x1000047A8: main (ex26.c:327)
> > ==55788== Address 0x1045cf920 is 1,936 bytes inside a block of size 1,940 alloc'd
> > ==55788== at 0x1000251EF: malloc (vg_replace_malloc.c:236)
> > ==55788== by 0x1000CC71B: PetscMallocAlign (mal.c:37)
> > ==55788== by 0x1000CDF89: PetscTrMallocDefault (mtr.c:190)
> > ==55788== by 0x1002B338B: VecGetArray2d (rvector.c:1759)
> > ==55788== by 0x1009430B0: DMDAVecGetArray (dagetarray.c:72)
> > ==55788== by 0x100007536: FormIFunction (ex26.c:673)
> > ==55788== by 0x100DACF67: TSComputeIFunction (ts.c:374)
> > ==55788== by 0x100DA3778: SNESTSFormFunction_Theta (theta.c:204)
> > ==55788== by 0x100DC2D27: SNESTSFormFunction (ts.c:2560)
> > ==55788== by 0x100CACE87: SNESComputeFunction (snes.c:1901)
> > ==55788== by 0x100CF7651: SNESSolve_LS (ls.c:161)
> > ==55788== by 0x100CBF982: SNESSolve (snes.c:3549)
> > ==55788== by 0x100DA223F: TSStep_Theta (theta.c:114)
> > ==55788== by 0x100DBC4B9: TSStep (ts.c:1827)
> > ==55788== by 0x100DBDA12: TSSolve (ts.c:1951)
> > ==55788== by 0x1000047A8: main (ex26.c:327)
> > ==55788== ---- Attach to debugger ? --- [Return/N/n/Y/y/C/c] ----
> > =====================================================================================
> > = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
> > = EXIT CODE: 11
> > = CLEANING UP REMAINING PROCESSES
> > = YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
> > =================================================================================
