[petsc-dev] simple way to use --db-attach with valgrind in parallel with mpiexec

Barry Smith bsmith at mcs.anl.gov
Mon Jul 2 22:54:12 CDT 2012


On Jul 1, 2012, at 3:34 PM, Jed Brown wrote:

> Seems that way, it works for me on Linux.

   Yup.

    BTW: I added support for this to petscmpiexec, the script that only I use.

   Barry

> 
> On Sat, Jun 30, 2012 at 7:40 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
> 
>    Both windows just disappear and the program ends if I hit y or Y. Maybe a Mac thing.
> 
>    Barry
> 
> On Jun 30, 2012, at 7:51 PM, Jed Brown wrote:
> 
> > Did you try this?
> >
> > mpiexec -n 2 xterm -e valgrind --db-attach=yes ./ex -args
> >
> > On Sat, Jun 30, 2012 at 4:33 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
> >
> >    Has anyone documented a simple way to run parallel programs with mpiexec and valgrind and use the --db-attach=yes option to use the debugger on a process to see what is going on?
> >
> >    My simple-minded attempt leads to uselessness; it tries to ask me about using the debugger but seems to be in tty hell.
> >
> >     Thanks
> >
> >     Barry
> >
> >
> > barry-smiths-macbook-pro:LuLu barrysmith$ ~/Src/petsc-dev/arch-gnu/bin/mpiexec -n 2 valgrind --tool=memcheck --db-attach=yes --dsymutil=yes --num-callers=20  -q  ./ex26 -ts_type beuler
> > --55788-- run: /usr/bin/dsymutil "./ex26"
> > --55789-- run: /usr/bin/dsymutil "./ex26"
> > error: No such file or directory - old dSYM file cannot be overwritten:         old: './ex26.dSYM/Contents/Resources/DWARF/ex26'
> >         new:'./ex26.dSYM/Contents/Resources/DWARF/ex26.x86_64'.
> > --55789-- run: /usr/bin/dsymutil "/Users/barrysmith/Src/petsc-dev/arch-gnu/lib/libpetsc.dylib"
> > --55788-- run: /usr/bin/dsymutil "/Users/barrysmith/Src/petsc-dev/arch-gnu/lib/libpetsc.dylib"
> > error: No such file or directory - old dSYM file cannot be overwritten:         old: '/Users/barrysmith/Src/petsc-dev/arch-gnu/lib/libpetsc.dylib.dSYM/Contents/Resources/DWARF/libpetsc.dylib'
> >         new:'/Users/barrysmith/Src/petsc-dev/arch-gnu/lib/libpetsc.dylib.dSYM/Contents/Resources/DWARF/libpetsc.dylib.x86_64'.
> > --55788-- run: /usr/bin/dsymutil "/Users/barrysmith/Src/petsc-dev/arch-gnu/lib/libparmetis.dylib"
> > --55789-- run: /usr/bin/dsymutil "/Users/barrysmith/Src/petsc-dev/arch-gnu/lib/libparmetis.dylib"
> > --55788-- run: /usr/bin/dsymutil "/Users/barrysmith/Src/petsc-dev/arch-gnu/lib/libmetis.dylib"
> > --55789-- run: /usr/bin/dsymutil "/Users/barrysmith/Src/petsc-dev/arch-gnu/lib/libmetis.dylib"
> >
> > warning: (x86_64) /Users/barrysmith/Src/petsc-dev/externalpackages/mpich2-1.4.1p1/lib/.tmp/wtimef.o unable to open object file
> > warning: no debug symbols in executable (-arch x86_64)
> > 4x4 grid, water viscosity = 1, oil vescosity # = 1,
> >  connate water saturation # = 0 irreducible oil saturation # =0
> > timestep 0 time 0
> > ==55789== Use of uninitialised value of size 8
> > ==55789==    at 0x100002B3C: Mobility (ex26.c:116)
> > ==55789==    by 0x10000637F: FormIFunctionLocal (ex26.c:579)
> > ==55789==    by 0x100007791: FormIFunction (ex26.c:680)
> > ==55789==    by 0x100DACF67: TSComputeIFunction (ts.c:374)
> > ==55789==    by 0x100DA3778: SNESTSFormFunction_Theta (theta.c:204)
> > ==55789==    by 0x100DC2D27: SNESTSFormFunction (ts.c:2560)
> > ==55789==    by 0x100CACE87: SNESComputeFunction (snes.c:1901)
> > ==55789==    by 0x100CF7651: SNESSolve_LS (ls.c:161)
> > ==55789==    by 0x100CBF982: SNESSolve (snes.c:3549)
> > ==55789==    by 0x100DA223F: TSStep_Theta (theta.c:114)
> > ==55789==    by 0x100DBC4B9: TSStep (ts.c:1827)
> > ==55789==    by 0x100DBDA12: TSSolve (ts.c:1951)
> > ==55789==    by 0x1000047A8: main (ex26.c:327)
> > ==55789==
> > ==55789==
> > ==55789== ---- Attach to debugger ? --- [Return/N/n/Y/y/C/c] ---- ==55789== Invalid read of size 8
> > ==55789==    at 0x100002B3C: Mobility (ex26.c:116)
> > ==55789==    by 0x10000637F: FormIFunctionLocal (ex26.c:579)
> > ==55789==    by 0x100007791: FormIFunction (ex26.c:680)
> > ==55789==    by 0x100DACF67: TSComputeIFunction (ts.c:374)
> > ==55789==    by 0x100DA3778: SNESTSFormFunction_Theta (theta.c:204)
> > ==55789==    by 0x100DC2D27: SNESTSFormFunction (ts.c:2560)
> > ==55789==    by 0x100CACE87: SNESComputeFunction (snes.c:1901)
> > ==55789==    by 0x100CF7651: SNESSolve_LS (ls.c:161)
> > ==55789==    by 0x100CBF982: SNESSolve (snes.c:3549)
> > ==55789==    by 0x100DA223F: TSStep_Theta (theta.c:114)
> > ==55789==    by 0x100DBC4B9: TSStep (ts.c:1827)
> > ==55789==    by 0x100DBDA12: TSSolve (ts.c:1951)
> > ==55789==    by 0x1000047A8: main (ex26.c:327)
> > ==55789==  Address 0x8 is not stack'd, malloc'd or (recently) free'd
> > ==55789==
> > ==55789== Conditional jump or move depends on uninitialised value(s)
> > ==55789==    at 0x102738F7B: signal__ (in /usr/lib/libSystem.B.dylib)
> > ==55789==    by 0x100181B19: PetscDefaultSignalHandler (signal.c:143)
> > ==55789==    by 0x1001818AE: PetscSignalHandler_Private (signal.c:53)
> > ==55789==    by 0x138048EF3: ???
> > ==55789==    by 0x10000637F: FormIFunctionLocal (ex26.c:579)
> > ==55789==    by 0x100007791: FormIFunction (ex26.c:680)
> > ==55789==    by 0x100DACF67: TSComputeIFunction (ts.c:374)
> > ==55789==    by 0x100DA3778: SNESTSFormFunction_Theta (theta.c:204)
> > ==55789==    by 0x100DC2D27: SNESTSFormFunction (ts.c:2560)
> > ==55789==    by 0x100CACE87: SNESComputeFunction (snes.c:1901)
> > ==55789==    by 0x100CF7651: SNESSolve_LS (ls.c:161)
> > ==55789==    by 0x100CBF982: SNESSolve (snes.c:3549)
> > ==55789==    by 0x100DA223F: TSStep_Theta (theta.c:114)
> > ==55789==    by 0x100DBC4B9: TSStep (ts.c:1827)
> > ==55789==    by 0x100DBDA12: TSSolve (ts.c:1951)
> > ==55789==    by 0x1000047A8: main (ex26.c:327)
> > ==55789==
> > ==55789== Conditional jump or move depends on uninitialised value(s)
> > ==55789==    at 0x102738FD7: sigaction (in /usr/lib/libSystem.B.dylib)
> > ==55789==    by 0x102738FA6: signal__ (in /usr/lib/libSystem.B.dylib)
> > ==55789==    by 0x100181B19: PetscDefaultSignalHandler (signal.c:143)
> > ==55789==    by 0x1001818AE: PetscSignalHandler_Private (signal.c:53)
> > ==55789==    by 0x138048EF3: ???
> > ==55789==    by 0x10000637F: FormIFunctionLocal (ex26.c:579)
> > ==55789==    by 0x100007791: FormIFunction (ex26.c:680)
> > ==55789==    by 0x100DACF67: TSComputeIFunction (ts.c:374)
> > ==55789==    by 0x100DA3778: SNESTSFormFunction_Theta (theta.c:204)
> > ==55789==    by 0x100DC2D27: SNESTSFormFunction (ts.c:2560)
> > ==55789==    by 0x100CACE87: SNESComputeFunction (snes.c:1901)
> > ==55789==    by 0x100CF7651: SNESSolve_LS (ls.c:161)
> > ==55789==    by 0x100CBF982: SNESSolve (snes.c:3549)
> > ==55789==    by 0x100DA223F: TSStep_Theta (theta.c:114)
> > ==55789==    by 0x100DBC4B9: TSStep (ts.c:1827)
> > ==55789==    by 0x100DBDA12: TSSolve (ts.c:1951)
> > ==55789==    by 0x1000047A8: main (ex26.c:327)
> > ==55789==
> > ==55789== Conditional jump or move depends on uninitialised value(s)
> > ==55789==    at 0x102738FDC: sigaction (in /usr/lib/libSystem.B.dylib)
> > ==55789==    by 0x102738FA6: signal__ (in /usr/lib/libSystem.B.dylib)
> > ==55789==    by 0x100181B19: PetscDefaultSignalHandler (signal.c:143)
> > ==55789==    by 0x1001818AE: PetscSignalHandler_Private (signal.c:53)
> > ==55789==    by 0x138048EF3: ???
> > ==55789==    by 0x10000637F: FormIFunctionLocal (ex26.c:579)
> > ==55789==    by 0x100007791: FormIFunction (ex26.c:680)
> > ==55789==    by 0x100DACF67: TSComputeIFunction (ts.c:374)
> > ==55789==    by 0x100DA3778: SNESTSFormFunction_Theta (theta.c:204)
> > ==55789==    by 0x100DC2D27: SNESTSFormFunction (ts.c:2560)
> > ==55789==    by 0x100CACE87: SNESComputeFunction (snes.c:1901)
> > ==55789==    by 0x100CF7651: SNESSolve_LS (ls.c:161)
> > ==55789==    by 0x100CBF982: SNESSolve (snes.c:3549)
> > ==55789==    by 0x100DA223F: TSStep_Theta (theta.c:114)
> > ==55789==    by 0x100DBC4B9: TSStep (ts.c:1827)
> > ==55789==    by 0x100DBDA12: TSSolve (ts.c:1951)
> > ==55789==    by 0x1000047A8: main (ex26.c:327)
> > ==55789==
> > ==55789== Conditional jump or move depends on uninitialised value(s)
> > ==55789==    at 0x102738FE1: sigaction (in /usr/lib/libSystem.B.dylib)
> > ==55789==    by 0x102738FA6: signal__ (in /usr/lib/libSystem.B.dylib)
> > ==55789==    by 0x100181B19: PetscDefaultSignalHandler (signal.c:143)
> > ==55789==    by 0x1001818AE: PetscSignalHandler_Private (signal.c:53)
> > ==55789==    by 0x138048EF3: ???
> > ==55789==    by 0x10000637F: FormIFunctionLocal (ex26.c:579)
> > ==55789==    by 0x100007791: FormIFunction (ex26.c:680)
> > ==55789==    by 0x100DACF67: TSComputeIFunction (ts.c:374)
> > ==55789==    by 0x100DA3778: SNESTSFormFunction_Theta (theta.c:204)
> > ==55789==    by 0x100DC2D27: SNESTSFormFunction (ts.c:2560)
> > ==55789==    by 0x100CACE87: SNESComputeFunction (snes.c:1901)
> > ==55789==    by 0x100CF7651: SNESSolve_LS (ls.c:161)
> > ==55789==    by 0x100CBF982: SNESSolve (snes.c:3549)
> > ==55789==    by 0x100DA223F: TSStep_Theta (theta.c:114)
> > ==55789==    by 0x100DBC4B9: TSStep (ts.c:1827)
> > ==55789==    by 0x100DBDA12: TSSolve (ts.c:1951)
> > ==55789==    by 0x1000047A8: main (ex26.c:327)
> > ==55789==
> > ==55789== Syscall param sigaction(signum) contains uninitialised byte(s)
> > ==55789==    at 0x102739032: __sigaction (in /usr/lib/libSystem.B.dylib)
> > ==55789==    by 0x102738FA6: signal__ (in /usr/lib/libSystem.B.dylib)
> > ==55789==    by 0x100181B19: PetscDefaultSignalHandler (signal.c:143)
> > ==55789==    by 0x1001818AE: PetscSignalHandler_Private (signal.c:53)
> > ==55789==    by 0x138048EF3: ???
> > ==55789==    by 0x10000637F: FormIFunctionLocal (ex26.c:579)
> > ==55789==    by 0x100007791: FormIFunction (ex26.c:680)
> > ==55789==    by 0x100DACF67: TSComputeIFunction (ts.c:374)
> > ==55789==    by 0x100DA3778: SNESTSFormFunction_Theta (theta.c:204)
> > ==55789==    by 0x100DC2D27: SNESTSFormFunction (ts.c:2560)
> > ==55789==    by 0x100CACE87: SNESComputeFunction (snes.c:1901)
> > ==55789==    by 0x100CF7651: SNESSolve_LS (ls.c:161)
> > ==55789==    by 0x100CBF982: SNESSolve (snes.c:3549)
> > ==55789==    by 0x100DA223F: TSStep_Theta (theta.c:114)
> > ==55789==    by 0x100DBC4B9: TSStep (ts.c:1827)
> > ==55789==    by 0x100DBDA12: TSSolve (ts.c:1951)
> > ==55789==    by 0x1000047A8: main (ex26.c:327)
> > ==55789==
> > ==55789==
> > ==55789== Process terminating with default action of signal 11 (SIGSEGV)
> > ==55789==  General Protection Fault
> > ==55789==    at 0x1027F0FC1: dyld_stub_binder (in /usr/lib/libSystem.B.dylib)
> > ==55789==    by 0x101C4549F: ??? (in /Users/barrysmith/Src/petsc-dev/arch-gnu/lib/libpetsc.dylib)
> > ==55789==    by 0x100181B31: PetscDefaultSignalHandler (signal.c:144)
> > ==55789==    by 0x1001818AE: PetscSignalHandler_Private (signal.c:53)
> > ==55789==    by 0x138048EF3: ???
> > ==55789==    by 0x10000637F: FormIFunctionLocal (ex26.c:579)
> > ==55789==    by 0x100007791: FormIFunction (ex26.c:680)
> > ==55789==    by 0x100DACF67: TSComputeIFunction (ts.c:374)
> > ==55789==    by 0x100DA3778: SNESTSFormFunction_Theta (theta.c:204)
> > ==55789==    by 0x100DC2D27: SNESTSFormFunction (ts.c:2560)
> > ==55789==    by 0x100CACE87: SNESComputeFunction (snes.c:1901)
> > ==55789==    by 0x100CF7651: SNESSolve_LS (ls.c:161)
> > ==55789==    by 0x100CBF982: SNESSolve (snes.c:3549)
> > ==55789==    by 0x100DA223F: TSStep_Theta (theta.c:114)
> > ==55789==    by 0x100DBC4B9: TSStep (ts.c:1827)
> > ==55789==    by 0x100DBDA12: TSSolve (ts.c:1951)
> > ==55789==    by 0x1000047A8: main (ex26.c:327)
> > ==55788== Invalid read of size 8
> > ==55788==    at 0x100006443: FormIFunctionLocal (ex26.c:581)
> > ==55788==    by 0x100007791: FormIFunction (ex26.c:680)
> > ==55788==    by 0x100DACF67: TSComputeIFunction (ts.c:374)
> > ==55788==    by 0x100DA3778: SNESTSFormFunction_Theta (theta.c:204)
> > ==55788==    by 0x100DC2D27: SNESTSFormFunction (ts.c:2560)
> > ==55788==    by 0x100CACE87: SNESComputeFunction (snes.c:1901)
> > ==55788==    by 0x100CF7651: SNESSolve_LS (ls.c:161)
> > ==55788==    by 0x100CBF982: SNESSolve (snes.c:3549)
> > ==55788==    by 0x100DA223F: TSStep_Theta (theta.c:114)
> > ==55788==    by 0x100DBC4B9: TSStep (ts.c:1827)
> > ==55788==    by 0x100DBDA12: TSSolve (ts.c:1951)
> > ==55788==    by 0x1000047A8: main (ex26.c:327)
> > ==55788==  Address 0x1045cf920 is 1,936 bytes inside a block of size 1,940 alloc'd
> > ==55788==    at 0x1000251EF: malloc (vg_replace_malloc.c:236)
> > ==55788==    by 0x1000CC71B: PetscMallocAlign (mal.c:37)
> > ==55788==    by 0x1000CDF89: PetscTrMallocDefault (mtr.c:190)
> > ==55788==    by 0x1002B338B: VecGetArray2d (rvector.c:1759)
> > ==55788==    by 0x1009430B0: DMDAVecGetArray (dagetarray.c:72)
> > ==55788==    by 0x100007536: FormIFunction (ex26.c:673)
> > ==55788==    by 0x100DACF67: TSComputeIFunction (ts.c:374)
> > ==55788==    by 0x100DA3778: SNESTSFormFunction_Theta (theta.c:204)
> > ==55788==    by 0x100DC2D27: SNESTSFormFunction (ts.c:2560)
> > ==55788==    by 0x100CACE87: SNESComputeFunction (snes.c:1901)
> > ==55788==    by 0x100CF7651: SNESSolve_LS (ls.c:161)
> > ==55788==    by 0x100CBF982: SNESSolve (snes.c:3549)
> > ==55788==    by 0x100DA223F: TSStep_Theta (theta.c:114)
> > ==55788==    by 0x100DBC4B9: TSStep (ts.c:1827)
> > ==55788==    by 0x100DBDA12: TSSolve (ts.c:1951)
> > ==55788==    by 0x1000047A8: main (ex26.c:327)
> > ==55788==
> > ==55788==
> > ==55788== ---- Attach to debugger ? --- [Return/N/n/Y/y/C/c] ----
> > =====================================================================================
> > =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
> > =   EXIT CODE: 11
> > =   CLEANING UP REMAINING PROCESSES
> > =   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
> > =================================================================================
> >
> >
> 
> 
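For reference, the approach suggested in the thread can be sketched as a small shell helper. Each rank gets its own xterm, so every Valgrind "Attach to debugger ?" prompt has a private terminal instead of sharing the one mpiexec owns. The function name build_cmd is hypothetical, and it only prints the command instead of running it. The thread reports this working on Linux, while on a Mac the windows closed when y was pressed. As far as I am aware, Valgrind 3.11 and later no longer accept --db-attach and use the gdbserver mechanism (--vgdb-error) instead, so treat this as a sketch for older releases.

```shell
#!/bin/sh
# Sketch: build the mpiexec command that gives every rank its own xterm,
# so each Valgrind debugger prompt reads from a private terminal.
# build_cmd is a hypothetical helper name; it prints the command only.
build_cmd() {
    nprocs=$1   # number of MPI ranks
    shift       # remaining arguments: the program and its options
    echo "mpiexec -n $nprocs xterm -e valgrind --db-attach=yes $*"
}

# Print the command for the two-rank ex26 run shown in the thread.
build_cmd 2 ./ex26 -ts_type beuler
```

Printing first makes it easy to check the command before pasting it into a shell that has an X display available.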



