[petsc-users] Valgrind unhandled instruction

Tabrez Ali stali at geology.wisc.edu
Wed May 21 17:57:52 CDT 2014


Sorry, I missed the flags. Thanks for the clarification.

Tabrez

On 05/21/2014 05:27 PM, Satish Balay wrote:
> Looks like valgrind-3.7 doesn't recognize all the instructions generated by
> "-O3 -march=native".
>
> And generally one should run valgrind on code compiled with '-g'
> anyway.
>
> I see a similar issue with valgrind-3.7, but the error goes away with
> valgrind-3.9 [compiled from source].
>
> Satish
>
> -----------
>
> balay at es^/scratch/balay/petsc/src/ksp/ksp/examples/tutorials(master) $ valgrind --version
> valgrind-3.7.0
> balay at es^/scratch/balay/petsc/src/ksp/ksp/examples/tutorials(master) $ valgrind --tool=memcheck -q  ./ex56 -ne 9 -alpha 1.e-3 -pc_type gamg -pc_gamg_type agg -pc_gamg_agg_nsmooths 1 -pc_gamg_coarse_eq_limit 10 -pc_gamg_reuse_interpolation true -two_solves -ksp_monitor_short -use_mat_nearnullspace
> vex amd64->IR: unhandled instruction bytes: 0xC5 0xFB 0x2A 0xC2 0xBA 0x1 0x0 0x0
> ==19041== valgrind: Unrecognised instruction at address 0x4ef760e.
> ==19041==    at 0x4EF760E: PetscSetDisplay (in /scratch/balay/petsc/arch-test/lib/libpetsc.so.3.04.4)
> ==19041==    by 0x4F4BB1D: PetscOptionsCheckInitial_Private (in /scratch/balay/petsc/arch-test/lib/libpetsc.so.3.04.4)
> ==19041==    by 0x4F51996: PetscInitialize (in /scratch/balay/petsc/arch-test/lib/libpetsc.so.3.04.4)
> ==19041==    by 0x401D15: main (in /scratch/balay/petsc/src/ksp/ksp/examples/tutorials/ex56)
> ==19041== Your program just tried to execute an instruction that Valgrind
> ==19041== did not recognise.  There are two possible reasons for this.
> ==19041== 1. Your program has a bug and erroneously jumped to a non-code
> ==19041==    location.  If you are running Memcheck and you just saw a
> ==19041==    warning about a bad jump, it's probably your program's fault.
> ==19041== 2. The instruction is legitimate but Valgrind doesn't handle it,
> ==19041==    i.e. it's Valgrind's fault.  If you think this is the case or
> ==19041==    you are not sure, please let us know and we'll try to fix it.
> ==19041== Either way, Valgrind will now raise a SIGILL signal which will
> ==19041== probably kill your program.
> ==19041==
> ==19041== Process terminating with default action of signal 4 (SIGILL)
> ==19041==  Illegal opcode at address 0x4EF760E
> ==19041==    at 0x4EF760E: PetscSetDisplay (in /scratch/balay/petsc/arch-test/lib/libpetsc.so.3.04.4)
> ==19041==    by 0x4F4BB1D: PetscOptionsCheckInitial_Private (in /scratch/balay/petsc/arch-test/lib/libpetsc.so.3.04.4)
> ==19041==    by 0x4F51996: PetscInitialize (in /scratch/balay/petsc/arch-test/lib/libpetsc.so.3.04.4)
> ==19041==    by 0x401D15: main (in /scratch/balay/petsc/src/ksp/ksp/examples/tutorials/ex56)
> Illegal instruction (core dumped)
> balay at es^/scratch/balay/petsc/src/ksp/ksp/examples/tutorials(master) $ /scratch/balay/valgrind-3.9.0/vg-in-place --version
> valgrind-3.9.0
> balay at es^/scratch/balay/petsc/src/ksp/ksp/examples/tutorials(master) $ /scratch/balay/valgrind-3.9.0/vg-in-place --tool=memcheck -q  ./ex56 -ne 9 -alpha 1.e-3 -pc_type gamg -pc_gamg_type agg -pc_gamg_agg_nsmooths 1 -pc_gamg_coarse_eq_limit 10 -pc_gamg_reuse_interpolation true -two_solves -ksp_monitor_short -use_mat_nearnullspace
>    0 KSP Residual norm 740.547
>    1 KSP Residual norm 104.004
>    2 KSP Residual norm 79.1334
>    3 KSP Residual norm 50.0497
>    4 KSP Residual norm 4.40859
>    5 KSP Residual norm 1.56451
>    6 KSP Residual norm 0.601773
>    7 KSP Residual norm 0.225864
>    8 KSP Residual norm 0.0122203
>    9 KSP Residual norm 0.00290625
>    0 KSP Residual norm 0.00740547
>    1 KSP Residual norm 0.00104004
>    2 KSP Residual norm 0.000791334
>    3 KSP Residual norm 0.000500497
>    4 KSP Residual norm 4.40859e-05
>    5 KSP Residual norm 1.56451e-05
>    6 KSP Residual norm 6.01773e-06
>    7 KSP Residual norm 2.25864e-06
>    8 KSP Residual norm 1.22203e-07
>    9 KSP Residual norm 2.90625e-08
>    0 KSP Residual norm 7.40547e-08
>    1 KSP Residual norm 1.04004e-08
>    2 KSP Residual norm 7.91334e-09
>    3 KSP Residual norm 5.00497e-09
>    4 KSP Residual norm 4.409e-10
>    5 KSP Residual norm 1.565e-10
>    6 KSP Residual norm 6.018e-11
>    7 KSP Residual norm 2.259e-11
>    8 KSP Residual norm < 1.e-11
>    9 KSP Residual norm < 1.e-11
> [0]main |b-Ax|/|b|=6.068344e-05, |b|=5.391826e+00, emax=9.964453e-01
> balay at es^/scratch/balay/petsc/src/ksp/ksp/examples/tutorials(master) $
>
>
> On Wed, 21 May 2014, Tabrez Ali wrote:
>
>> Hello
>>
>> With petsc-dev I get the following error with my own code and also with ex56
>> as shown below. Both run fine otherwise. This is with Valgrind 3.7 (in Debian
>> stable).
>>
>> Is this a PETSc or Valgrind issue?
>>
>> T
>>
>> stali at i5:~/petsc-dev/src/ksp/ksp/examples/tutorials$ valgrind ./ex56 -ne 9 -alpha 1.e-3 -pc_type gamg -pc_gamg_type agg -pc_gamg_agg_nsmooths 1 -pc_gamg_coarse_eq_limit 10 -pc_gamg_reuse_interpolation true -two_solves -ksp_monitor_short -use_mat_nearnullspace
>> ==16123== Memcheck, a memory error detector
>> ==16123== Copyright (C) 2002-2011, and GNU GPL'd, by Julian Seward et al.
>> ==16123== Using Valgrind-3.7.0 and LibVEX; rerun with -h for copyright info
>> ==16123== Command: ./ex56 -ne 9 -alpha 1.e-3 -pc_type gamg -pc_gamg_type agg -pc_gamg_agg_nsmooths 1 -pc_gamg_coarse_eq_limit 10 -pc_gamg_reuse_interpolation true -two_solves -ksp_monitor_short -use_mat_nearnullspace
>> ==16123==
>> vex x86->IR: unhandled instruction bytes: 0x66 0xF 0x38 0x39
>> ==16123== valgrind: Unrecognised instruction at address 0x4228928.
>> ==16123==    at 0x4228928: ISCreateGeneral_Private (in /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.04.4)
>> ==16123==    by 0x4228D54: ISGeneralSetIndices_General (in /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.04.4)
>> ==16123==    by 0x4229504: ISGeneralSetIndices (in /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.04.4)
>> ==16123==    by 0x422976F: ISCreateGeneral (in /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.04.4)
>> ==16123==    by 0x4A94CAA: PCGAMGCoarsen_AGG (in /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.04.4)
>> ==16123==    by 0x4A84FD6: PCSetUp_GAMG (in /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.04.4)
>> ==16123==    by 0x49E8163: PCSetUp (in /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.04.4)
>> ==16123==    by 0x4AE6023: KSPSetUp (in /home/stali/petsc-dev/arch-linux2-c-debug/lib/libpetsc.so.3.04.4)
>> ==16123==    by 0x804C7D6: main (in /home/stali/petsc-dev/src/ksp/ksp/examples/tutorials/ex56)
>> ==16123== Your program just tried to execute an instruction that Valgrind
>> ==16123== did not recognise.  There are two possible reasons for this.
>> ==16123== 1. Your program has a bug and erroneously jumped to a non-code
>> ==16123==    location.  If you are running Memcheck and you just saw a
>> ==16123==    warning about a bad jump, it's probably your program's fault.
>> ==16123== 2. The instruction is legitimate but Valgrind doesn't handle it,
>> ==16123==    i.e. it's Valgrind's fault.  If you think this is the case or
>> ==16123==    you are not sure, please let us know and we'll try to fix it.
>> ==16123== Either way, Valgrind will now raise a SIGILL signal which will
>> ==16123== probably kill your program.
>> [0]PETSC ERROR: ------------------------------------------------------------------------
>> [0]PETSC ERROR: Caught signal number 4 Illegal instruction: Likely due to memory corruption
>> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
>> [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
>> [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
>> [0]PETSC ERROR: likely location of problem given in stack below
>> [0]PETSC ERROR: ---------------------  Stack Frames ------------------------------------
>> [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available,
>> [0]PETSC ERROR:       INSTEAD the line number of the start of the function
>> [0]PETSC ERROR:       is given.
>> [0]PETSC ERROR: [0] ISCreateGeneral_Private line 575 /home/stali/petsc-dev/src/vec/is/is/impls/general/general.c
>> [0]PETSC ERROR: [0] ISGeneralSetIndices_General line 674 /home/stali/petsc-dev/src/vec/is/is/impls/general/general.c
>> [0]PETSC ERROR: [0] ISGeneralSetIndices line 662 /home/stali/petsc-dev/src/vec/is/is/impls/general/general.c
>> [0]PETSC ERROR: [0] ISCreateGeneral line 631 /home/stali/petsc-dev/src/vec/is/is/impls/general/general.c
>> [0]PETSC ERROR: [0] PCGAMGCoarsen_AGG line 976 /home/stali/petsc-dev/src/ksp/pc/impls/gamg/agg.c
>> [0]PETSC ERROR: [0] PCSetUp_GAMG line 487 /home/stali/petsc-dev/src/ksp/pc/impls/gamg/gamg.c
>> [0]PETSC ERROR: [0] KSPSetUp line 219 /home/stali/petsc-dev/src/ksp/ksp/interface/itfunc.c
>> [0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
>> [0]PETSC ERROR: Signal received
>> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
>> [0]PETSC ERROR: Petsc Development GIT revision: v3.4.4-4344-ge0d8a6f  GIT Date: 2014-05-21 16:02:44 -0500
>> [0]PETSC ERROR: ./ex56 on a arch-linux2-c-debug named i5 by stali Wed May 21 16:41:07 2014
>> [0]PETSC ERROR: Configure options --with-fc=gfortran --with-cc=gcc --download-mpich --with-metis=1 --download-metis=1 --COPTFLAGS="-O3 -march=native" --FOPTFLAGS="-O3 -march=native" --with-shared-libraries --with-debugging=1
>> [0]PETSC ERROR: #1 User provided function() line 0 in  unknown file
>> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0
>> [unset]: aborting job:
>> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0
>> ==16123==
>> ==16123== HEAP SUMMARY:
>> ==16123==     in use at exit: 4,627,684 bytes in 1,188 blocks
>> ==16123==   total heap usage: 1,649 allocs, 461 frees, 6,073,192 bytes allocated
>> ==16123==
>> ==16123== LEAK SUMMARY:
>> ==16123==    definitely lost: 0 bytes in 0 blocks
>> ==16123==    indirectly lost: 0 bytes in 0 blocks
>> ==16123==      possibly lost: 0 bytes in 0 blocks
>> ==16123==    still reachable: 4,627,684 bytes in 1,188 blocks
>> ==16123==         suppressed: 0 bytes in 0 blocks
>> ==16123== Rerun with --leak-check=full to see details of leaked memory
>> ==16123==
>> ==16123== For counts of detected and suppressed errors, rerun with: -v
>> ==16123== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 61 from 8)
>>
>>
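
To summarize the diagnosis in this thread: valgrind-3.7.0 stopped on instructions produced by "-O3 -march=native", while valgrind-3.9.0 ran the same binary cleanly. A minimal sketch of a pre-flight check along those lines follows. The helper names valgrind_at_least and check_valgrind are hypothetical, and the 3.9 threshold is only the version observed to work here, not a documented minimum. The other remedy mentioned above is to run valgrind on a build compiled with '-g' instead.

```shell
#!/bin/sh
# Sketch only: compare a Valgrind version string against a major.minor
# threshold. The 3.9 cutoff is an assumption taken from this thread
# (3.7.0 failed, 3.9.0 worked); it is not an official minimum.

# valgrind_at_least VERSION_STRING MAJOR MINOR
# VERSION_STRING looks like the output of `valgrind --version`,
# e.g. "valgrind-3.7.0". Returns 0 if the version is >= MAJOR.MINOR.
valgrind_at_least() {
    ver=${1#valgrind-}      # drop the "valgrind-" prefix, e.g. "3.7.0"
    major=${ver%%.*}        # text before the first dot, e.g. "3"
    rest=${ver#*.}          # text after the first dot, e.g. "7.0"
    minor=${rest%%.*}       # text before the next dot, e.g. "7"
    [ "$major" -gt "$2" ] || { [ "$major" -eq "$2" ] && [ "$minor" -ge "$3" ]; }
}

# Print a one-line verdict for a given version string.
check_valgrind() {
    if valgrind_at_least "$1" 3 9; then
        echo "$1: ok"
    else
        echo "$1: older than 3.9, may not decode -march=native code"
    fi
}

check_valgrind "valgrind-3.7.0"
check_valgrind "valgrind-3.9.0"
```

On a real system the argument would come from the installed tool, for example check_valgrind "$(valgrind --version)".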

