[petsc-users] Caught signal number 11 SEGV

Francesco Brarda brardafrancesco at gmail.com
Wed Feb 24 03:39:43 CST 2021


I have never used gdb. 
Using 0 as you suggested I got this output:

$PETSC_DIR/$PETSC_ARCH/bin/mpirun -n 2 examples/rosenbrock/rosenbrock optimize -start_in_debugger noxterm -debugger_nodes 0
** PETSc DEPRECATION WARNING ** : the option -debugger_nodes is deprecated as of version 3.14 and will be removed in a future release. Please use the option -debugger_ranks instead. (Silence this warning with -options_suppress_deprecated_warnings)
PETSC: Attaching gdb to examples/rosenbrock/rosenbrock of pid 3903 on srvulx13
GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.3) 7.7.1
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from examples/rosenbrock/rosenbrock...done.
Attaching to program: /home/fbrarda/cmdstan-petsc/examples/rosenbrock/rosenbrock, process 3903
Could not attach to process.  If your uid matches the uid of the target
process, check the setting of /proc/sys/kernel/yama/ptrace_scope, or try
again as the root user.  For more details, see /etc/sysctl.d/10-ptrace.conf
ptrace: Operation not permitted.
/home/fbrarda/cmdstan-petsc/3903: No such file or directory.
(gdb) method = optimize
  optimize
    algorithm = lbfgs (Default)
      lbfgs
method = optimize
  optimize
    algorithm = lbfgs (Default)
      lbfgs
        init_alpha = 0.001 (Default)        init_alpha = 0.001 (Default)
        tol_obj = 9.9999999999999998e-13 (Default)
        tol_rel_obj = 10000 (Default)
        tol_grad = 1e-08 (Default)
        tol_rel_grad = 10000000 (Default)
        tol_obj = 9.9999999999999998e-13 (Default)
        tol_rel_obj = 10000 (Default)

        tol_param = 1e-08 (Default)
        history_size = 5 (Default)
        tol_grad = 1e-08 (Default)
    iter = 2000 (Default)
    save_iterations = 0 (Default)        tol_rel_grad = 10000000 (Default)
        tol_param = 1e-08 (Default)
        history_size = 5 (Default)
id = 0 (Default)
data
  file =  (Default)
init = 2 (Default)

    iter = 2000 (Default)
    save_iterations = 0 (Default)
random
  seed = 3666155654 (Default)
output
  file = output.csv (Default)
  diagnostic_file =  (Default)id = 0 (Default)
data
  file =  (Default)
init = 2 (Default)

  refresh = 100 (Default)

random
  seed = 3666155654 (Default)
output
  file = output.csv (Default)
  diagnostic_file =  (Default)
  refresh = 100 (Default)

Initial log joint probability = -158.559
    Iter      log prob        ||dx||      ||grad||       alpha      alpha0  # evals  Notes 
      12     -0.253535   0.000284499     0.0383658       0.001       0.001       46  LS failed, Hessian reset 
      13     -0.253535   0.000284499     0.0383658       6.528       0.001      111  LS failed, Hessian reset 
Optimization terminated with error: 
  Line search failed to achieve a sufficient decrease, no more progress can be made
[0]PETSC ERROR: PetscAbortErrorHandler: main() line 12 in src/cmdstan/main.cpp  
  To prevent termination, change the error handler using PetscPushErrorHandler()

With only 1 process, the code works.

Francesco

> On 24 Feb 2021, at 07:14, Barry Smith <bsmith at petsc.dev> wrote:
> 
> 
> -start_in_debugger noxterm -debugger_nodes 3
> 
> Use -start_in_debugger noxterm -debugger_nodes 0
> 
> When not opening a window for each debugger, it is best to have the first rank, the one associated with the tty, as the debugger node.
> 
> 
>> On Feb 23, 2021, at 3:46 PM, Francesco Brarda <brardafrancesco at gmail.com> wrote:
>> 
>> Using the command you suggested, I got:
>> 
>> fbrarda at srvulx13:~/cmdstan-petsc$ $PETSC_DIR/$PETSC_ARCH/bin/mpirun -n 2 examples/rosenbrock/rosenbrock optimize -start_in_debugger noxterm -debugger_nodes 3
>> ** PETSc DEPRECATION WARNING ** : the option -debugger_nodes is deprecated as of version 3.14 and will be removed in a future release. Please use the option -debugger_ranks instead. (Silence this warning with -options_suppress_deprecated_warnings)
>> method = optimize
>>   optimize
>>     algorithm = lbfgs (Default)
>>       lbfgs
>> method = optimize
>>   optimize
>>     algorithm = lbfgs (Default)
>>       lbfgs
>>         init_alpha = 0.001 (Default)
>>         tol_obj = 9.9999999999999998e-13 (Default)
>>         init_alpha = 0.001 (Default)
>>         tol_obj = 9.9999999999999998e-13 (Default)
>>         tol_rel_obj = 10000 (Default)
>>         tol_grad = 1e-08 (Default)        tol_rel_obj = 10000 (Default)
>>         tol_grad = 1e-08 (Default)
>>         tol_rel_grad = 10000000 (Default)
>> 
>>         tol_rel_grad = 10000000 (Default)
>>         tol_param = 1e-08 (Default)        tol_param = 1e-08 (Default)
>>         history_size = 5 (Default)
>>     iter = 2000 (Default)
>> 
>>         history_size = 5 (Default)
>>     iter = 2000 (Default)
>>     save_iterations = 0 (Default)
>> id = 0 (Default)
>> data
>>     save_iterations = 0 (Default)
>> id = 0 (Default)
>> data
>>   file =  (Default)
>>   file =  (Default)
>> init = 2 (Default)
>> random
>>   seed = 3623621468 (Default)
>> output
>>   file = output.csv (Default)init = 2 (Default)
>> random
>>   seed = 3623621468 (Default)
>> output
>>   file = output.csv (Default)
>> 
>>   diagnostic_file =  (Default)
>>   refresh = 100 (Default)
>> 
>>   diagnostic_file =  (Default)
>>   refresh = 100 (Default)
>> 
>> Initial log joint probability = -195.984
>>     Iter      log prob        ||dx||      ||grad||       alpha      alpha0  # evals  Notes 
>>       10      -0.97101    0.00292919       1.65855       0.001       0.001       46  LS failed, Hessian reset 
>>       12     -0.483952      0.001316       1.18542       0.001       0.001       77  LS failed, Hessian reset 
>>       13     -0.477916     0.0118542      0.163518        0.01       0.001      106  LS failed, Hessian reset 
>> [1]PETSC ERROR: #1 main() line 12 in src/cmdstan/main.cpp
>> [1]PETSC ERROR: PETSc Option Table entries:
>> [1]PETSC ERROR: -debugger_nodes 3
>> [1]PETSC ERROR: -start_in_debugger noxterm
>> [1]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov----------
>>  
>> And then it does not go any further. With -debugger_ranks, as the warning suggests, the output is the same. What do you think?
>> I am using a cluster (one node, a dual-socket system with twelve-core CPUs), and I do not pass the -X flag when I ssh in, if that's what you mean.
>> 
>> Thank you,
>> Francesco
>> 
>> 
>>> On 23 Feb 2021, at 21:59, Matthew Knepley <knepley at gmail.com> wrote:
>>> 
>>> On Tue, Feb 23, 2021 at 3:55 PM Francesco Brarda <brardafrancesco at gmail.com> wrote:
>>> Thank you for the quick response. 
>>> Sorry, you are right. Here is the complete output:
>>> 
>>> fbrarda at srvulx13:~/cmdstan-petsc$ $PETSC_DIR/$PETSC_ARCH/bin/mpirun -n 2 examples/rosenbrock/rosenbrock optimize -start_in_debugger
>>> PETSC: Attaching gdb to examples/rosenbrock/rosenbrock of pid 47803 on display :0.0 on machine srvulx13
>>> PETSC: Attaching gdb to examples/rosenbrock/rosenbrock of pid 47804 on display :0.0 on machine srvulx13
>>> xterm: Xt error: Can't open display: :0.0
>>> xterm: DISPLAY is not set
>>> xterm: Xt error: Can't open display: :0.0
>>> xterm: DISPLAY is not set
>>> 
>>> Do you have an X server running? If not, you can use
>>> 
>>>   -start_in_debugger noxterm -debugger_nodes 3
>>> 
>>> and try to get a stack trace from one node.
>>> 
>>>   Thanks,
>>> 
>>>     Matt
>>>  
>>> method = optimize
>>>   optimize
>>>     algorithm = lbfgs (Default)
>>>       lbfgs
>>> method = optimize
>>>   optimize
>>>     algorithm = lbfgs (Default)
>>>       lbfgs
>>>         init_alpha = 0.001 (Default)
>>>         tol_obj = 9.9999999999999998e-13 (Default)
>>>         tol_rel_obj = 10000 (Default)
>>>         tol_grad = 1e-08 (Default)
>>>         init_alpha = 0.001 (Default)
>>>         tol_obj = 9.9999999999999998e-13 (Default)
>>>         tol_rel_obj = 10000 (Default)
>>>         tol_grad = 1e-08 (Default)
>>>         tol_rel_grad = 10000000 (Default)
>>>         tol_param = 1e-08 (Default)
>>>         history_size = 5 (Default)
>>>         tol_rel_grad = 10000000 (Default)
>>>         tol_param = 1e-08 (Default)
>>>         history_size = 5 (Default)
>>>     iter = 2000 (Default)
>>>     iter = 2000 (Default)
>>>     save_iterations = 0 (Default)
>>> id = 0 (Default)
>>> data    save_iterations = 0 (Default)
>>> id = 0 (Default)
>>> data
>>>   file =  (Default)
>>> 
>>>   file =  (Default)
>>> init = 2 (Default)
>>> random
>>>   seed = 3585768430 (Default)
>>> init = 2 (Default)
>>> random
>>>   seed = 3585768430 (Default)
>>> output
>>>   file = output.csv (Default)
>>> output
>>>   file = output.csv (Default)
>>>   diagnostic_file =  (Default)
>>>   refresh = 100 (Default)
>>>   diagnostic_file =  (Default)
>>>   refresh = 100 (Default)
>>> 
>>> 
>>> Initial log joint probability = -731.444
>>>     Iter      log prob        ||dx||      ||grad||       alpha      alpha0  # evals  Notes 
>>> [1]PETSC ERROR: PetscAbortErrorHandler: main() line 12 in src/cmdstan/main.cpp  
>>>   To prevent termination, change the error handler using PetscPushErrorHandler()
>>> 
>>> ===================================================================================
>>> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>>> =   PID 47804 RUNNING AT srvulx13
>>> =   EXIT CODE: 134
>>> =   CLEANING UP REMAINING PROCESSES
>>> =   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
>>> ===================================================================================
>>> YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Aborted (signal 6)
>>> This typically refers to a problem with your application.
>>> Please see the FAQ page for debugging suggestions
>>> 
>>> 
>>> 
>>> 
>>> 
>>> The code inside main.cpp is the following:
>>> 
>>> #include <cmdstan/command.hpp>
>>> #include <stan/services/error_codes.hpp>
>>> 
>>> #include <petsc.h>
>>> 
>>> int main(int argc, char* argv[]) {
>>> 
>>>   PetscErrorCode ierr;
>>>   ierr = PetscInitialize(&argc, &argv, 0, 0);CHKERRQ(ierr);
>>> 
>>>   try {
>>>     ierr = cmdstan::command(argc, argv);CHKERRQ(ierr);
>>>   } catch (const std::exception& e) {
>>>     std::cout << e.what() << std::endl;
>>>     ierr = stan::services::error_codes::SOFTWARE;CHKERRQ(ierr);
>>>   }
>>> 
>>>   ierr = PetscFinalize();CHKERRQ(ierr);
>>>   return ierr;
>>> }
>>> 
>>> Line 12 is the call to cmdstan::command inside the try block. Although I read the manual page for PetscPushErrorHandler and the example it refers to (src/ksp/ksp/tutorials/ex27.c), I do not understand how I should actually use it.
>>> Should I replace the entire try/catch with PetscPushErrorHandler(PetscIgnoreErrorHandler,NULL);?
>>> 
>>> Best,
>>> Francesco
>>> 
>>> 
>>>> On 23 Feb 2021, at 11:54, Matthew Knepley <knepley at gmail.com> wrote:
>>>> 
>>>> On Tue, Feb 23, 2021 at 3:54 AM Francesco Brarda <brardafrancesco at gmail.com> wrote:
>>>> Hi!
>>>> 
>>>> I am very new to the PETSc world. I am working with a GitHub repo that uses PETSc together with Stan (open-source statistics software); here you can find the discussion.
>>>> The repo defines a functor to convert an EigenVector to a PetscVec and vice versa, both sequentially and in parallel.
>>>> The file that uses these functions performs the conversions with the sequential variants. I switched to the MPI variants, that is, from EigenVectorToPetscVecSeq to EigenVectorToPetscVecMPI and so on, because I want to evaluate the scaling.
>>>> Running the example in debug mode with mpirun -n 5 examples/rosenbrock/rosenbrock optimize, I get the error Caught signal number 11 SEGV. I therefore used the option -start_in_debugger and got the following:
>>>> 
>>>> For some reason, the -start_in_debugger option is not being seen. Are you showing all the output? Once the debugger is attached,
>>>> you run the program (cont), and then when you hit the SEGV you get a stack trace (where).
>>>> 
>>>>   Thanks,
>>>> 
>>>>     Matt
>>>>  
>>>> [2]PETSC ERROR: ------------------------------------------------------------------------
>>>> [2]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
>>>> [2]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
>>>> [2]PETSC ERROR: or see https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
>>>> [2]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
>>>> [2]PETSC ERROR: likely location of problem given in stack below
>>>> [2]PETSC ERROR: ---------------------  Stack Frames ------------------------------------
>>>> [2]PETSC ERROR: Note: The EXACT line numbers in the stack are not available,
>>>> [2]PETSC ERROR:       INSTEAD the line number of the start of the function
>>>> [2]PETSC ERROR:       is given.
>>>> [3]PETSC ERROR: ------------------------------------------------------------------------
>>>> [3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
>>>> [3]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
>>>> [3]PETSC ERROR: or see https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
>>>> [3]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
>>>> [3]PETSC ERROR: likely location of problem given in stack below
>>>> [3]PETSC ERROR: ---------------------  Stack Frames ------------------------------------
>>>> [3]PETSC ERROR: Note: The EXACT line numbers in the stack are not available,
>>>> [3]PETSC ERROR:       INSTEAD the line number of the start of the function
>>>> [3]PETSC ERROR:       is given.
>>>> [3]PETSC ERROR: PetscAbortErrorHandler: User provided function() line 0 in  unknown file (null)
>>>>   To prevent termination, change the error handler using PetscPushErrorHandler()
>>>> [2]PETSC ERROR: PetscAbortErrorHandler: User provided function() line 0 in  unknown file (null)
>>>>   To prevent termination, change the error handler using PetscPushErrorHandler()
>>>> 
>>>> ===================================================================================
>>>> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>>>> =   PID 22939 RUNNING AT srvulx13
>>>> =   EXIT CODE: 134
>>>> =   CLEANING UP REMAINING PROCESSES
>>>> =   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
>>>> ===================================================================================
>>>> YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Aborted (signal 6)
>>>> This typically refers to a problem with your application.
>>>> Please see the FAQ page for debugging suggestions
>>>> 
>>>> I read the documentation on PetscAbortErrorHandler, but I do not know where I should use it. How can I solve the problem?
>>>> I hope I have been clear enough.
>>>> Attached you can also find my configure.log and make.log files.
>>>> 
>>>> Best,
>>>> Francesco
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> -- 
>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
>>>> -- Norbert Wiener
>>>> 
>>>> https://www.cse.buffalo.edu/~knepley/
>>> 
>>> 
>>> 
>> 
> 
