[petsc-users] Debugger question

Tabrez Ali stali at geology.wisc.edu
Mon Apr 2 22:04:00 CDT 2012


Satish

Things work fine on my Linux machine (and other Linux clusters) and
valgrind shows no errors. Unfortunately TotalView (the GUI starts fine on the
node) gives me a licensing error on the Cray.

I will continue to explore.

Thanks
Tabrez

On 04/02/2012 09:05 PM, Satish Balay wrote:
> Sounds like a Cray machine.
>
> start_in_debugger is useful for debugging on workstations [or
> clusters] etc. where there is some control over X11 tunnels. Also,
> 'xterm', 'gdb' or a similar debugger should be available on the compute
> nodes [along with an X/ssh tunnel].
>
> On a Cray, you are better off looking for a parallel debugger. I don't
> know if Cray has one available.
>
> Wrt debugging, you might want to run your code with valgrind on a
> Linux box.
>
> Satish
>
> On Mon, 2 Apr 2012, Tabrez Ali wrote:
>
>> Hello
>>
>> I am trying to debug a program using the switch '-on_error_attach_debugger'
>> but the vendor/sysadmin-built PETSc 3.2.00 is unable to start the debugger in
>> xterm (see text below), even though xterm is installed. What am I doing wrong?
>>
>> Btw the segfault happens during a call to MatMult, but only with the
>> vendor/sysadmin-supplied PETSc 3.2 built with the PGI and Intel compilers,
>> and _not_ with the CRAY or GNU compilers.
>>
>> I also don't get the segfault if I build PETSc 3.2-p7 myself with PGI/Intel
>> compilers.
>>
>> Any ideas on how to diagnose the problem? Unfortunately I cannot seem to run
>> valgrind on this particular machine.
>>
>> Thanks in advance.
>>
>> Tabrez
>>
>> ---
>>
>> stali@krakenpf1:~/meshes>  which xterm
>> /usr/bin/xterm
>> stali@krakenpf1:~/meshes>  aprun -n 1 ./defmod -f 2d_point_load_dyn_abc.inp
>> -on_error_attach_debugger
>> ...
>> ...
>> ...
>> [0]PETSC ERROR:
>> ------------------------------------------------------------------------
>> [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably
>> memory access out of range
>> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
>> [0]PETSC ERROR: or see
>> http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#valgrind[0]PETSC
>> ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find
>> memory corruption errors
>> [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run
>> [0]PETSC ERROR: to get more information on the crash.
>> [0]PETSC ERROR: User provided function() line 0 in unknown directory unknown
>> file
>> [0]PETSC ERROR: PETSC: Attaching gdb to ./defmod of pid 32384 on display
>> localhost:20.0 on machine nid10649
>> Unable to start debugger in xterm: No such file or directory
>> aborting job:
>> application called MPI_Abort(MPI_COMM_WORLD, 0) - process 0
>> _pmii_daemon(SIGCHLD): [NID 10649] [c23-3c0s6n1] [Mon Apr  2 13:06:48 2012] PE
>> 0 exit signal Aborted
>> Application 133198 exit codes: 134
>> Application 133198 resources: utime ~1s, stime ~0s
>>
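
A note on the "Unable to start debugger in xterm: No such file or directory"
line in the log above: 'which xterm' was run on the login node (krakenpf1),
while the attach was attempted on the compute node (nid10649). One possible
reading, consistent with Satish's remark that xterm and gdb need to be
available on the compute nodes, is that one of them is missing there. The
following is a minimal sketch for checking that, not a confirmed diagnosis.
The function name check_tools and the script name check_tools.sh are
hypothetical, and it assumes the script sits on a filesystem the compute
nodes can see.

```shell
#!/bin/sh
# Report whether each named tool can be found on PATH on the current node.
# The tool names (xterm, gdb) come from the thread; check_tools is a
# hypothetical helper, not part of PETSc or the Cray tools.
check_tools() {
    missing=0
    for tool in "$@"; do
        if path=$(command -v "$tool" 2>/dev/null); then
            echo "found: $tool -> $path"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    # Exit status is 0 only if every tool was found.
    return "$missing"
}

# Login node:    ./check_tools.sh
# Compute node:  aprun -n 1 ./check_tools.sh   (assumes the script is visible there)
check_tools xterm gdb || echo "at least one tool is missing on $(uname -n)"
```

Running it once directly on the login node and once through aprun would show
whether the two environments differ.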


