[petsc-users] Debugger question

Tabrez Ali stali at geology.wisc.edu
Mon Apr 2 21:52:47 CDT 2012


Matt/Barry

My intention was to make sure that the code is bug-free, and since PETSc 
was pre-installed on the cluster with various compilers, it was easier to 
test quickly than to build all combinations myself. Performance is 
of absolutely no concern.

Things were working fine with 3.1 but recently the OS (Cray Linux Env) 
was upgraded and so was PETSc (to 3.2).

Matt

I am attaching the entire output.

Tabrez

---

stali@krakenpf2:~/meshes> which xterm
/usr/bin/xterm
stali@krakenpf2:~/meshes> aprun -n 1 ./defmod -f 2d_point_load_dyn_abc.inp -on_error_attach_debugger
  Reading input ...
  Reading mesh data ...
  Forming [K] ...
  Forming [M] & [M]^-1 ...
  Applying constraints ...
  Forming RHS ...
  Setting up solver ...
  Solving ...
   Time Step 0
[0]PETSC ERROR: 
------------------------------------------------------------------------
[0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, 
probably memory access out of range
[0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
[0]PETSC ERROR: or see 
http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#valgrind[0]PETSC 
ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to 
find memory corruption errors
[0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, 
and run
[0]PETSC ERROR: to get more information on the crash.
[0]PETSC ERROR: User provided function() line 0 in unknown directory 
unknown file
[0]PETSC ERROR: PETSC: Attaching gdb to ./defmod of pid 26164 on display 
:0.0 on machine nid03538
Unable to start debugger in xterm: No such file or directory
aborting job:
application called MPI_Abort(MPI_COMM_WORLD, 0) - process 0
_pmii_daemon(SIGCHLD): [NID 03538] [c12-3c2s4n2] [Mon Apr  2 22:50:09 
2012] PE 0 exit signal Aborted
Application 134950 exit codes: 134
Application 134950 resources: utime ~1s, stime ~0s
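
A possible check, sketched here under an assumption the thread does not confirm: `which xterm` above ran on the login node (krakenpf2), while `aprun` launches `./defmod` on a compute node (nid03538), so the "No such file or directory" message may mean that `xterm` or `gdb` is not visible there. The helper name `check_tool` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: report whether a program is reachable via PATH
# in the environment where this script runs.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: missing"
    fi
}

# Check the two programs named in the PETSc message above.
for tool in xterm gdb; do
    check_tool "$tool"
done
```

Saved as, say, `check_tools.sh` in a directory the compute nodes can read, it could be run with `aprun -n 1 sh ./check_tools.sh` so that the lookup happens where `./defmod` actually runs; whether that works depends on the site setup.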

On 04/02/2012 09:04 PM, Matthew Knepley wrote:
> On Mon, Apr 2, 2012 at 8:57 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
>
>
>     On Apr 2, 2012, at 8:10 PM, Tabrez Ali wrote:
>
>     > Hello
>     >
>     > I am trying to debug a program using the switch
>     '-on_error_attach_debugger' but the vendor/sysadmin built PETSc
>     3.2.00 is unable to start the debugger in xterm (see text below).
>     But xterm is installed. What am I doing wrong?
>     >
>     > Btw the segfault happens during a call to MatMult, but only with
>     the vendor/sysadmin supplied PETSc 3.2 built with the PGI and Intel
>     compilers, and _not_ with the CRAY or GNU compilers.
>
>       My advice, blow off "the vendor/sysadmin supplied PETSc 3.2" and
>     just build it yourself so you can get real work done instead of
>     trying to debug their mess.   I promise the vendor one is not like
>     a billion times faster or anything.
>
>
> If you want to justify this to anyone (like a funder), just run both 
> on ex5 for a large size and look at the flops on MatMult. That
> is probably your dominant cost (or your PC).
>
>    Matt
>
>
>       Barry
>
>
>
>     >
>     > I also don't get the segfault if I build PETSc 3.2-p7 myself with
>     PGI/Intel compilers.
>     >
>     > Any ideas on how to diagnose the problem? Unfortunately I cannot
>     seem to run valgrind on this particular machine.
>     >
>     > Thanks in advance.
>     >
>     > Tabrez
>     >
>     > ---
>     >
>     > stali@krakenpf1:~/meshes> which xterm
>     > /usr/bin/xterm
>     > stali@krakenpf1:~/meshes> aprun -n 1 ./defmod -f 2d_point_load_dyn_abc.inp -on_error_attach_debugger
>     > ...
>     > ...
>     > ...
>     > [0]PETSC ERROR:
>     ------------------------------------------------------------------------
>     > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation
>     Violation, probably memory access out of range
>     > [0]PETSC ERROR: Try option -start_in_debugger or
>     -on_error_attach_debugger
>     > [0]PETSC ERROR: or see
>     http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#valgrind[0]PETSC
>     ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X
>     to find memory corruption errors
>     > [0]PETSC ERROR: configure using --with-debugging=yes, recompile,
>     link, and run
>     > [0]PETSC ERROR: to get more information on the crash.
>     > [0]PETSC ERROR: User provided function() line 0 in unknown
>     directory unknown file
>     > [0]PETSC ERROR: PETSC: Attaching gdb to ./defmod of pid 32384 on
>     display localhost:20.0 on machine nid10649
>     > Unable to start debugger in xterm: No such file or directory
>     > aborting job:
>     > application called MPI_Abort(MPI_COMM_WORLD, 0) - process 0
>     > _pmii_daemon(SIGCHLD): [NID 10649] [c23-3c0s6n1] [Mon Apr  2
>     13:06:48 2012] PE 0 exit signal Aborted
>     > Application 133198 exit codes: 134
>     > Application 133198 resources: utime ~1s, stime ~0s
>
>
>
>
> -- 
> What most experimenters take for granted before they begin their 
> experiments is infinitely more interesting than any results to which 
> their experiments lead.
> -- Norbert Wiener


