[petsc-users] Debugger question

Matthew Knepley knepley at gmail.com
Mon Apr 2 22:12:03 CDT 2012


On Mon, Apr 2, 2012 at 9:52 PM, Tabrez Ali <stali at geology.wisc.edu> wrote:

>  Matt/Barry
>
> My intention was to make sure that the code is bug free and since PETSc
> was pre-installed on the cluster with various compilers it was easier to
> test quickly rather than build all combinations myself. Performance is of
> absolutely no concern.
>
> Things were working fine with 3.1 but recently the OS (Cray Linux Env) was
> upgraded and so was PETSc (to 3.2).
>
> Matt
>
> I am attaching entire output.
>

> Unable to start debugger in xterm: No such file or directory
> aborting job:

xterm is not in the path.

   Matt
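
One possible reading of the transcript below: `which xterm` was typed at the
krakenpf2 prompt, while the job itself ran under aprun on nid03538, so the
PATH (or the filesystem) seen by the crashing process may not be the one that
was checked. A minimal sketch of a check that can be run in either
environment follows. The helper name `have_tool` is hypothetical, and the
aprun line in the trailing comment is an untested assumption about how it
would be launched on the compute node.

```shell
#!/bin/sh
# Report whether a program can be resolved through PATH in the environment
# where this function runs. Returns non-zero if it cannot.
have_tool() {
    tool_path=$(command -v "$1" 2>/dev/null) || tool_path=""
    if [ -n "$tool_path" ]; then
        echo "$1: found at $tool_path"
    else
        echo "$1: not found in PATH"
        return 1
    fi
}

# Check the shell this script runs in (the login node, if typed there).
have_tool xterm || true
have_tool gdb || true

# Hypothetical compute-node check, reusing the launcher from the transcript:
#   aprun -n 1 sh -c 'command -v xterm; command -v gdb'
```

If the two environments disagree, that would be consistent with the
"No such file or directory" message even though xterm is installed on the
login node.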


> Tabrez
>
> ---
>
> stali at krakenpf2:~/meshes> which xterm
> /usr/bin/xterm
> stali at krakenpf2:~/meshes> aprun -n 1 ./defmod -f
> 2d_point_load_dyn_abc.inp -on_error_attach_debugger
>  Reading input ...
>  Reading mesh data ...
>  Forming [K] ...
>  Forming [M] & [M]^-1 ...
>  Applying constraints ...
>  Forming RHS ...
>  Setting up solver ...
>  Solving ...
>   Time Step 0
> [0]PETSC ERROR:
> ------------------------------------------------------------------------
> [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation,
> probably memory access out of range
> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
> [0]PETSC ERROR: or see
> http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#valgrind
> [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find
> memory corruption errors
> [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and
> run
> [0]PETSC ERROR: to get more information on the crash.
> [0]PETSC ERROR: User provided function() line 0 in unknown directory
> unknown file
> [0]PETSC ERROR: PETSC: Attaching gdb to ./defmod of pid 26164 on display
> :0.0 on machine nid03538
> Unable to start debugger in xterm: No such file or directory
> aborting job:
> application called MPI_Abort(MPI_COMM_WORLD, 0) - process 0
> _pmii_daemon(SIGCHLD): [NID 03538] [c12-3c2s4n2] [Mon Apr  2 22:50:09
> 2012] PE 0 exit signal Aborted
> Application 134950 exit codes: 134
> Application 134950 resources: utime ~1s, stime ~0s
>
> On 04/02/2012 09:04 PM, Matthew Knepley wrote:
>
> On Mon, Apr 2, 2012 at 8:57 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
>
>>
>> On Apr 2, 2012, at 8:10 PM, Tabrez Ali wrote:
>>
>> > Hello
>> >
>> > I am trying to debug a program using the switch
>> '-on_error_attach_debugger', but the vendor/sysadmin-built PETSc 3.2.00 is
>> unable to start the debugger in xterm (see text below), even though xterm
>> is installed. What am I doing wrong?
>> >
>> > Btw the segfault happens during a call to MatMult, but only with the
>> vendor/sysadmin-supplied PETSc 3.2 built with the PGI and Intel compilers,
>> _not_ with the CRAY or GNU compilers.
>>
>>    My advice, blow off "the vendor/sysadmin supplied PETSc 3.2" and just
>> build it yourself so you can get real work done instead of trying to debug
>> their mess.   I promise the vendor one is not like a billion times faster
>> or anything.
>
>
>  If you want to justify this to anyone (like a funder), just run both on
> ex5 for a large size and look at the flops on MatMult. MatMult is
> probably your dominant cost (or else your preconditioner, PC, is).
>
>     Matt
>
>
>>
>>   Barry
>>
>>
>>
>> >
>> > I also don't get the segfault if I build PETSc 3.2-p7 myself with
>> PGI/Intel compilers.
>> >
>> > Any ideas on how to diagnose the problem? Unfortunately I cannot seem
>> to run valgrind on this particular machine.
>> >
>> > Thanks in advance.
>> >
>> > Tabrez
>> >
>> > ---
>> >
>> > stali at krakenpf1:~/meshes> which xterm
>> > /usr/bin/xterm
>> > stali at krakenpf1:~/meshes> aprun -n 1 ./defmod -f
>> 2d_point_load_dyn_abc.inp -on_error_attach_debugger
>> > ...
>> > ...
>> > ...
>> > [0]PETSC ERROR:
>> ------------------------------------------------------------------------
>> > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation,
>> probably memory access out of range
>> > [0]PETSC ERROR: Try option -start_in_debugger or
>> -on_error_attach_debugger
>> > [0]PETSC ERROR: or see
>> http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#valgrind
>> > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X
>> to find memory corruption errors
>> > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link,
>> and run
>> > [0]PETSC ERROR: to get more information on the crash.
>> > [0]PETSC ERROR: User provided function() line 0 in unknown directory
>> unknown file
>> > [0]PETSC ERROR: PETSC: Attaching gdb to ./defmod of pid 32384 on
>> display localhost:20.0 on machine nid10649
>> > Unable to start debugger in xterm: No such file or directory
>> > aborting job:
>> > application called MPI_Abort(MPI_COMM_WORLD, 0) - process 0
>> > _pmii_daemon(SIGCHLD): [NID 10649] [c23-3c0s6n1] [Mon Apr  2 13:06:48
>> 2012] PE 0 exit signal Aborted
>> > Application 133198 exit codes: 134
>> > Application 133198 resources: utime ~1s, stime ~0s
>>
>>
>
>
>  --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
>
>


-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener