[petsc-users] Debugger question

Satish Balay balay at mcs.anl.gov
Tue Apr 3 10:02:47 CDT 2012


On Tue, 3 Apr 2012, Anton Popov wrote:

> I support 100% what Barry said. Just get the work done. Cray and IBM Linux
> systems do not support ALL the system calls that PETSc uses, so it is always
> a problem to have to edit petscconf.h by hand between "configure" and
> "make" on their machines. I wonder how you could install any PETSc without
> modifying petscconf.h.

You shouldn't have to manually modify petscconf.h on these machines. There
could still be some warnings at link time - but that shouldn't mean breakages
at runtime.

> If you just don't care, you usually get segfaults right
> at the PetscInitialize() step.

If this is the case - then it should be verifiable with PETSc examples.

Satish

> In practice this means there is no way you can debug anything; they
> should reinstall PETSc, keeping in mind the exact list of system calls
> they support and PETSc's requirements.
> 
> By the way, the times when GNU compilers were an "order of magnitude" slower than
> "vendor compilers" are long gone. Just give it a try: compile some
> simple computationally intensive code with gcc and something from "vendor"
> with aggressive optimization, and check execution time on a large data set.
> I'm sure you'll be surprised.
> 
> Cheers,
> Anton
> 
> On 4/3/12 3:57 AM, Barry Smith wrote:
> > On Apr 2, 2012, at 8:10 PM, Tabrez Ali wrote:
> > 
> > > Hello
> > > 
> > > I am trying to debug a program using the switch
> > > '-on_error_attach_debugger', but the vendor/sysadmin-built PETSc 3.2.00 is
> > > unable to start the debugger in xterm (see text below), even though xterm is
> > > installed. What am I doing wrong?
> > > 
> > > Btw the segfault happens during a call to MatMult, but only with the
> > > vendor/sysadmin-supplied PETSc 3.2 built with PGI and Intel compilers, and
> > > _not_ with CRAY or GNU compilers.
> > >     My advice, blow off "the vendor/sysadmin supplied PETSc 3.2" and just
> > > build it yourself so you can get real work done instead of trying to debug
> > > their mess.   I promise the vendor one is not like a billion times faster or
> > > anything.
> > 
> >     Barry
> > 
> > 
> > 
> > > > I also don't get the segfault if I build PETSc 3.2-p7 myself with PGI/Intel
> > > > compilers.
> > > 
> > > Any ideas on how to diagnose the problem? Unfortunately I cannot seem to
> > > run valgrind on this particular machine.
> > > 
> > > Thanks in advance.
> > > 
> > > Tabrez
> > > 
> > > ---
> > > 
> > > > stali@krakenpf1:~/meshes>  which xterm
> > > > /usr/bin/xterm
> > > > stali@krakenpf1:~/meshes>  aprun -n 1 ./defmod -f
> > > > 2d_point_load_dyn_abc.inp -on_error_attach_debugger
> > > ...
> > > ...
> > > ...
> > > [0]PETSC ERROR:
> > > ------------------------------------------------------------------------
> > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation,
> > > probably memory access out of range
> > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
> > > [0]PETSC ERROR: or see
> > > http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#valgrind[0]PETSC
> > > ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find
> > > memory corruption errors
> > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and
> > > run
> > > [0]PETSC ERROR: to get more information on the crash.
> > > [0]PETSC ERROR: User provided function() line 0 in unknown directory
> > > unknown file
> > > [0]PETSC ERROR: PETSC: Attaching gdb to ./defmod of pid 32384 on display
> > > localhost:20.0 on machine nid10649
> > > Unable to start debugger in xterm: No such file or directory
> > > aborting job:
> > > application called MPI_Abort(MPI_COMM_WORLD, 0) - process 0
> > > _pmii_daemon(SIGCHLD): [NID 10649] [c23-3c0s6n1] [Mon Apr  2 13:06:48
> > > 2012] PE 0 exit signal Aborted
> > > Application 133198 exit codes: 134
> > > Application 133198 resources: utime ~1s, stime ~0s