[petsc-users] debugger fails
Dominik Szczerba
dominik at itis.ethz.ch
Fri Aug 19 14:16:15 CDT 2011
Many thanks for the hint. Hitting "c" never returns the prompt;
hitting Ctrl+C and then typing "where" reveals a deadlock. Thanks again!
Dominik
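
For readers following the thread, the sequence Aron describes below can be sketched as a list of gdb commands, typed in each of the windows opened by -start_in_debugger. This is only a sketch under assumptions: it uses standard gdb commands plus the PetscError breakpoint Aron suggests, and what gdb prints at each step depends on the application.

```
# Optional: stop as soon as PETSc reports an error.
break PetscError
# Leave the startup sleep in PetscAttachDebugger and let the job run.
continue
# If gdb stops (at the breakpoint or on SIGABRT), print the stack there.
where
# If the prompt never returns, press Ctrl+C in that window first,
# then print the stack to see where this rank is blocked.
where
```

Comparing the stacks from the two ranks is usually a quick way to see which calls are waiting on each other.
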
On Fri, Aug 19, 2011 at 7:28 PM, Aron Ahmadia <aron.ahmadia at kaust.edu.sa> wrote:
> The debugger stops when you start up; that is the code shown in [1]. Then you
> want to hit 'continue' so your job runs normally to where it fails. You can
> also set a breakpoint on PetscError, since PETSc is catching the error from MPI.
> When you stop at the 'second breakpoint', you'll be at the point where your
> code has detected an error condition in MPI. Type 'where' there to get
> the stack at the moment the error was detected.
> [1]
> (gdb) where
> #0 0x00007fae5b941590 in __nanosleep_nocancel () at ../sysdeps/unix/syscall-template.S:82
> #1 0x00007fae5b94143c in __sleep (seconds=0) at ../sysdeps/unix/sysv/linux/sleep.c:138
> #2 0x000000000056cc48 in PetscSleep (s=10) at psleep.c:56
> #3 0x0000000000838887 in PetscAttachDebugger () at adebug.c:410
> #4 0x00000000005590a7 in PetscOptionsCheckInitial_Private () at init.c:392
> #5 0x000000000055e40e in PetscInitialize (argc=0x7ffff403debc, args=0x7ffff403deb0, file=0x0, help=0x0) at pinit.c:639
> #6 0x0000000000524a16 in PetscSolver::InitializePetsc (argc=0x7ffff403debc, argv=0x7ffff403deb0) at /home/dsz/src/framework/trunk/solve/PetscSolver.cxx:124
> #7 0x00000000004c404f in main (argc=4, argv=0x7ffff403e4c8) at /home/dsz/src/framework/trunk/solve/cd3t10mpi_main.cxx:526
> (gdb)
>
>
>
> On Fri, Aug 19, 2011 at 8:22 PM, Dominik Szczerba <dominik at itis.ethz.ch>
> wrote:
>>
>> What do you mean by "the second break"?
>>
>> Dominik
>>
>> On Fri, Aug 19, 2011 at 6:47 PM, Aron Ahmadia <aron.ahmadia at kaust.edu.sa>
>> wrote:
>> > You want to do a 'where' on the second break, when your program is raising
>> > an abort signal...
>> > A
>> >
>> > On Fri, Aug 19, 2011 at 6:57 PM, Dominik Szczerba <dominik at itis.ethz.ch>
>> > wrote:
>> >>
>> >> (gdb) where
>> >> #0 0x00007fae5b941590 in __nanosleep_nocancel () at ../sysdeps/unix/syscall-template.S:82
>> >> #1 0x00007fae5b94143c in __sleep (seconds=0) at ../sysdeps/unix/sysv/linux/sleep.c:138
>> >> #2 0x000000000056cc48 in PetscSleep (s=10) at psleep.c:56
>> >> #3 0x0000000000838887 in PetscAttachDebugger () at adebug.c:410
>> >> #4 0x00000000005590a7 in PetscOptionsCheckInitial_Private () at init.c:392
>> >> #5 0x000000000055e40e in PetscInitialize (argc=0x7ffff403debc, args=0x7ffff403deb0, file=0x0, help=0x0) at pinit.c:639
>> >> #6 0x0000000000524a16 in PetscSolver::InitializePetsc (argc=0x7ffff403debc, argv=0x7ffff403deb0) at /home/dsz/src/framework/trunk/solve/PetscSolver.cxx:124
>> >> #7 0x00000000004c404f in main (argc=4, argv=0x7ffff403e4c8) at /home/dsz/src/framework/trunk/solve/cd3t10mpi_main.cxx:526
>> >> (gdb)
>> >>
>> >> PetscSolver.cxx:124:
>> >>
>> >> ierr = PetscInitialize(argc, argv, (char *)0, (char *)0);
>> >> CHKERRQ(ierr);
>> >>
>> >> Hmmm, not very helpful.....
>> >>
>> >> The app runs on one cpu, but silently crashes on two.
>> >>
>> >> Any hints are much appreciated.
>> >>
>> >> Dominik
>> >>
>> >>
>> >>
>> >> On Fri, Aug 19, 2011 at 5:49 PM, Satish Balay <balay at mcs.anl.gov>
>> >> wrote:
>> >> > On Fri, 19 Aug 2011, Dominik Szczerba wrote:
>> >> >
>> >> >> Hi,
>> >> >>
>> >> >> I am starting my app in the debugger as:
>> >> >>
>> >> >> mpiexec -np 2 sm3t4mpi run.xml -start_in_debugger -display :0.0
>> >> >>
>> >> >> In the console I get:
>> >> >>
>> >> >> [1]PETSC ERROR: MPI error 14
>> >> >>
>> >> >> in the two open terminals with gdb I get:
>> >> >>
>> >> >> 0x00007f2ecdd15590 in __nanosleep_nocancel () at ../sysdeps/unix/syscall-template.S:82
>> >> >> 82 ../sysdeps/unix/syscall-template.S: No such file or directory.
>> >> >> in ../sysdeps/unix/syscall-template.S
>> >> >> (gdb)
>> >> >>
>> >> >>
>> >> >> I type 'c' nonetheless and see:
>> >> >>
>> >> >> (gdb) c
>> >> >> Continuing.
>> >> >> [New Thread 0x7f268e975700 (LWP 22388)]
>> >> >>
>> >> >> Program received signal SIGABRT, Aborted.
>> >> >> 0x00007f268f421d05 in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
>> >> >> 64 ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
>> >> >> in ../nptl/sysdeps/unix/sysv/linux/raise.c
>> >> >>
>> >> >>
>> >> >>
>> >> >> How do I go on debugging?
>> >> >
>> >> > what do you get for:
>> >> >
>> >> > (gdb) where
>> >> >
>> >> > Satish
>> >> >
>> >> >
>> >> >>
>> >> >> Many thanks for any hints,
>> >> >>
>> >> >> Dominik
>> >> >>
>> >> >
>> >> >
>> >
>> >
>
>