[MPICH2-dev] mpiexec with gdb

Ralph Butler rbutler at mtsu.edu
Tue Oct 10 20:07:13 CDT 2006


On Tue Oct 10, at 4:48PM, Florin Isaila wrote:

> Hi,
>
> I am having a problem running mpiexec with gdb. I set a breakpoint
> at a program line, but the program does not stop there if an
> error occurs (otherwise it stops normally). The error can be a
> segmentation fault or a call to MPI_Abort.
>
> This makes debugging impossible. Is the old style of starting each  
> mpi process in a separate debugging session possible?

I tried running the program from your output the same way you show,
and have included my output below.
However, many folks prefer to use ddd like this:
     mpiexec -n 2 ddd mpi_pgm

This will launch 2 ddd windows on the desktop, each running mpi_pgm.
It's pretty easy to manage around 4 processes this way.
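
If opening a debugger window per rank is not practical, another widely
used trick is to have each process print its pid and wait until a
debugger is attached by hand (gdb -p <pid>). Below is a minimal sketch,
not anything MPICH2 provides: the helper name wait_for_debugger and the
environment variable WAIT_FOR_GDB are made up for illustration.

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of MPICH2): if the illustrative
 * environment variable WAIT_FOR_GDB is set, print this process's pid
 * and spin until a debugger changes `attached` to a nonzero value.
 * Returns 1 if it waited, 0 if the wait was skipped.
 */
int wait_for_debugger(void)
{
    /* volatile so the compiler re-reads the flag on every iteration */
    volatile int attached = 0;

    if (getenv("WAIT_FOR_GDB") == NULL)
        return 0;               /* normal run: do not block */

    fprintf(stderr, "pid %d waiting for debugger\n", (int)getpid());
    while (!attached)
        sleep(1);               /* in gdb: select this frame (e.g. `up`),
                                   then `set var attached = 1` */
    return 1;
}
```

In an MPI program the call would go right after MPI_Init(&argc, &argv),
so every rank pauses before reaching the code under test. After
attaching, set any breakpoints, release the loop with
"set var attached = 1", and continue.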

> While merging the output of several debuggers is helpful in some  
> cases, controlling each independent process is sometimes very  
> important.
>
> Here is the simplest example, with a forced segmentation fault. The
> breakpoint at line 229 is ignored, even though the segmentation
> fault occurs after it. gdb also quits without making the source of
> the error clear.
>
> stallion:~/tests/mpi/dtype % mpiexec -gdb -n 1 test
> 0:  (gdb) l 204
>
> 0:  204 void test_dt() {
> 0:  205   int *i = 0;
> 0:  206   *i = 1;
> 0:  209}
>
> 0:  (gdb) l 227
> 0:  227 int main(int argc, char* argv[]) {
> 0:  228   MPI_Init(&argc, &argv);
> 0:  229   test_dt();
> 0:  230   MPI_Finalize();
> 0:  231   return 0;
> 0:  232 }
>
> 0:  (gdb) b 229
> 0:  Breakpoint 2 at 0x8049f79: file test.c, line 229.
> 0:  (gdb) r
>  rank 0 in job 72  stallion.ece.northwestern.edu_42447   caused  
> collective abort of all ranks
>   exit status of rank 0: killed by signal 9
>
> Many thanks
> Florin

My run of the program:

(magpie:52) % mpiexec -gdb -n 1 temp
0:  (gdb) l
0:  1   void test_dt() {
0:  2       int *i = 0;
0:  3       *i = 1;
0:  4   }
0:  5
0:  6   int main(int argc, char* argv[]) {
0:  7       MPI_Init(&argc, &argv);
0:  8       test_dt();
0:  9       MPI_Finalize();
0:  10      return 0;
0:  (gdb) b 8
0:  Breakpoint 2 at 0x80495fe: file temp.c, line 8.
0:  (gdb) r
0:  Continuing.
0:
0:  Breakpoint 2, main (argc=1, argv=0xbffff3b4) at temp.c:8
0:  8       test_dt();
0:  (gdb) 0:  (gdb) s
0:  test_dt () at temp.c:2
0:  2       int *i = 0;
0:  (gdb) s
0:  3       *i = 1;
0:  (gdb) p *i
0:  Cannot access memory at address 0x0
0:  (gdb) p i
0:  $1 = (int *) 0x0
0:  (gdb) c
0:  Continuing.
0:
0:  Program received signal SIGSEGV, Segmentation fault.
0:  0x080495d4 in test_dt () at temp.c:3
0:  3       *i = 1;
0:  (gdb) where
0:  #0  0x080495d4 in test_dt () at temp.c:3
0:  #1  0x08049603 in main (argc=1, argv=0xbffff3b4) at temp.c:8
0:  (gdb) q
rank 0 in job 2  magpie_42682   caused collective abort of all ranks
   exit status of rank 0: killed by signal 9
(magpie:53) %



