[MPICH] Tracing the mpich library with gdb and xterm

Krishna Chaitanya kris.c1986 at gmail.com
Thu Jan 3 13:22:09 CST 2008


Hi Darius,
Thanks a lot for the clarification. It did help :).
Krishna Chaitanya K

On Jan 3, 2008 11:39 AM, Darius Buntinas <buntinas at mcs.anl.gov> wrote:

>
> It's easiest if you run both processes on the same machine, then the
> DISPLAY values will be correct.
>
> But if you need to use two machines, there are some tricks.
>
> The secure way, and the way I do it, is to use ssh.  First some
> background info.  Ssh can be configured to forward X traffic from a
> remote machine back to your display.  When you ssh into another machine
> you'll see that DISPLAY is set to something like "localhost:10.0".  Now
> any process (that's owned by you) on the remote machine can display an
> X window on your local display by sending it to "localhost:10.0".
> Another thing to notice is that every new ssh session (whether it's
> your ssh session or someone else's) to that node gets a different value
> for DISPLAY (e.g., "localhost:11.0").
>
> So what I do is open one ssh session to each of the remote machines my
> jobs will run on.  This sets up the X forwarding, and you need to keep
> these open as long as you want X to be forwarded.  Now, read the values
> of DISPLAY from each ssh session.  If they're the same, say
> localhost:10.0, it's easy:
>
>   mpiexec -n 2 -env DISPLAY localhost:10.0 xterm -e gdb ./cpi
>
> You should see two xterms open up, one from each remote machine, with
> gdb running.
>
> Now, if DISPLAY is not the same on both, then you'll have to set DISPLAY
> differently for each process:
>
>   mpiexec -n 1 -env DISPLAY localhost:10.0 xterm -e gdb ./cpi : \
>           -n 1 -env DISPLAY localhost:11.0 xterm -e gdb ./cpi
>
> (Notice the colon (:) and the escaped line break.)  The trick is to
> figure out which rank runs on which machine, so you use the right
> DISPLAY value on the right machine.  Of course with two processes, you
> can try it one way, and if it doesn't work, flip the DISPLAY values and
> try again.  (Alternatively, you can check which rank runs on which
> machine like this:  "mpiexec -l -n 2 hostname".)
>
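Once the DISPLAY value for each rank is known, the per-rank command line described above can be assembled mechanically. What follows is a minimal sketch, assuming bash; build_mpiexec_cmd is a hypothetical helper name, not part of MPICH. It only prints the command, so the result can be checked before it is run.

```shell
#!/bin/bash
# build_mpiexec_cmd is a hypothetical helper, not part of MPICH.
# Arguments: the program to debug, then one DISPLAY value per rank,
# in rank order. It prints an mpiexec command with one colon-separated
# segment per rank; it does not run anything.
build_mpiexec_cmd() {
    local prog=$1
    shift
    local cmd="mpiexec"
    local sep=""
    local disp
    for disp in "$@"; do
        cmd="$cmd$sep -n 1 -env DISPLAY $disp xterm -e gdb $prog"
        sep=" :"
    done
    printf '%s\n' "$cmd"
}

# Example with the two DISPLAY values used in the message above.
build_mpiexec_cmd ./cpi localhost:10.0 localhost:11.0
```

If the windows do not appear, swapping the order of the DISPLAY arguments corresponds to the "flip the DISPLAY values" step above.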
> Note that you can run any X program this way.  I generally use ddd as
> the debugger instead of "xterm -e gdb".
>
> I hope this clarified more than it confused.
>
> -d
>
> On 01/03/2008 04:03 AM, Krishna Chaitanya wrote:
> > Hi,
> > Thanks for the help. I guess I complicated my question unnecessarily.
> > I wish to run a program on two machines and have two debug windows on
> > my local machine, so that I can trace through the pt2pt code. This must
> > concern xterm and setting the DISPLAY variable correctly. At this
> > stage, I have DISPLAY set to 0.0 on both machines and I am ssh-ing into
> > the remote machine with the -X switch. The debug window for the remote
> > machine is launched on the remote terminal but is not displayed on
> > mine. Please let me know what needs to be done to have the window
> > displayed on my machine.
> >
> > Thanks,
> > Krishna Chaitanya K
> >
> >
> >
> >
> > On 1/2/08, Darius Buntinas <buntinas at mcs.anl.gov> wrote:
> >
> >
> >     I'm not sure exactly what you want to do, so here are a few ideas.
> >
> >     If you want to start rank 0 before rank 1, you can change your test
> >     program so it calls MPI_Comm_rank right after MPI_Init.  Then set a
> >     breakpoint just after MPI_Comm_rank, and when you hit the
> >     breakpoint, you can look at the rank and decide which one to
> >     continue running.
> >
> >     If you can't modify the test program, you can set the breakpoint
> >     just after MPI_Init, then read MPIDI_Process.my_pg_rank.  You'll
> >     have to configure MPICH2 with --enable-g=dbg to get debugging
> >     symbols added (you might also want to configure with
> >     --disable-compiler-optimizations to remove the -O2 flag and make it
> >     easier to step through the code).
> >
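The breakpoint-and-read step described above could be scripted with a gdb command file. This is a rough sketch, assuming bash, an MPICH2 build configured with --enable-g=dbg as described above, and that a breakpoint on MPI_Init resolves in your build. write_gdb_script and the file name rank.gdb are hypothetical names chosen for illustration.

```shell
#!/bin/bash
# write_gdb_script is a hypothetical helper; the output file name is an
# assumption. It writes a gdb command file that stops at MPI_Init, lets
# that call return ("finish"), and then prints the rank field named above.
write_gdb_script() {
    local out=$1
    cat > "$out" <<'EOF'
break MPI_Init
run
finish
print MPIDI_Process.my_pg_rank
EOF
}

write_gdb_script rank.gdb
cat rank.gdb
```

The file could then be passed to gdb with its -x option, for example "xterm -e gdb -x rank.gdb ./cpi" in place of "xterm -e gdb ./cpi".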
> >     If you really need to know the rank of a process before MPI_Init is
> >     called, you can read the PMI_RANK environment variable.  Note that
> >     MPI_Init performs an implicit barrier: if one process calls
> >     MPI_Init, the call won't return until all other processes have
> >     called MPI_Init.  So even if you start one process before the
> >     others, it won't get past MPI_Init.
> >
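One way to use PMI_RANK before MPI_Init is a small wrapper that attaches the debugger to a single rank only. The sketch below assumes bash; choose_launch is a hypothetical helper name, and it prints the command it would run rather than running it.

```shell
#!/bin/bash
# choose_launch is a hypothetical helper, not part of MPICH.
# It prints the command a wrapper script would run for this process:
# the program under "xterm -e gdb" when PMI_RANK equals the rank to
# debug, and the bare program otherwise.
choose_launch() {
    local debug_rank=$1
    local prog=$2
    # PMI_RANK is read from the environment; a missing value is treated
    # as "not the rank to debug".
    if [ "${PMI_RANK:-}" = "$debug_rank" ]; then
        printf 'xterm -e gdb %s\n' "$prog"
    else
        printf '%s\n' "$prog"
    fi
}

# Example: what rank 0 and rank 1 would run when only rank 0 is debugged.
PMI_RANK=0 choose_launch 0 ./cpi
PMI_RANK=1 choose_launch 0 ./cpi
```

Because of the barrier noted above, the ranks that run without a debugger still wait inside MPI_Init until the debugged rank reaches it.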
> >     Hope that helps,
> >     -d
> >
> >     On 12/31/2007 04:07 AM, Krishna Chaitanya wrote:
> >      > Hi,
> >      > I have been tracing the flow of the mpich code by executing a
> >      > simple program that calls MPI_Send() and MPI_Recv() on one
> >      > machine.  I have been using gdb along with xterm to have two
> >      > windows open at the same time as I step through the code. I wish
> >      > to get a better glimpse of how the point-to-point calls work by
> >      > launching the job on two machines and tracing the flow in a
> >      > similar manner. Could you please tell me how I can go about it? I
> >      > have a feeling that this can be done by hard-coding the macros and
> >      > telling the daemons directly that one machine is rank1 and the
> >      > other is rank2. That way, I can start the rank1 process first and
> >      > the rank2 process a little later and trace through the code.
> >      >
> >      > Thanks,
> >      > Krishna Chaitanya,
> >      > Dept of Information Technology,
> >      > National Institute of Technology, Karnataka ( NITK )
> >      > India
> >      >
> >      > --
> >      > In the middle of difficulty, lies opportunity
> >
> >
> >
> >
> > --
> > In the middle of difficulty, lies opportunity
>



-- 
In the middle of difficulty, lies opportunity


More information about the mpich-discuss mailing list