[MPICH] Tracing the mpich library with gdb and xterm
Krishna Chaitanya
kris.c1986 at gmail.com
Thu Jan 3 13:22:09 CST 2008
Hi Darius,
Thanks a lot for the clarification. It did help :).
Krishna Chaitanya K
On Jan 3, 2008 11:39 AM, Darius Buntinas <buntinas at mcs.anl.gov> wrote:
>
> It's easiest if you run both processes on the same machine, then the
> DISPLAY values will be correct.
>
> But if you need to use two machines, there are some tricks.
>
> Using ssh (this is secure, and it's the way I do it): first, some
> background info. Ssh can be configured to forward X traffic from a remote
> machine back to your display. When you ssh into another machine you'll see
> that DISPLAY is set to something like "localhost:10.0". Now any process
> (that's owned by you) on the remote machine can display an X window on
> your local display by sending it to "localhost:10.0". Another thing to
> notice is that every new ssh session (whether it's your ssh session or
> someone else's) to that node gets a different value for DISPLAY (e.g.,
> "localhost:11.0").
>
> So what I do is open one ssh session to each of the remote machines my
> jobs will run on. This sets up the X forwarding, and you need to keep
> these open as long as you want X to be forwarded. Now, read the values
> of DISPLAY from each ssh session. If they're the same, say
> localhost:10.0, it's easy:
>
> mpiexec -n 2 -env DISPLAY localhost:10.0 xterm -e gdb ./cpi
>
> You should see two xterms open up, one from each remote machine, with
> gdb running.
>
> Now, if DISPLAY is not the same on both, then you'll have to set DISPLAY
> differently for each process:
>
> mpiexec -n 1 -env DISPLAY localhost:10.0 xterm -e gdb ./cpi : \
> -n 1 -env DISPLAY localhost:11.0 xterm -e gdb ./cpi
>
> (Notice the colon (:) and the escaped linebreak.) The trick is to
> figure out which rank runs on which machine, so you use the right
> DISPLAY value on the right machine. Of course, with two processes you
> can try it one way, and if it doesn't work, flip the DISPLAY values and
> try again. (Alternatively, you can check which rank runs on which
> machine like this: "mpiexec -l -n 2 hostname".)
>
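The rank-to-DISPLAY matching described above can be sketched as a pair of small shell functions. This is only an illustration, not part of MPICH. It assumes that "mpiexec -l -n N hostname" labels each line as "RANK: HOST" (some launchers print "[RANK] HOST", so both are accepted), and that ranks land on the same hosts in the real run as in the hostname run. The function names and the HOST=DISPLAY argument format are made up for this example.

```shell
# Sketch only: helper names and the HOST=DISPLAY argument format are
# hypothetical, not part of MPICH.

# Read labelled "mpiexec -l ... hostname" output on stdin and print
# "RANK HOST" lines sorted by rank. Accepts "0: host" and "[0] host".
parse_rank_hosts() {
    sed -E 's/^\[?([0-9]+)[]:]+[[:space:]]*/\1 /' | sort -n
}

# Print the DISPLAY recorded for a host in a list of HOST=DISPLAY pairs.
display_for_host() {
    local host="$1"
    shift
    local pair
    for pair in "$@"; do
        if [ "${pair%%=*}" = "$host" ]; then
            printf '%s\n' "${pair#*=}"
            return 0
        fi
    done
    return 1
}

# Read "RANK HOST" lines (in rank order) on stdin and print one mpiexec
# command with a colon-separated section per rank, each section carrying
# the DISPLAY of the host that rank runs on.
# Arguments: the program to debug, then HOST=DISPLAY pairs.
build_mpiexec_cmd() {
    local prog="$1"
    shift
    local cmd="mpiexec"
    local sep=""
    local rank host disp
    while read -r rank host; do
        disp=$(display_for_host "$host" "$@") || {
            echo "no DISPLAY known for host $host" >&2
            return 1
        }
        cmd="$cmd$sep -n 1 -env DISPLAY $disp xterm -e gdb $prog"
        sep=" :"
    done
    printf '%s\n' "$cmd"
}
```

With the hypothetical hosts nodeA and nodeB, something like

    mpiexec -l -n 2 hostname | parse_rank_hosts | \
        build_mpiexec_cmd ./cpi nodeA=localhost:10.0 nodeB=localhost:11.0

prints a command of the same shape as the two-DISPLAY example above, which you can inspect before running it.
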
> Note that you can run any X program this way. I generally use ddd as
> the debugger instead of "xterm -e gdb".
>
> I hope this clarified more than it confused.
>
> -d
>
> On 01/03/2008 04:03 AM, Krishna Chaitanya wrote:
> > Hi,
> > Thanks for the help. I guess I complicated my question unnecessarily.
> > I wish to run a program on two machines and have two debug
> > windows on my local machine, so that I can trace through the pt2pt code.
> > This must concern xterm and setting the display variable correctly. At
> > this stage, I have the DISPLAY set to 0.0 on both the machines and I am
> > ssh-ing into the remote machine by using the -X switch. The debug window
> > for the remote machine is getting launched at the remote terminal but is
> > not getting displayed on mine. Please let me know what needs to be done
> > to have the window displayed on my machine.
> >
> > Thanks,
> > Krishna Chaitanya K
> >
> >
> >
> >
> > On 1/2/08, Darius Buntinas <buntinas at mcs.anl.gov> wrote:
> >
> >
> > I'm not sure exactly what you want to do, so here are a few ideas.
> >
> > If you want to start rank 0 before rank 1, you can change your test
> > program so it calls MPI_Comm_rank right after MPI_Init. Then set a
> > breakpoint just after MPI_Comm_rank, and when you hit the breakpoint, you
> > can look at the rank and decide which one to continue running.
> >
> > If you can't modify the test program, you can set the breakpoint just
> > after MPI_Init, then read MPIDI_Process.my_pg_rank. You'll have to
> > configure MPICH2 with --enable-g=dbg to get debugging symbols added (you
> > might also want to configure with --disable-compiler-optimizations to
> > remove the -O2 flag and make it easier to step through the code).
> >
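The breakpoint-after-MPI_Init idea could be put into a gdb command file along these lines. This is a sketch under assumptions: the function name and the file name rank.gdb are arbitrary, the library is assumed to have been built with --enable-g=dbg so that MPIDI_Process.my_pg_rank is visible, and "set breakpoint pending on" is there only in case MPI_Init lives in a shared library that is not loaded yet.

```shell
# Sketch only: write_rank_gdb_cmds is a hypothetical helper that writes a
# gdb command file to the path given as its first argument.
# The commands stop at MPI_Init, let it run to completion with "finish",
# and then print the rank that MPICH2 stored during initialization.
write_rank_gdb_cmds() {
    cat > "$1" <<'EOF'
set breakpoint pending on
break MPI_Init
run
finish
print MPIDI_Process.my_pg_rank
EOF
}
```

After "write_rank_gdb_cmds rank.gdb", each debugger can be started with "xterm -e gdb -x rank.gdb ./cpi" in place of "xterm -e gdb ./cpi". Keep in mind that "finish" only returns once every rank has entered MPI_Init.
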
> > If you really need to know the rank of a process before MPI_Init is
> > called, you can read the PMI_RANK environment variable. Note that
> > MPI_Init performs an implicit barrier: if one process calls MPI_Init,
> > it won't return until all other processes have called MPI_Init. So even
> > if you start one process before the others, it won't get past MPI_Init.
> >
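One way to use PMI_RANK before MPI_Init is a small wrapper that decides per process whether to start it under the debugger. The sketch below only prints the command it would run, so it can be checked safely. DEBUG_RANK and the function name are hypothetical, and it assumes the process manager exports PMI_RANK to every process it starts.

```shell
# Sketch only: launch_cmd_for_rank and DEBUG_RANK are hypothetical names.
# Prints the command line for this process: the rank selected by
# DEBUG_RANK (default 0) is wrapped in "xterm -e gdb", every other rank
# runs the program directly.
launch_cmd_for_rank() {
    if [ "${PMI_RANK:-}" = "${DEBUG_RANK:-0}" ]; then
        printf '%s\n' "xterm -e gdb $*"
    else
        printf '%s\n' "$*"
    fi
}
```

In a real wrapper script the printed command would be executed instead of echoed. Because of the implicit barrier, the ranks that run directly will simply wait inside MPI_Init until the rank under gdb gets there.
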
> > Hope that helps,
> > -d
> >
> > On 12/31/2007 04:07 AM, Krishna Chaitanya wrote:
> > > Hi,
> > > I have been tracing the flow of the mpich code by
> > > executing a simple program having MPI_Send() and MPI_Recv(), on one
> > > machine. I have been using gdb along with xterm to have two windows
> > > open at the same time as I step through the code. I wish to get a better
> > > glimpse of the working of the point to point calls, by launching the job
> > > on two machines and by tracing the flow in a similar manner. Could you
> > > please tell me how I can go about it? I have a feeling that this can be
> > > done by hard-coding the macros and telling the daemons directly that one
> > > machine is rank1 and the other is rank2. That way, I can start the rank1
> > > process first and the rank2 process a little later and trace through the
> > > code.
> > >
> > > Thanks,
> > > Krishna Chaitanya,
> > > Dept of Information Technology,
> > > National Institute of Technology, Karnataka ( NITK )
> > > India
> > >
> > > --
> > > In the middle of difficulty, lies opportunity
> >
> >
> >
> >
> > --
> > In the middle of difficulty, lies opportunity
>
--
In the middle of difficulty, lies opportunity