[mpich-discuss] Trouble in getting the logging interface to work

Krishna Chaitanya kris.c1986 at gmail.com
Tue Mar 25 13:42:53 CDT 2008


> The 2 MPI_Init definitions are defined in 2 different libraries as
> you have found.

True. But when the MPI application is compiled, both libraries are on
the link line, in the order "-lmpe -lmpich". Why is a conflict between
the two definitions of MPI_Init not reported?
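
One plausible explanation, offered as a sketch and not as a description
of the actual MPICH sources: a library can define the public name as a
weak symbol and keep the real implementation under a shifted name
(PMPI_Init in the MPI profiling interface). A strong definition of the
public name in another library, such as the MPE wrapper, then takes
precedence without a "multiple definition" error. The names demo_init
and pdemo_init below are hypothetical stand-ins, and
__attribute__((weak)) is a GCC/Clang extension, so this is only an
illustration under those assumptions.

```c
/* Hypothetical names throughout: "demo_init" stands in for MPI_Init and
 * "pdemo_init" for PMPI_Init. None of these are MPICH symbols. */

static int real_calls = 0;   /* counts entries into the real implementation */

/* The real implementation, always reachable under the shifted name. */
int pdemo_init(void)
{
    real_calls++;
    return 0;
}

/* Weak definition of the public name (GCC/Clang extension). If another
 * object file or library supplies a strong demo_init, the linker uses that
 * one and reports no conflict; otherwise this forwarding version is used. */
__attribute__((weak)) int demo_init(void)
{
    return pdemo_init();
}

/* Accessor so a caller can observe that the real routine ran. */
int demo_real_calls(void)
{
    return real_calls;
}
```

With nothing else linked in, demo_init() forwards to pdemo_init(). A
symbol that is strong in both libraries, by contrast, cannot be
overridden this way.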

I am using PERUSE to profile the MPICH library. At the moment,
PERUSE_TRACE_COMM_EVENT is defined in libmpich.a, and it only
generates text output. I would like to display the PERUSE events
graphically, alongside the other MPI events.

Looking at the way the MPE library has been built, I understand that I
need to have PERUSE_TRACE_COMM_EVENT defined in liblmpe.a as well. That
way, when I compile the PERUSE-MPICH application with the -mpe=mpilog
switch, the function in the MPE library is invoked first, the
occurrence is logged, and then the actual PERUSE function defined in
libmpich.a is invoked.
Have I got that right?
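
To make the intended call sequence concrete, here is a minimal
single-file sketch of a logging wrapper that records the event and then
forwards to the real routine. All names (demo_trace_event,
pdemo_trace_event, the log array) are hypothetical and are not PERUSE or
MPE APIs. In a real build the wrapper and the real routine would sit in
liblmpe.a and libmpich.a respectively, and the real routine would have
to be reachable under a different name than the wrapper (as PMPI_Init is
for MPI_Init); two strong definitions of the same name produce the
"multiple definition" link error quoted further down.

```c
#include <stddef.h>

#define DEMO_LOG_CAPACITY 16   /* hypothetical fixed-size event log */

static int    demo_log[DEMO_LOG_CAPACITY];  /* event ids seen by the wrapper */
static size_t demo_log_len = 0;             /* number of entries recorded */
static int    pdemo_calls  = 0;             /* calls that reached the real routine */

/* Stand-in for the real tracing routine that lives in the main library. */
int pdemo_trace_event(int event_id)
{
    (void)event_id;          /* the real routine would act on the event */
    pdemo_calls++;
    return 0;
}

/* Stand-in for the wrapper in the logging library: record the event,
 * then forward to the real routine under its shifted name. */
int demo_trace_event(int event_id)
{
    if (demo_log_len < DEMO_LOG_CAPACITY)
        demo_log[demo_log_len++] = event_id;
    return pdemo_trace_event(event_id);
}
```

Calling demo_trace_event(7) appends 7 to the log and then runs
pdemo_trace_event(7), which is the "log first, then call the real
function" order described above.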


Thanks,
Krishna Chaitanya


On 3/25/08, Anthony Chan <chan at mcs.anl.gov> wrote:
>
>
> On Mon, 24 Mar 2008, Krishna Chaitanya wrote:
>
> > That answers a lot of my questions.
> > Sticking to MPI_Init for this discussion, the function is defined in two
> > places. So, there will be a reference to MPI_Init in lmpich and also a
> > reference in lmpe. But how is it that when an MPI application is being
> > compiled, mpicc doesn't complain that there are two references to
> > MPI_Init?
> > What exactly has been done to solve this?
>
> log_mpi_core.c uses the PMPI interface defined in MPI standard.  The 2
> MPI_Init definitions are defined in 2 different libraries as you have
> found.
>
> > I am actually facing this problem with the PERUSE function that I have
> > defined in:
> > 1 > /src/peruse/peruse.c             and
> > 2 > src/mpe2/src/wrappers/src/log_mpi_core.c.
> >
> >    The first one goes into lmpich and the second one is a part of lmpe.
> > However, when I compile my MPI program, I get the following message :
> > /home/kc/mpich-install//lib/libmpich.a(peruse.o): In function
> > `PERUSE_TRACE_COMM_EVENT':
> > /home/kc/mpich-src/src/peruse/peruse.c:324: multiple definition of
> > `PERUSE_TRACE_COMM_EVENT'
> > /home/kc/mpich-install//lib/liblmpe.a(log_mpi_core.o):/home/kc/mpich-src/src/mpe2/src/wrappers/src/log_mpi_core.c:6830:
> > first defined here
> > collect2: ld returned 1 exit status
>
> Why did you define PERUSE_TRACE_COMM_EVENT twice, once in libmpich.a
> and once in liblmpe.a? Are you trying to use MPE to profile PERUSE or to
> use PERUSE to profile MPE?
>
> A.Chan
>
> >
> >
> > Thanks for your time,
> > Krishna Chaitanya K
> >
> >
> > On Mon, Mar 24, 2008 at 10:22 AM, Anthony Chan <chan at mcs.anl.gov> wrote:
> >
> >>
> >>
> >> On Mon, 24 Mar 2008, Krishna Chaitanya wrote:
> >>
> >>>   Sorry for re-posting.
> >>>> I took a look at the documentation at src/util/multichannel/mpi.c
> >>>           Guess this is only for windows.
> >>>           It would be of great help if someone could point me to the
> >>> function that takes care of mapping MPI_Init to its wrapper, defined in
> >>> src/mpe2/src/wrappers/src/log_mpi_core.c, when the library is compiled
> >>> with the --enable-mpe switch, instead of the function defined in
> >>> src/mpi/init/init.c
> >>
> >> The function in src/mpi/init/init.c is the one used when linking with
> >> -lmpich, i.e. plain mpicc. The functions in log_mpi_core.c are used when
> >> linking with "mpicc -mpe=mpilog".  Running "mpicc .... -show" will show
> >> you which libraries are linked and in what order.
> >>
> >> A.Chan
> >>
> >>>
> >>> Krishna Chaitanya K
> >>>
> >>> On Sun, Mar 23, 2008 at 2:29 PM, Krishna Chaitanya <kris.c1986 at gmail.com>
> >>> wrote:
> >>>
> >>>>> See section "CUSTOMIZING LOGFILES" in mpich2-xxx/src/mpe2/README.
> >>>> Correct me if I am wrong :
> >>>> Since I am dealing with PERUSE events, whenever such an event occurs, a
> >>>> PERUSE function, defined in <mpich-dir>/src/peruse/peruse.c, is invoked
> >>>> by the MPI library. I am trying to get this event displayed in the
> >>>> jumpshot output. For this to be done, I need to define a wrapper
> >>>> function which gets invoked when a PERUSE event occurs, to log the event
> >>>> and then to call the actual peruse function, which is similar to the way
> >>>> the wrapper function at log_mpi_core.c is called when MPI_Init is called.
> >>>>
> >>>> Could you please clarify the dynamic mapping?
> >>>> I took a look at the documentation at src/util/multichannel/mpi.c. I
> >>>> think I understood what is going on in LoadFunctions() and the way the
> >>>> function pointers are assigned addresses depending on the dll that is
> >>>> being used.
> >>>>
> >>>> Krishna Chaitanya K
> >>>>
> >>>>
> >>>> On Sun, Mar 23, 2008 at 12:59 PM, Anthony Chan <chan at mcs.anl.gov> wrote:
> >>>>
> >>>>>
> >>>>> See section "CUSTOMIZING LOGFILES" in mpich2-xxx/src/mpe2/README.
> >>>>> You don't need to modify MPE libraries.
> >>>>>
> >>>>> A.Chan
> >>>>>
> >>>>> On Sun, 23 Mar 2008, Krishna Chaitanya wrote:
> >>>>>
> >>>>>> I have modified the mpe library to log the events that I am
> >>>>>> interested in monitoring. But I am a bit hazy about how a function
> >>>>>> like MPI_Init is actually linked to the MPI_Init routine in the file
> >>>>>> log_mpi_core.c when we compile the MPI application with the
> >>>>>> -mpe=mpilog switch. Could someone point me to the routine that takes
> >>>>>> care of such a mapping?
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Krishna Chaitanya K
> >>>>>>
> >>>>>> On Sat, Mar 22, 2008 at 3:01 AM, Krishna Chaitanya <kris.c1986 at gmail.com>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> Thanks a lot. I installed the latest jdk version and I am now able to
> >>>>>>> look at the jumpshot output.
> >>>>>>>
> >>>>>>> Krishna Chaitanya K
> >>>>>>>
> >>>>>>>
> >>>>>>> On Sat, Mar 22, 2008 at 1:45 AM, Anthony Chan <chan at mcs.anl.gov> wrote:
> >>>>>>>
> >>>>>>>>
> >>>>>>>> The error that you showed earlier does not suggest the problem is with
> >>>>>>>> running jumpshot on your machine with limited memory.  If your clog2
> >>>>>>>> file isn't too bad, send it to me.
> >>>>>>>>
> >>>>>>>> On Fri, 21 Mar 2008, Krishna Chaitanya wrote:
> >>>>>>>>
> >>>>>>>>> I resolved that issue.
> >>>>>>>>> My computer (Intel Centrino, 32-bit, 256 MB RAM; dated, I agree)
> >>>>>>>>> hangs each time I launch jumpshot with the slog file. Since this is
> >>>>>>>>> an independent project, I am constrained when it comes to the
> >>>>>>>>> availability of machines. Would you recommend that I give it a try
> >>>>>>>>> on a 64-bit AMD machine with 512 MB RAM? (I will have to start from
> >>>>>>>>> installing Linux on this machine. Is it worth the effort?) If it
> >>>>>>>>> requires a higher configuration, would you please suggest a lighter
> >>>>>>>>> graphical tool that I can use to present the occurrence of events
> >>>>>>>>> and the corresponding times?
> >>>>>>>>>
> >>>>>>>>> Thanks,
> >>>>>>>>> Krishna Chaitanya K
> >>>>>>>>>
> >>>>>>>>> On Fri, Mar 21, 2008 at 8:23 PM, Anthony Chan <chan at mcs.anl.gov> wrote:
> >>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On Fri, 21 Mar 2008, Krishna Chaitanya wrote:
> >>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> The file block pointer to the Tree Directory is NOT initialized!,
> >>>>>>>>>>> can't read it.
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> That means the slog2 file isn't generated completely.  Something went
> >>>>>>>>>> wrong in the conversion process (assuming your clog2 file is complete).
> >>>>>>>>>> If your MPI program doesn't finish MPI_Finalize normally, your clog2
> >>>>>>>>>> file will be incomplete.
> >>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>         Is there any environment variable that needs to be
> >>>>>>>>>>> initialised?
> >>>>>>>>>>
> >>>>>>>>>> Nothing needs to be initialized by hand.
> >>>>>>>>>>
> >>>>>>>>>> A.Chan
> >>>>>>>>>>>
> >>>>>>>>>>> Thanks,
> >>>>>>>>>>> Krishna Chaitanya K
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> On Thu, Mar 20, 2008 at 4:56 PM, Dave Goodell <goodell at mcs.anl.gov>
> >>>>>>>>>>> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> It's pretty hard to debug this issue via email.  However, you could
> >>>>>>>>>>>> try running valgrind on your modified MPICH2 to see if any obvious
> >>>>>>>>>>>> bugs pop out.  When you do, make sure that you configure with
> >>>>>>>>>>>> "--enable-g=dbg,meminit" in order to avoid spurious warnings and to
> >>>>>>>>>>>> be able to see stack traces.
> >>>>>>>>>>>>
> >>>>>>>>>>>> -Dave
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Mar 19, 2008, at 1:05 PM, Krishna Chaitanya wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> The problem seems to be with the communicator in MPI_Bcast()
> >>>>>>>>>>>>> (/src/mpi/coll/bcast.c).
> >>>>>>>>>>>>> The comm_ptr is initialized to NULL and, after a call to
> >>>>>>>>>>>>> MPID_Comm_get_ptr( comm, comm_ptr );, the comm_ptr points to the
> >>>>>>>>>>>>> communicator object which was created through MPI_Init().
> >>>>>>>>>>>>> However, MPID_Comm_valid_ptr( comm_ptr, mpi_errno ) returns with a
> >>>>>>>>>>>>> value other than MPI_SUCCESS.
> >>>>>>>>>>>>> During some traces, it used to crash at this point itself. On some
> >>>>>>>>>>>>> other traces, it used to go into the progress engine as I described
> >>>>>>>>>>>>> in my previous mails.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> What could be the reason? Hope someone chips in. I haven't been able
> >>>>>>>>>>>>> to figure this out for some time now.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Krishna Chaitanya K
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Wed, Mar 19, 2008 at 8:44 AM, Krishna Chaitanya
> >>>>>>>>>>>>> <kris.c1986 at gmail.com> wrote:
> >>>>>>>>>>>>> This might help :
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> In the MPID_Comm structure, I have included the following line for
> >>>>>>>>>>>>> the peruse place-holder :
> >>>>>>>>>>>>>  struct mpich_peruse_handle_t** c_peruse_handles;
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> And in the function MPID_Init_thread(), I have the line
> >>>>>>>>>>>>>  MPIR_Process.comm_world->c_peruse_handles = NULL;
> >>>>>>>>>>>>>  when the rest of the members of the comm_world structure are being
> >>>>>>>>>>>>> populated.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>> Krishna Chaitanya K
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Wed, Mar 19, 2008 at 8:19 AM, Krishna Chaitanya
> >>>>>>>>>>>>> <kris.c1986 at gmail.com> wrote:
> >>>>>>>>>>>>> Thanks for the help. I am facing a weird problem right now. To
> >>>>>>>>>>>>> incorporate the PERUSE component, I have modified the communicator
> >>>>>>>>>>>>> data structure to include the PERUSE handles. The program executes
> >>>>>>>>>>>>> as expected when compiled without the "-mpe=mpilog" flag. When I
> >>>>>>>>>>>>> compile it with the mpe component, the program gives this output :
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Fatal error in MPI_Bcast: Invalid communicator, error stack:
> >>>>>>>>>>>>> MPI_Bcast(784): MPI_Bcast(buf=0x9260f98, count=1, MPI_INT, root=0,
> >>>>>>>>>>>>> MPI_COMM_WORLD) failed
> >>>>>>>>>>>>> MPI_Bcast(717): Invalid communicator
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On tracing further, I understood this :
> >>>>>>>>>>>>> MPI_Init () (  log_mpi_core.c )
> >>>>>>>>>>>>>  -- >  PMPI_Init ( the communicator object is created here )
> >>>>>>>>>>>>>  -- >  MPE_Init_log ()
> >>>>>>>>>>>>>         -- > CLOG_Local_init()
> >>>>>>>>>>>>>               -- > CLOG_Buffer_init4write ()
> >>>>>>>>>>>>>                     -- > CLOG_Preamble_env_init()
> >>>>>>>>>>>>>                           -- >   MPI_Bcast ()  (bcast.c)
> >>>>>>>>>>>>>                                   -- > MPIR_Bcast ()
> >>>>>>>>>>>>>                                          -- >  MPIC_Recv ()  / MPIC_Send()
> >>>>>>>>>>>>>                                          -- >  MPIC_Wait()
> >>>>>>>>>>>>>                                       < Program crashes >
> >>>>>>>>>>>>>      The MPIC_Wait function invokes the progress engine, which
> >>>>>>>>>>>>> works properly without the mpe component.
> >>>>>>>>>>>>>       Even within the progress engine, MPIDU_Sock_wait() and
> >>>>>>>>>>>>> MPIDI_CH3I_Progress_handle_sock_event() are executed a couple of
> >>>>>>>>>>>>> times before the program crashes in the MPIDU_Socki_handle_read()
> >>>>>>>>>>>>> or the MPIDU_Socki_handle_write() functions. (The read() and
> >>>>>>>>>>>>> write() functions work two times, I think.)
> >>>>>>>>>>>>>      I am finding it very hard to work out why the program crashes
> >>>>>>>>>>>>> with mpe. Could you please suggest where I need to look to sort
> >>>>>>>>>>>>> this issue out?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>> Krishna Chaitanya K
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Wed, Mar 19, 2008 at 2:20 AM, Anthony Chan <chan at mcs.anl.gov>
> >>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Wed, 19 Mar 2008, Krishna Chaitanya wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> Hi,
> >>>>>>>>>>>>>>          I tried configuring MPICH2 by doing :
> >>>>>>>>>>>>>> ./configure --prefix=/home/kc/mpich-install/ --enable-mpe
> >>>>>>>>>>>>>> --with-logging=SLOG  CC=gcc CFLAGS=-g   && make && make install
> >>>>>>>>>>>>>>          It flashed an error message saying :
> >>>>>>>>>>>>>> configure: error: ./src/util/logging/SLOG does not exist.
> >>>>>>>>>>>>>> Configure aborted
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> The --with-logging option is for MPICH2's internal logging, not MPE's
> >>>>>>>>>>>>> logging. What you did below is fine.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>          After that, I tried :
> >>>>>>>>>>>>>> ./configure --prefix=/home/kc/mpich-install/ --enable-mpe CC=gcc
> >>>>>>>>>>>>>> CFLAGS=-g && make && make install
> >>>>>>>>>>>>>>         The installation was normal, but when I tried compiling an
> >>>>>>>>>>>>>> example program by doing :
> >>>>>>>>>>>>>> mpicc -mpilog -o sample  sample.c
> >>>>>>>>>>>>>> cc1: error: unrecognized command line option "-mpilog"
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Do "mpicc -mpe=mpilog -o sample sample.c" instead.  For more details,
> >>>>>>>>>>>>> see "mpicc -mpe=help" and mpich2/src/mpe2/README.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> A.Chan
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>          Can anyone please tell me what needs to be done to use
> >>>>>>>>>>>>>> the SLOG logging format?
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>>> Krishna Chaitanya K
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> --
> >>>>>>>>>>>>>> In the middle of difficulty, lies opportunity
> >>>>>>>>>>>>>>


-- 
In the middle of difficulty, lies opportunity



