[mpich-discuss] Trouble in getting the logging interface to work

Anthony Chan chan at mcs.anl.gov
Mon Mar 24 09:22:01 CDT 2008



On Mon, 24 Mar 2008, Krishna Chaitanya wrote:

> Sorry for re-posting.
>> I took a look at the documentation at src/util/multichannel/mpi.c
> I guess this is only for Windows.
> It would be of great help if someone could point me to the function that
> maps MPI_Init to its wrapper, defined in
> src/mpe2/src/wrappers/src/log_mpi_core.c, instead of the function defined
> in src/mpi/init/init.c, when the library is compiled with the --enable-mpe
> switch.

The function in src/mpi/init/init.c is the one used when you link with
-lmpich, i.e. with plain mpicc. The functions in log_mpi_core.c are used
when you link with "mpicc -mpe=mpilog".  Running "mpicc .... -show" will
show you which libraries are linked and in what order.
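
There is no routine that does the mapping at run time. The MPI profiling
interface makes every MPI_Xxx routine also available as PMPI_Xxx, so a
logging library can define its own MPI_Init and call PMPI_Init inside it.
The linker then resolves MPI_Init to whichever definition the link order
selects. A minimal sketch of that shape, written so that it compiles without
MPI: the sim_* names and the trace buffer are stand-ins made up for this
example and are not MPICH or MPE symbols.

```c
#include <assert.h>

/* Sketch only: sim_* names are stand-ins, not MPICH or MPE symbols. */

char trace[8];       /* records the order of calls: 'S', 'R', 'E' */
int  trace_len = 0;

static void trace_add(char c)
{
    assert(trace_len < (int)sizeof trace);
    trace[trace_len++] = c;
}

/* Plays the role of PMPI_Init: the real initialization in the MPI library. */
int sim_PMPI_Init(int *argc, char ***argv)
{
    (void)argc;
    (void)argv;
    trace_add('R');          /* "real" work happens here */
    return 0;                /* stands in for MPI_SUCCESS */
}

/* Plays the role of the MPI_Init that a logging library would define. */
int sim_MPI_Init(int *argc, char ***argv)
{
    int rc;

    trace_add('S');          /* log the start of the call */
    rc = sim_PMPI_Init(argc, argv);
    trace_add('E');          /* log the end of the call */
    return rc;
}
```

In a real build the wrapper would be named MPI_Init and would call
PMPI_Init, and the application source would not change at all; only the
libraries on the link line differ.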

A.Chan

>
> Krishna Chaitanya K
>
> On Sun, Mar 23, 2008 at 2:29 PM, Krishna Chaitanya <kris.c1986 at gmail.com>
> wrote:
>
>>> See section "CUSTOMIZING LOGFILES" in mpich2-xxx/src/mpe2/README.
>> Correct me if I am wrong :
>> Since I am dealing with PERUSE events, whenever such an event occurs, a
>> PERUSE function, defined in <mpich-dir>/src/peruse/peruse.c, is invoked by
>> the MPI library. I am trying to get this event displayed in the jumpshot
>> output. For this to be done, I need to define a wrapper function which gets
>> invoked when a PERUSE event occurs, to log the event and then to call the
>> actual peruse function, which is similar to the way the wrapper function in
>> log_mpi_core.c is called when MPI_Init is called.
>>
>> Could you please clarify the dynamic mapping?
>> I took a look at the documentation at src/util/multichannel/mpi.c. I
>> think I understood what is going on in LoadFunctions() and the way the
>> function pointers are assigned addresses depending on the DLL that is
>> being used.
>>
>> Krishna Chaitanya K
>>
>>
>> On Sun, Mar 23, 2008 at 12:59 PM, Anthony Chan <chan at mcs.anl.gov> wrote:
>>
>>>
>>> See section "CUSTOMIZING LOGFILES" in mpich2-xxx/src/mpe2/README.
>>> You don't need to modify MPE libraries.
>>>
>>> A.Chan
>>>
>>> On Sun, 23 Mar 2008, Krishna Chaitanya wrote:
>>>
>>>> I have modified the mpe library to log the events that I am interested in
>>>> monitoring. But I am a bit hazy about how a function like MPI_Init is
>>>> actually linked to the MPI_Init routine in the file log_mpi_core.c when we
>>>> compile the MPI application with the -mpe=mpilog switch. Could someone point
>>>> me to the routine that takes care of such a mapping?
>>>>
>>>> Thanks,
>>>> Krishna Chaitanya K
>>>>
>>>> On Sat, Mar 22, 2008 at 3:01 AM, Krishna Chaitanya <kris.c1986 at gmail.com>
>>>> wrote:
>>>>
>>>>> Thanks a lot. I installed the latest jdk version and I am now able to look
>>>>> at the jumpshot output.
>>>>>
>>>>> Krishna Chaitanya K
>>>>>
>>>>>
>>>>> On Sat, Mar 22, 2008 at 1:45 AM, Anthony Chan <chan at mcs.anl.gov> wrote:
>>>>>
>>>>>>
>>>>>> The error that you showed earlier does not suggest that the problem is with
>>>>>> running jumpshot on a machine with limited memory.  If your clog2 file
>>>>>> isn't too big, send it to me.
>>>>>>
>>>>>> On Fri, 21 Mar 2008, Krishna Chaitanya wrote:
>>>>>>
>>>>>>> I resolved that issue.
>>>>>>> My computer (Intel Centrino, 32-bit, 256 MB RAM; dated, I agree) hangs each
>>>>>>> time I launch jumpshot with the slogfile. Since this is an independent
>>>>>>> project, I am constrained when it comes to the availability of machines.
>>>>>>> Would you recommend that I give it a try on a 64-bit AMD with 512 MB RAM?
>>>>>>> (I will have to start by installing Linux on this machine. Is it worth the
>>>>>>> effort?) If it requires a higher configuration, would you please suggest a
>>>>>>> lighter graphical tool that I can use to present the occurrence of events
>>>>>>> and the corresponding times?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Krishna Chaitanya K
>>>>>>>
>>>>>>> On Fri, Mar 21, 2008 at 8:23 PM, Anthony Chan <chan at mcs.anl.gov> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Fri, 21 Mar 2008, Krishna Chaitanya wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>> The file block pointer to the Tree Directory is NOT initialized!, can't read it.
>>>>>>>>>
>>>>>>>>
>>>>>>>> That means the slog2 file isn't generated completely.  Something went
>>>>>>>> wrong in the conversion process (assuming your clog2 file is complete).
>>>>>>>> If your MPI program doesn't finish MPI_Finalize normally, your clog2
>>>>>>>> file will be incomplete.
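
A toy model of why an abnormal exit leaves an unusable log: the writer below
buffers records in memory and only writes them, plus a trailer, in its
finalize step, and the reader treats a file without the trailer as
incomplete. This is not the clog2 layout; the sketch_* names and the "END"
trailer are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_END_MARK "END\n"   /* hypothetical trailer, not a clog2 record */

typedef struct {
    char   buf[256];   /* in-memory record buffer */
    size_t used;       /* bytes currently buffered */
    FILE  *out;        /* destination file */
} sketch_log;

void sketch_log_open(sketch_log *lg, FILE *out)
{
    assert(lg != NULL && out != NULL);
    lg->used = 0;
    lg->out = out;
}

/* Buffer one record; nothing reaches the file yet. Returns 0 on success. */
int sketch_log_event(sketch_log *lg, const char *name)
{
    size_t n = strlen(name);

    if (lg->used + n + 1 > sizeof lg->buf)
        return -1;                       /* buffer full */
    memcpy(lg->buf + lg->used, name, n);
    lg->buf[lg->used + n] = '\n';
    lg->used += n + 1;
    return 0;
}

/* Flush the buffered records and write the trailer. Returns 0 on success. */
int sketch_log_finalize(sketch_log *lg)
{
    if (fwrite(lg->buf, 1, lg->used, lg->out) != lg->used)
        return -1;
    if (fputs(SKETCH_END_MARK, lg->out) == EOF)
        return -1;
    lg->used = 0;
    return fflush(lg->out) == 0 ? 0 : -1;
}

/* Reader-side check: 1 if the file ends with the trailer, 0 otherwise. */
int sketch_log_is_complete(FILE *in)
{
    char tail[sizeof SKETCH_END_MARK];
    long len = (long)(sizeof SKETCH_END_MARK - 1);

    if (fseek(in, 0, SEEK_END) != 0 || ftell(in) < len)
        return 0;
    if (fseek(in, -len, SEEK_END) != 0)
        return 0;
    if (fread(tail, 1, (size_t)len, in) != (size_t)len)
        return 0;
    tail[len] = '\0';
    return strcmp(tail, SKETCH_END_MARK) == 0;
}
```

A program that crashes between sketch_log_event() and sketch_log_finalize()
leaves a file that fails the completeness check, which is the situation a
converter cannot recover from.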
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Is there any environment variable that needs to be initialised?
>>>>>>>>
>>>>>>>> Nothing needs to be initialized by hand.
>>>>>>>>
>>>>>>>> A.Chan
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Krishna Chaitanya K
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Thu, Mar 20, 2008 at 4:56 PM, Dave Goodell <goodell at mcs.anl.gov> wrote:
>>>>>>>>>
>>>>>>>>>> It's pretty hard to debug this issue via email.  However, you could
>>>>>>>>>> try running valgrind on your modified MPICH2 to see if any obvious
>>>>>>>>>> bugs pop out.  When you do, make sure that you configure with
>>>>>>>>>> "--enable-g=dbg,meminit" in order to avoid spurious warnings and to be
>>>>>>>>>> able to see stack traces.
>>>>>>>>>>
>>>>>>>>>> -Dave
>>>>>>>>>>
>>>>>>>>>> On Mar 19, 2008, at 1:05 PM, Krishna Chaitanya wrote:
>>>>>>>>>>
>>>>>>>>>>> The problem seems to be with the communicator in MPI_Bcast()
>>>>>>>>>>> (/src/mpi/coll/bcast.c).
>>>>>>>>>>> The comm_ptr is initialized to NULL and, after a call to
>>>>>>>>>>> MPID_Comm_get_ptr( comm, comm_ptr );, the comm_ptr points to the
>>>>>>>>>>> communicator object which was created through MPI_Init().
>>>>>>>>>>> However, MPID_Comm_valid_ptr( comm_ptr, mpi_errno ) returns with a
>>>>>>>>>>> value other than MPI_SUCCESS.
>>>>>>>>>>> During some traces, it crashed at this point itself. On other traces,
>>>>>>>>>>> it went into the progress engine as I described in my previous mails.
>>>>>>>>>>>
>>>>>>>>>>> What could be the reason? I hope someone chips in. I haven't been able
>>>>>>>>>>> to figure this out for some time now.
>>>>>>>>>>>
>>>>>>>>>>> Krishna Chaitanya K
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Mar 19, 2008 at 8:44 AM, Krishna Chaitanya
>>>>>>>>>>> <kris.c1986 at gmail.com> wrote:
>>>>>>>>>>> This might help:
>>>>>>>>>>>
>>>>>>>>>>> In the MPID_Comm structure, I have included the following line for
>>>>>>>>>>> the peruse place-holder:
>>>>>>>>>>>  struct mpich_peruse_handle_t** c_peruse_handles;
>>>>>>>>>>>
>>>>>>>>>>> And in the function MPID_Init_thread(), I have the line
>>>>>>>>>>>  MPIR_Process.comm_world->c_peruse_handles = NULL;
>>>>>>>>>>> when the rest of the members of the comm_world structure are being
>>>>>>>>>>> populated.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Krishna Chaitanya K
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Mar 19, 2008 at 8:19 AM, Krishna Chaitanya
>>>>>>>>>>> <kris.c1986 at gmail.com> wrote:
>>>>>>>>>>> Thanks for the help. I am facing a weird problem right now. To
>>>>>>>>>>> incorporate the PERUSE component, I have modified the communicator
>>>>>>>>>>> data structure to include the PERUSE handles. The program executes
>>>>>>>>>>> as expected when compiled without the "mpe=mpilog" flag. When I
>>>>>>>>>>> compile it with the mpe component, the program gives this output:
>>>>>>>>>>>
>>>>>>>>>>> Fatal error in MPI_Bcast: Invalid communicator, error stack:
>>>>>>>>>>> MPI_Bcast(784): MPI_Bcast(buf=0x9260f98, count=1, MPI_INT, root=0,
>>>>>>>>>>> MPI_COMM_WORLD) failed
>>>>>>>>>>> MPI_Bcast(717): Invalid communicator
>>>>>>>>>>>
>>>>>>>>>>> On tracing further, I understood this :
>>>>>>>>>>> MPI_Init () (  log_mpi_core.c )
>>>>>>>>>>>  -- >  PMPI_Init ( the communicator object is created here )
>>>>>>>>>>>  -- >  MPE_Init_log ()
>>>>>>>>>>>         -- > CLOG_Local_init()
>>>>>>>>>>>               -- > CLOG_Buffer_init4write ()
>>>>>>>>>>>                     -- > CLOG_Preamble_env_init()
>>>>>>>>>>>                           -- >   MPI_Bcast ()  (bcast.c)
>>>>>>>>>>>                                   -- > MPIR_Bcast ()
>>>>>>>>>>>                                          -- >  MPIC_Recv () / MPIC_Send()
>>>>>>>>>>>                                          -- >  MPIC_Wait()
>>>>>>>>>>>                                       < Program crashes >
>>>>>>>>>>> The MPIC_Wait function invokes the progress engine, which works
>>>>>>>>>>> properly without the mpe component.
>>>>>>>>>>> Even within the progress engine, MPIDU_Sock_wait() and
>>>>>>>>>>> MPIDI_CH3I_Progress_handle_sock_event() are executed a couple of
>>>>>>>>>>> times before the program crashes in the MPIDU_Socki_handle_read()
>>>>>>>>>>> or the MPIDU_Socki_handle_write() functions. (The read() and the
>>>>>>>>>>> write() functions work two times, I think.)
>>>>>>>>>>> I am finding it very hard to work out why the program crashes
>>>>>>>>>>> with mpe. Could you please suggest where I need to look to sort
>>>>>>>>>>> this issue out?
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Krishna Chaitanya K
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Mar 19, 2008 at 2:20 AM, Anthony Chan <chan at mcs.anl.gov> wrote:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Wed, 19 Mar 2008, Krishna Chaitanya wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>          I tried configuring MPICH2 by doing :
>>>>>>>>>>>> ./configure --prefix=/home/kc/mpich-install/ --enable-mpe
>>>>>>>>>>>> --with-logging=SLOG  CC=gcc CFLAGS=-g   && make && make
>>> install
>>>>>>>>>>>>          It flashed an error message saying :
>>>>>>>>>>>> configure: error: ./src/util/logging/SLOG does not exist.
>>>>>>>>>>>> Configure aborted
>>>>>>>>>>>
>>>>>>>>>>> The --with-logging option is for MPICH2's internal logging, not MPE's
>>>>>>>>>>> logging.  What you did below is fine.
>>>>>>>>>>>>
>>>>>>>>>>>>          After that, I tried :
>>>>>>>>>>>> ./configure --prefix=/home/kc/mpich-install/ --enable-mpe CC=gcc
>>>>>>>>>>>> CFLAGS=-g && make && make install
>>>>>>>>>>>>         The installation was normal. I then tried compiling an example
>>>>>>>>>>>> program by doing :
>>>>>>>>>>>> mpicc -mpilog -o sample  sample.c
>>>>>>>>>>>> cc1: error: unrecognized command line option "-mpilog"
>>>>>>>>>>>
>>>>>>>>>>> Do "mpicc -mpe=mpilog -o sample sample.c" instead.  For more details,
>>>>>>>>>>> see "mpicc -mpe=help" and see mpich2/src/mpe2/README.
>>>>>>>>>>>
>>>>>>>>>>> A.Chan
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>          Can anyone please tell me what needs to be done to use
>>>>>>>>>>>> the SLOG logging format?
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Krishna Chaitanya K
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> In the middle of difficulty, lies opportunity



