[mpich-discuss] Trouble in getting the logging interface to work
Anthony Chan
chan at mcs.anl.gov
Tue Mar 25 12:59:55 CDT 2008
On Mon, 24 Mar 2008, Krishna Chaitanya wrote:
> That answers a lot of my questions.
> Sticking to MPI_Init for this discussion, the function is defined at two
> places. So, there will be a reference to MPI_Init in lmpich and also a
> reference in lmpe. But, how is it that when an MPI application is being
> compiled, mpicc doesn't complain that there were two references to MPI_Init?
> What exactly has been done to solve this?
log_mpi_core.c uses the PMPI profiling interface defined in the MPI
standard: its MPI_Init wrapper does the logging and calls PMPI_Init.
The two MPI_Init definitions live in two different libraries, as you
have found.
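
The wrapper pattern described here can be sketched in a few lines of C.
This is only an illustration under stated assumptions, not the code in
log_mpi_core.c: demo_PMPI_Init and demo_MPI_Init are hypothetical
stand-ins for PMPI_Init and the MPI_Init wrapper, so the example
compiles without an MPI installation. It shows the profiling library
owning the public name, doing its logging, and forwarding to the
name-shifted entry point.

```c
#include <stdio.h>

/* Counters used only to make the control flow observable in this sketch. */
int demo_init_calls = 0;
int demo_log_records = 0;

/* Hypothetical stand-in for the library's name-shifted entry point
 * (PMPI_Init in a real MPI library). It performs no real initialization. */
int demo_PMPI_Init(int *argc, char ***argv)
{
    (void)argc;
    (void)argv;
    demo_init_calls++;
    return 0; /* plays the role of MPI_SUCCESS */
}

/* Hypothetical stand-in for the wrapper that a profiling library would
 * export under the public name (MPI_Init). */
int demo_MPI_Init(int *argc, char ***argv)
{
    /* Forward to the real implementation first. */
    int rc = demo_PMPI_Init(argc, argv);

    /* Record the event only if initialization succeeded. */
    if (rc == 0) {
        demo_log_records++;
        printf("logged: init\n");
    }
    return rc;
}
```

An application that calls demo_MPI_Init(&argc, &argv) goes through the
logging step and then reaches the implementation exactly once, which is
the same shape as the MPI_Init -> PMPI_Init chain in the trace quoted
further down this thread.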
> I am actually facing this problem with the PERUSE function that i have
> defined at :
> 1 > /src/peruse/peruse.c and
> 2 > src/mpe2/src/wrappers/src/log_mpi_core.c.
>
> The first one goes into lmpich and the second one is a part of lmpe.
> However, when I compile my MPI program, I get the following message :
> /home/kc/mpich-install//lib/libmpich.a(peruse.o): In function
> `PERUSE_TRACE_COMM_EVENT':
> /home/kc/mpich-src/src/peruse/peruse.c:324: multiple definition of
> `PERUSE_TRACE_COMM_EVENT'
> /home/kc/mpich-install//lib/liblmpe.a(log_mpi_core.o):/home/kc/mpich-src/src/mpe2/src/wrappers/src/log_mpi_core.c:6830:
> first defined here
> collect2: ld returned 1 exit status
Why did you define PERUSE_TRACE_COMM_EVENT twice, once in libmpich.a
and once in liblmpe.a? Are you trying to use MPE to profile PERUSE, or
PERUSE to profile MPE?
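
On the multiple-definition error quoted above: the linker reports it
because both archives supply an ordinary (strong) definition of the
same symbol. One common way to let a wrapper library take over a public
name is to have the base library export that name only as a weak alias
of a name-shifted implementation, so that a strong definition elsewhere
wins without a clash. A minimal sketch, assuming GCC or Clang on an ELF
platform such as Linux; demo_event and demo_event_impl are hypothetical
names, not PERUSE or MPICH symbols.

```c
/* Name-shifted implementation: always available under this name.
 * Hypothetical function, used only for illustration. */
int demo_event_impl(int event)
{
    return event + 1;
}

/* Public name, exported as a weak alias of the implementation above.
 * Assumption: GCC/Clang attribute syntax on an ELF target. If another
 * object file supplies a strong definition of demo_event, the linker
 * uses that one instead of reporting a duplicate symbol. */
int demo_event(int event) __attribute__((weak, alias("demo_event_impl")));
```

With no overriding definition linked in, a call to demo_event simply
reaches demo_event_impl. A wrapper library would define its own
demo_event and call demo_event_impl from inside it.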
A.Chan
>
>
> Thanks for your time,
> Krishna Chaitanya K
>
>
> On Mon, Mar 24, 2008 at 10:22 AM, Anthony Chan <chan at mcs.anl.gov> wrote:
>
>>
>>
>> On Mon, 24 Mar 2008, Krishna Chaitanya wrote:
>>
>>> Sorry for re-posting.
>>>> I took a look at the documentation at src/util/multichannel/mpi.c
>>> Guess this is only for windows.
>>> It would be of great help if someone could point me to the
>>> function that takes care of mapping MPI_Init to its wrapper, defined in
>>> src/mpe2/src/wrappers/src/log_mpi_core.c, when the library is compiled
>> with
>>> the --enable-mpe switch., instead of the function defined in
>>> src/mpi/init/init.c
>>
>> The function in src/mpi/init/init.c is the one you get when linking with
>> -lmpich, i.e. with plain mpicc. The functions in log_mpi_core.c are the
>> ones you get when linking with "mpicc -mpe=mpilog". Running
>> "mpicc .... -show" will show you which libraries are linked and in what
>> order.
>>
>> A.Chan
>>
>>>
>>> Krishna Chaitanya K
>>>
>>> On Sun, Mar 23, 2008 at 2:29 PM, Krishna Chaitanya <kris.c1986 at gmail.com
>>>
>>> wrote:
>>>
>>>>> See section "CUSTOMIZING LOGFILES" in mpich2-xxx/src/mpe2/README.
>>>> Correct me if I am wrong :
>>>> Since I am dealing with PERUSE events, whenever such an event occurs, a
>>>> PERUSE function, defined in <mpich-dir>/src/peruse/peruse.c, is invoked
>> by
>>>> the MPI library. I am trying to get this event displayed in the
>> jumpshot
>>>> output. For this to be done, I need to define a wrapper function which
>> gets
>>>> invoked when a PERUSE event occurs, to log the event and then to call
>> the
>>>> actual peruse function, which is similar to the way the wrapper
>> function at
>>>> log_mpi_core.c is called, when MPI_Init is called.
>>>>
>>>> Could you please clarify on the dynamic mapping?
>>>> I took a look at the documentation at src/util/multichannel/mpi.c. I
>>>> think, I understood what is going on in LoadFunctions() and the way the
>>>> function pointers are assigned addresses depending the dll that is
>> being
>>>> used.
>>>>
>>>> Krishna Chaitanya K
>>>>
>>>>
>>>> On Sun, Mar 23, 2008 at 12:59 PM, Anthony Chan <chan at mcs.anl.gov>
>> wrote:
>>>>
>>>>>
>>>>> See section "CUSTOMIZING LOGFILES" in mpich2-xxx/src/mpe2/README.
>>>>> You don't need to modify MPE libraries.
>>>>>
>>>>> A.Chan
>>>>>
>>>>> On Sun, 23 Mar 2008, Krishna Chaitanya wrote:
>>>>>
>>>>>> I have modified the mpe library to log the events that I am
>> interested
>>>>> in
>>>>>> monitoring. But, I am bit hazy about how a function like MPI_Init is
>>>>>> actually linked to the MPI_Init routine in the file log_mpi_core.c
>>>>> when we
>>>>>> compile the MPI application with the -mpe=mpilog switch. Could
>> someone
>>>>> point
>>>>>> me to the routine that takes care of such a mapping?
>>>>>>
>>>>>> Thanks,
>>>>>> Krishna Chaitanya K
>>>>>>
>>>>>> On Sat, Mar 22, 2008 at 3:01 AM, Krishna Chaitanya <
>>>>> kris.c1986 at gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Thanks a lot. I installed the latest jdk version and I am now able
>> to
>>>>> look
>>>>>>> at the jumpshot output.
>>>>>>>
>>>>>>> Krishna Chaitanya K
>>>>>>>
>>>>>>>
>>>>>>> On Sat, Mar 22, 2008 at 1:45 AM, Anthony Chan <chan at mcs.anl.gov>
>>>>> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>> The error that you showed earlier does not suggest the problem is
>>>>> with
>>>>>>>> running jumpshot on your machine with limited memory. If your
>> clog2
>>>>>>>> file
>>>>>>>> isn't too bad, send it to me.
>>>>>>>>
>>>>>>>> On Fri, 21 Mar 2008, Krishna Chaitanya wrote:
>>>>>>>>
>>>>>>>>> I resolved that issue.
>>>>>>>>> My comp ( Intel centrino 32 bit , 256 MB RAM - Dated, I agree)
>>>>> hangs
>>>>>>>> each
>>>>>>>>> time I launch jumpshot with the slogfile. Since this is an
>>>>> independent
>>>>>>>>> project, I am constrained when it comes to the availability of
>>>>>>>> machines.
>>>>>>>>> Would you recommend that I give it a try on a 64bit AMD, 512MB
>> RAM?
>>>>> (
>>>>>>>> Will
>>>>>>>>> have to start from installing linux on this machine. Is it worth
>>>>> the
>>>>>>>> effort
>>>>>>>>> ?) If it requires higher configuration, would you please suggest a
>>>>>>>> lighter
>>>>>>>>> graphical tool that I can use to present the occurrence of events
>>>>> and
>>>>>>>> the
>>>>>>>>> corresponding times?
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Krishna Chaitanya K
>>>>>>>>>
>>>>>>>>> On Fri, Mar 21, 2008 at 8:23 PM, Anthony Chan <chan at mcs.anl.gov>
>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Fri, 21 Mar 2008, Krishna Chaitanya wrote:
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> The file block pointer to the Tree Directory is NOT
>> initialized!,
>>>>>>>> can't
>>>>>>>>>> read
>>>>>>>>>>> it.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> That means the slog2 file isn't generated completely. Something
>>>>> went
>>>>>>>>>> wrong in the conversion process (assuming your clog2 file is
>>>>>>>> complete).
>>>>>>>>>> If your MPI program doesn't finish MPI_Finalize normally, your
>>>>> clog2
>>>>>>>>>> file will be incomplete.
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Is there any environment variable that needs to be initialized?
>>>>>>>>>>
>>>>>>>>>> Nothing needs to be initialized by hand.
>>>>>>>>>>
>>>>>>>>>> A.Chan
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Krishna Chaitanya K
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Mar 20, 2008 at 4:56 PM, Dave Goodell <
>>>>> goodell at mcs.anl.gov>
>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> It's pretty hard to debug this issue via email. However, you
>>>>> could
>>>>>>>>>>>> try running valgrind on your modified MPICH2 to see if any
>>>>> obvious
>>>>>>>>>>>> bugs pop out. When you do, make sure that you configure with
>>>>> "--
>>>>>>>>>>>> enable-g=dbg,meminit" in order to avoid spurious warnings and
>> to
>>>>> be
>>>>>>>>>>>> able to see stack traces.
>>>>>>>>>>>>
>>>>>>>>>>>> -Dave
>>>>>>>>>>>>
>>>>>>>>>>>> On Mar 19, 2008, at 1:05 PM, Krishna Chaitanya wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> The problem seems to be with the communicator in MPI_Bcast()
>>>>>>>> (/src/
>>>>>>>>>>>>> mpi/coll/bcast.c).
>>>>>>>>>>>>> The comm_ptr is initialized to NULL and after a call to
>>>>>>>>>>>>> MPID_Comm_get_ptr( comm, comm_ptr ); , the comm_ptr points to
>>>>> the
>>>>>>>>>>>>> communicator object which was created through MPI_Init().
>>>>>>>>>>>>> However, MPID_Comm_valid_ptr( comm_ptr, mpi_errno ) returns
>>>>> with
>>>>>>>> a
>>>>>>>>>>>>> value other than MPI_SUCCESS.
>>>>>>>>>>>>> During some traces, it used to crash at this point itself. On
>>>>> some
>>>>>>>>>>>>> other traces, it used to go into the progress engine as I
>>>>>>>> described
>>>>>>>>>>>>> in my previous mails.
>>>>>>>>>>>>>
>>>>>>>>>>>>> What could be the reason? Hope someone chips in. I haven't been able
>>>>>>>>>>>>> to figure this out for some time now.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Krishna Chaitanya K
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Mar 19, 2008 at 8:44 AM, Krishna Chaitanya
>>>>>>>>>>>>> <kris.c1986 at gmail.com> wrote:
>>>>>>>>>>>>> This might help :
>>>>>>>>>>>>>
>>>>>>>>>>>>> In the MPID_Comm structure, I have included the following line
>>>>> for
>>>>>>>>>>>>> the peruse place-holder :
>>>>>>>>>>>>> struct mpich_peruse_handle_t** c_peruse_handles;
>>>>>>>>>>>>>
>>>>>>>>>>>>> And in the function, MPID_Init_thread(), i have the line
>>>>>>>>>>>>> MPIR_Process.comm_world->c_peruse_handles = NULL;
>>>>>>>>>>>>> when the rest of the members of the comm_world structure are
>>>>>>>> being
>>>>>>>>>>>>> populated.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Krishna Chaitanya K
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Mar 19, 2008 at 8:19 AM, Krishna Chaitanya
>>>>>>>>>>>>> <kris.c1986 at gmail.com> wrote:
>>>>>>>>>>>>> Thanks for the help. I am facing a weird problem right now. To
>>>>>>>>>>>>> incorporate the PERUSE component, I have modified the communicator
>>>>>>>>>>>>> data structure to include the PERUSE handles. The program executes
>>>>>>>>>>>>> as expected when compiled without the "-mpe=mpilog" flag. When I
>>>>>>>>>>>>> compile it with the MPE component, the program gives this output:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Fatal error in MPI_Bcast: Invalid communicator, error stack:
>>>>>>>>>>>>> MPI_Bcast(784): MPI_Bcast(buf=0x9260f98, count=1, MPI_INT,
>>>>> root=0,
>>>>>>>>>>>>> MPI_COMM_WORLD) failed
>>>>>>>>>>>>> MPI_Bcast(717): Invalid communicator
>>>>>>>>>>>>>
>>>>>>>>>>>>> On tracing further, I understood this :
>>>>>>>>>>>>> MPI_Init () ( log_mpi_core.c )
>>>>>>>>>>>>> -- > PMPI_Init ( the communicator object is created here )
>>>>>>>>>>>>> -- > MPE_Init_log ()
>>>>>>>>>>>>> -- > CLOG_Local_init()
>>>>>>>>>>>>> -- > CLOG_Buffer_init4write ()
>>>>>>>>>>>>> -- > CLOG_Preamble_env_init()
>>>>>>>>>>>>> -- > MPI_Bcast () (bcast.c)
>>>>>>>>>>>>> -- > MPIR_Bcast ()
>>>>>>>>>>>>> -- > MPIC_Recv () /
>>>>>>>>>>>>> MPIC_Send()
>>>>>>>>>>>>> -- > MPIC_Wait()
>>>>>>>>>>>>> < Program crashes >
>>>>>>>>>>>>> The MPIC_Wait function is invoking the progress engine,
>>>>> which
>>>>>>>>>>>>> works properly without the mpe component.
>>>>>>>>>>>>> Even within the progress engine, MPIDU_Sock_wait() and
>>>>>>>>>>>>> MPIDI_CH3I_Progress_handle_sock_event() are executed a couple
>>>>> of
>>>>>>>>>>>>> times before the program crashes in the
>>>>> MPIDU_Socki_handle_read()
>>>>>>>>>>>>> or the MPIDU_Socki_handle_write() functions. ( The read() and
>>>>> the
>>>>>>>>>>>>> write() functions work two times, I think)
>>>>>>>>>>>>> I am finding it very hard to reason why the program
>>>>> crashes
>>>>>>>>>>>>> with mpe. Could you please suggest where I need to look at to
>>>>> sort
>>>>>>>>>>>>> this issue out?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Krishna Chaitanya K
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Mar 19, 2008 at 2:20 AM, Anthony Chan <
>> chan at mcs.anl.gov
>>>>>>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, 19 Mar 2008, Krishna Chaitanya wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>> I tried configuring MPICH2 by doing :
>>>>>>>>>>>>>> ./configure --prefix=/home/kc/mpich-install/ --enable-mpe
>>>>>>>>>>>>>> --with-logging=SLOG CC=gcc CFLAGS=-g && make && make
>>>>> install
>>>>>>>>>>>>>> It flashed an error message saying:
>>>>>>>>>>>>>> configure: error: ./src/util/logging/SLOG does not exist.
>>>>>>>>>>>>>> Configure aborted
>>>>>>>>>>>>>
>>>>>>>>>>>>> The --with-logging is for MPICH2's internal logging, not MPE's
>>>>>>>>>>>>> logging.
>>>>>>>>>>>>> What you did below is fine.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> After that, I tried :
>>>>>>>>>>>>>> ./configure --prefix=/home/kc/mpich-install/ --enable-mpe
>>>>> CC=gcc
>>>>>>>>>>>>> CFLAGS=-g
>>>>>>>>>>>>>> && make && make install
>>>>>>>>>>>>>> The installation was normal, when I tried compiling
>> an
>>>>>>>>>>>>> example
>>>>>>>>>>>>>> program by doing :
>>>>>>>>>>>>>> mpicc -mpilog -o sample sample.c
>>>>>>>>>>>>>> cc1: error: unrecognized command line option "-mpilog"
>>>>>>>>>>>>>
>>>>>>>>>>>>> Do "mpicc -mpe=mpilog -o sample sample.c" instead. For more
>>>>>>>> details,
>>>>>>>>>>>>> see "mpicc -mpe=help" and see mpich2/src/mpe2/README.
>>>>>>>>>>>>>
>>>>>>>>>>>>> A.Chan
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Can anyone please tell me what needs to be done to
>>>>> use
>>>>>>>>>>>>> the SLOG
>>>>>>>>>>>>>> logging format?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Krishna Chaitanya K
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> --
>>>>>>>>>>>>>> In the middle of difficulty, lies opportunity