[mpich-discuss] Trouble in getting the logging interface to work - multichannel/mpi.c is only for windows

Jayesh Krishna jayesh at mcs.anl.gov
Mon Mar 24 09:21:58 CDT 2008


Hi,
 The source file "src/util/multichannel/mpi.c" is used only for MPICH2 on
Windows. That code is responsible for loading the right wrapper (MPE) and
MPICH2 DLLs, depending on the channel (ssm, sshm, sock, mt, etc.) specified
when launching your job. It is not compiled when you build MPICH2 on Unix
systems.

(PS: The Unix code currently loads channels dynamically in a better way. You
should look into "/src/mpid/ch3/channels/dllchan" for more info.)
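
As a rough illustration of this kind of dispatch (a sketch only, not the code
in mpi.c or dllchan): callers go through a function pointer that is filled in
once a channel has been chosen. A static table stands in for the per-channel
DLLs here, and every name below (channel_api, load_channel, call_channel_id,
sock_id, ssm_id) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical description of one channel. A real loader would obtain the
 * function address from a DLL or shared object instead of a static table. */
typedef struct {
    const char *name;          /* channel name given at launch time */
    int (*channel_id)(void);   /* stand-in for a real entry point */
} channel_api;

static int sock_id(void) { return 1; }
static int ssm_id(void)  { return 2; }

static const channel_api known_channels[] = {
    { "sock", sock_id },
    { "ssm",  ssm_id  },
};

/* The pointer that callers go through; NULL until a channel is loaded. */
static int (*active_channel_id)(void) = NULL;

/* Point the dispatch pointer at the named channel.
 * Returns 0 on success, -1 if the name is unknown (pointer is unchanged). */
int load_channel(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < sizeof known_channels / sizeof known_channels[0]; i++) {
        if (strcmp(known_channels[i].name, name) == 0) {
            active_channel_id = known_channels[i].channel_id;
            return 0;
        }
    }
    return -1;
}

/* Call through the dispatch pointer; a channel must have been loaded. */
int call_channel_id(void)
{
    assert(active_channel_id != NULL);
    return active_channel_id();
}
```

The caller never names a specific channel implementation; only load_channel()
decides which function the pointer refers to.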
 
Regards,
Jayesh

  _____  

From: owner-mpich-discuss at mcs.anl.gov
[mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Krishna Chaitanya
Sent: Monday, March 24, 2008 9:13 AM
To: mpich-discuss at mcs.anl.gov
Subject: Re: [mpich-discuss] Trouble in getting the logging interface to
work


   Sorry for re-posting.
> I took a look at the documentation at src/util/multichannel/mpi.c
I guess this is only for Windows.
It would be of great help if someone could point me to the function that
maps MPI_Init to its wrapper, defined in
src/mpe2/src/wrappers/src/log_mpi_core.c, instead of the function defined in
src/mpi/init/init.c, when the library is compiled with the --enable-mpe
switch.
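
For context, on Unix there is normally no mapping function to look for. The
MPI profiling interface makes every routine available under a second name
(PMPI_Init for MPI_Init), and a wrapper library that defines MPI_Init itself
is placed ahead of the MPI library at link time, so the application's call
resolves to the wrapper. The sketch below imitates that pattern without
needing an MPI installation; demo_init and pdemo_init are hypothetical
stand-ins for MPI_Init and PMPI_Init, and the counter stands in for real
logging.

```c
#include <assert.h>
#include <stddef.h>

int library_initialized = 0;   /* state owned by the "library" */
int logged_calls = 0;          /* stand-in for records written by the wrapper */

/* Stand-in for the library's real routine (the analogue of PMPI_Init). */
int pdemo_init(int *argc, char ***argv)
{
    (void)argc;
    (void)argv;
    assert(!library_initialized);   /* initialization may happen only once */
    library_initialized = 1;
    return 0;                       /* analogue of MPI_SUCCESS */
}

/* Stand-in for the logging wrapper (the analogue of the MPI_Init defined in
 * log_mpi_core.c): call the real routine through its P-name, then log. */
int demo_init(int *argc, char ***argv)
{
    int rc = pdemo_init(argc, argv);
    if (rc == 0)
        logged_calls++;   /* a real wrapper would set up its log here */
    return rc;
}
```

The application only ever calls the plain name; whether it reaches the
wrapper or the library routine directly is decided by what is linked in.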

Krishna Chaitanya K 


On Sun, Mar 23, 2008 at 2:29 PM, Krishna Chaitanya <kris.c1986 at gmail.com>
wrote:


> See section "CUSTOMIZING LOGFILES" in mpich2-xxx/src/mpe2/README.

Correct me if I am wrong:
Since I am dealing with PERUSE events, whenever such an event occurs, a
PERUSE function, defined in <mpich-dir>/src/peruse/peruse.c, is invoked by
the MPI library. I am trying to get this event displayed in the jumpshot
output. To do that, I need to define a wrapper function that is invoked
when a PERUSE event occurs, logs the event, and then calls the actual
PERUSE function. This is similar to the way the wrapper function in
log_mpi_core.c is called when MPI_Init is called.
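
The log-then-forward idea described above can be sketched in plain C as
follows. This is only an illustration under stated assumptions: the callback
type and all names (event_cb, set_real_callback, logging_callback,
sample_callback) are hypothetical and are not the PERUSE or MPE API, and the
counter stands in for an actual logging call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type: an event id plus an opaque parameter. */
typedef int (*event_cb)(int event, void *param);

static event_cb real_callback = NULL;   /* the callback the wrapper forwards to */
int events_logged = 0;                  /* stand-in for writing a log record */
int last_event = -1;                    /* set by the sample callback below */

/* Remember the real callback so the wrapper can forward to it. */
void set_real_callback(event_cb cb)
{
    real_callback = cb;
}

/* Wrapper: record the event first, then hand it to the real callback. */
int logging_callback(int event, void *param)
{
    assert(real_callback != NULL);
    events_logged++;   /* a real wrapper would emit a log record here */
    return real_callback(event, param);
}

/* Example "real" callback: just remembers the event id it was given. */
int sample_callback(int event, void *param)
{
    (void)param;
    last_event = event;
    return 0;
}
```

Registering logging_callback in place of the real callback means every event
is recorded once before the original handler sees it.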

Could you please clarify the dynamic mapping?
I took a look at the documentation at src/util/multichannel/mpi.c. I think
I understood what is going on in LoadFunctions() and the way the function
pointers are assigned addresses depending on the DLL that is being used.

Krishna Chaitanya K 


On Sun, Mar 23, 2008 at 12:59 PM, Anthony Chan <chan at mcs.anl.gov> wrote:



See section "CUSTOMIZING LOGFILES" in mpich2-xxx/src/mpe2/README.
You don't need to modify MPE libraries.

A.Chan


On Sun, 23 Mar 2008, Krishna Chaitanya wrote:

> I have modified the MPE library to log the events that I am interested in
> monitoring. But I am a bit hazy about how a function like MPI_Init is
> actually linked to the MPI_Init routine in the file log_mpi_core.c when we
> compile the MPI application with the -mpe=mpilog switch. Could someone
> point me to the routine that takes care of such a mapping?
>
> Thanks,
> Krishna Chaitanya K
>
> On Sat, Mar 22, 2008 at 3:01 AM, Krishna Chaitanya <kris.c1986 at gmail.com>
> wrote:
>
>> Thanks a lot. I installed the latest JDK version and I am now able to
>> look at the jumpshot output.
>>
>> Krishna Chaitanya K
>>
>>
>> On Sat, Mar 22, 2008 at 1:45 AM, Anthony Chan <chan at mcs.anl.gov> wrote:
>>
>>>
>>> The error that you showed earlier does not suggest the problem is with
>>> running jumpshot on your machine with limited memory.  If your clog2
>>> file
>>> isn't too bad, send it to me.
>>>
>>> On Fri, 21 Mar 2008, Krishna Chaitanya wrote:
>>>
>>>> I resolved that issue.
>>>> My computer (Intel Centrino, 32-bit, 256 MB RAM; dated, I agree) hangs
>>>> each time I launch jumpshot with the slogfile. Since this is an
>>>> independent project, I am constrained when it comes to the availability
>>>> of machines. Would you recommend that I give it a try on a 64-bit AMD
>>>> with 512 MB RAM? (I will have to start by installing Linux on this
>>>> machine. Is it worth the effort?) If it requires a higher configuration,
>>>> would you please suggest a lighter graphical tool that I can use to
>>>> present the occurrence of events and the corresponding times?
>>>>
>>>> Thanks,
>>>> Krishna Chaitanya K
>>>>
>>>> On Fri, Mar 21, 2008 at 8:23 PM, Anthony Chan <chan at mcs.anl.gov> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Fri, 21 Mar 2008, Krishna Chaitanya wrote:
>>>>>
>>>>>>
>>>>>> The file block pointer to the Tree Directory is NOT initialized!, can't read it.
>>>>>>
>>>>>
>>>>> That means the slog2 file wasn't generated completely.  Something went
>>>>> wrong in the conversion process (assuming your clog2 file is complete).
>>>>> If your MPI program doesn't finish MPI_Finalize normally, your clog2
>>>>> file will be incomplete.
>>>>>
>>>>>>
>>>>>>         Is there any environment variable that needs to be initialized?
>>>>>
>>>>> Nothing needs to be initialized by hand.
>>>>>
>>>>> A.Chan
>>>>>>
>>>>>> Thanks,
>>>>>> Krishna Chaitanya K
>>>>>>
>>>>>>
>>>>>> On Thu, Mar 20, 2008 at 4:56 PM, Dave Goodell <goodell at mcs.anl.gov> wrote:
>>>>>>
>>>>>>> It's pretty hard to debug this issue via email.  However, you could
>>>>>>> try running valgrind on your modified MPICH2 to see if any obvious
>>>>>>> bugs pop out.  When you do, make sure that you configure with
>>>>>>> "--enable-g=dbg,meminit" in order to avoid spurious warnings and to be
>>>>>>> able to see stack traces.
>>>>>>>
>>>>>>> -Dave
>>>>>>>
>>>>>>> On Mar 19, 2008, at 1:05 PM, Krishna Chaitanya wrote:
>>>>>>>
>>>>>>>> The problem seems to be with the communicator in MPI_Bcast()
>>>>>>>> (/src/mpi/coll/bcast.c).
>>>>>>>> The comm_ptr is initialized to NULL, and after a call to
>>>>>>>> MPID_Comm_get_ptr( comm, comm_ptr ); the comm_ptr points to the
>>>>>>>> communicator object which was created through MPI_Init().
>>>>>>>> However, MPID_Comm_valid_ptr( comm_ptr, mpi_errno ) returns with a
>>>>>>>> value other than MPI_SUCCESS.
>>>>>>>> During some traces, it crashed at this point itself. On some
>>>>>>>> other traces, it went into the progress engine as I described
>>>>>>>> in my previous mails.
>>>>>>>>
>>>>>>>> What could be the reason? I hope someone chips in. I haven't been
>>>>>>>> able to figure this out for some time now.
>>>>>>>>
>>>>>>>> Krishna Chaitanya K
>>>>>>>>
>>>>>>>> On Wed, Mar 19, 2008 at 8:44 AM, Krishna Chaitanya
>>>>>>>> <kris.c1986 at gmail.com> wrote:
>>>>>>>> This might help :
>>>>>>>>
>>>>>>>> In the MPID_Comm structure, I have included the following line for
>>>>>>>> the PERUSE placeholder:
>>>>>>>>  struct mpich_peruse_handle_t** c_peruse_handles;
>>>>>>>>
>>>>>>>> And in the function MPID_Init_thread(), I have the line
>>>>>>>>  MPIR_Process.comm_world->c_peruse_handles = NULL;
>>>>>>>> when the rest of the members of the comm_world structure are being
>>>>>>>> populated.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Krishna Chaitanya K
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Mar 19, 2008 at 8:19 AM, Krishna Chaitanya
>>>>>>>> <kris.c1986 at gmail.com> wrote:
>>>>>>>> Thanks for the help. I am facing a weird problem right now. To
>>>>>>>> incorporate the PERUSE component, I have modified the communicator
>>>>>>>> data structure to include the PERUSE handles. The program executes
>>>>>>>> as expected when compiled without the "-mpe=mpilog" flag. When I
>>>>>>>> compile it with the MPE component, the program gives this output:
>>>>>>>>
>>>>>>>> Fatal error in MPI_Bcast: Invalid communicator, error stack:
>>>>>>>> MPI_Bcast(784): MPI_Bcast(buf=0x9260f98, count=1, MPI_INT, root=0,
>>>>>>>> MPI_COMM_WORLD) failed
>>>>>>>> MPI_Bcast(717): Invalid communicator
>>>>>>>>
>>>>>>>> On tracing further, I understood this :
>>>>>>>> MPI_Init() (log_mpi_core.c)
>>>>>>>>  --> PMPI_Init() (the communicator object is created here)
>>>>>>>>  --> MPE_Init_log()
>>>>>>>>       --> CLOG_Local_init()
>>>>>>>>            --> CLOG_Buffer_init4write()
>>>>>>>>                 --> CLOG_Preamble_env_init()
>>>>>>>>                      --> MPI_Bcast() (bcast.c)
>>>>>>>>                           --> MPIR_Bcast()
>>>>>>>>                                --> MPIC_Recv() / MPIC_Send()
>>>>>>>>                                --> MPIC_Wait()
>>>>>>>>                                     < Program crashes >
>>>>>>>> The MPIC_Wait function invokes the progress engine, which
>>>>>>>> works properly without the MPE component.
>>>>>>>> Even within the progress engine, MPIDU_Sock_wait() and
>>>>>>>> MPIDI_CH3I_Progress_handle_sock_event() are executed a couple of
>>>>>>>> times before the program crashes in the MPIDU_Socki_handle_read()
>>>>>>>> or the MPIDU_Socki_handle_write() functions. (The read() and
>>>>>>>> write() functions work two times, I think.)
>>>>>>>> I am finding it very hard to work out why the program crashes
>>>>>>>> with MPE. Could you please suggest where I need to look to sort
>>>>>>>> this issue out?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Krishna Chaitanya K
>>>>>>>>
>>>>>>>> On Wed, Mar 19, 2008 at 2:20 AM, Anthony Chan <chan at mcs.anl.gov>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, 19 Mar 2008, Krishna Chaitanya wrote:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>          I tried configuring MPICH2 by doing:
>>>>>>>>> ./configure --prefix=/home/kc/mpich-install/ --enable-mpe
>>>>>>>>> --with-logging=SLOG  CC=gcc CFLAGS=-g   && make && make install
>>>>>>>>>          It flashed an error message saying:
>>>>>>>>> configure: error: ./src/util/logging/SLOG does not exist. Configure aborted
>>>>>>>>
>>>>>>>> The --with-logging option is for MPICH2's internal logging, not MPE's
>>>>>>>> logging. What you did below is fine.
>>>>>>>>>
>>>>>>>>>          After that, I tried:
>>>>>>>>> ./configure --prefix=/home/kc/mpich-install/ --enable-mpe CC=gcc CFLAGS=-g
>>>>>>>>> && make && make install
>>>>>>>>>         The installation was normal, but when I tried compiling an
>>>>>>>>> example program by doing:
>>>>>>>>> mpicc -mpilog -o sample  sample.c
>>>>>>>>> the compiler reported:
>>>>>>>>> cc1: error: unrecognized command line option "-mpilog"
>>>>>>>>
>>>>>>>> Do "mpicc -mpe=mpilog -o sample sample.c" instead.  For more details,
>>>>>>>> see "mpicc -mpe=help" and mpich2/src/mpe2/README.
>>>>>>>>
>>>>>>>> A.Chan
>>>>>>>>
>>>>>>>>>
>>>>>>>>>          Can anyone please tell me what needs to be done to use
>>>>>>>>> the SLOG logging format?
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Krishna Chaitanya K
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> In the middle of difficulty, lies opportunity