[mpich-discuss] Trouble in getting the logging interface to work

Anthony Chan chan at mcs.anl.gov
Fri Mar 21 09:53:06 CDT 2008



On Fri, 21 Mar 2008, Krishna Chaitanya wrote:

>
> The file block pointer to the Tree Directory is NOT initialized!, can't read
> it.
>

That means the slog2 file wasn't generated completely.  Something went
wrong in the conversion process (assuming your clog2 file is
complete).  If your MPI program doesn't complete MPI_Finalize
normally, your clog2 file will be incomplete.
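
If the clog2 file itself is complete, you can redo the conversion by
hand with the converter that MPE installs, e.g.

    clog2TOslog2 sample.clog2

(the logfile name here is a placeholder; MPE names it after your
executable).  If the converter reports an error, the problem is in
the clog2 file rather than in the viewer.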

>
>         Is there any environment variable that needs to be initialised?

Nothing needs to be initialized by hand.

A.Chan
>
> Thanks,
> Krishna Chaitanya K
>
>
> On Thu, Mar 20, 2008 at 4:56 PM, Dave Goodell <goodell at mcs.anl.gov> wrote:
>
>> It's pretty hard to debug this issue via email.  However, you could
>> try running valgrind on your modified MPICH2 to see if any obvious
>> bugs pop out.  When you do, make sure that you configure with
>> "--enable-g=dbg,meminit" in order to avoid spurious warnings and to
>> be able to see stack traces.
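>>
>> For example (a sketch; the process count and executable name are
>> placeholders):
>>
>>    mpiexec -n 2 valgrind --leak-check=full ./sample
>>
>> With dbg the library is built with -g, so valgrind's stack traces
>> point at real source lines; meminit pre-initializes memory so MPICH2
>> itself doesn't trigger spurious "uninitialized value" warnings.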
>>
>> -Dave
>>
>> On Mar 19, 2008, at 1:05 PM, Krishna Chaitanya wrote:
>>
>>> The problem seems to be with the communicator in MPI_Bcast()
>>> (src/mpi/coll/bcast.c).
>>> comm_ptr is initialized to NULL, and after a call to
>>> MPID_Comm_get_ptr( comm, comm_ptr ); it points to the communicator
>>> object that was created through MPI_Init().
>>> However, MPID_Comm_valid_ptr( comm_ptr, mpi_errno ) returns with a
>>> value other than MPI_SUCCESS.
>>> In some traces it crashed right at this point; in others it went on
>>> into the progress engine, as I described in my previous mails.
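>>>
>>> The failing check is essentially this pattern (paraphrasing
>>> bcast.c; the error-handling macros are elided):
>>>
>>>    MPID_Comm *comm_ptr = NULL;
>>>    MPID_Comm_get_ptr( comm, comm_ptr );         /* handle -> object   */
>>>    MPID_Comm_valid_ptr( comm_ptr, mpi_errno );  /* sanity-check object */
>>>    if (mpi_errno) goto fn_fail;                 /* "Invalid communicator" */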
>>>
>>> What could be the reason? I hope someone chips in. I haven't been
>>> able to figure this out for some time now.
>>>
>>> Krishna Chaitanya K
>>>
>>> On Wed, Mar 19, 2008 at 8:44 AM, Krishna Chaitanya
>>> <kris.c1986 at gmail.com> wrote:
>>> This might help:
>>>
>>> In the MPID_Comm structure, I have included the following line as
>>> the PERUSE place-holder:
>>>  struct mpich_peruse_handle_t** c_peruse_handles;
>>>
>>> And in MPID_Init_thread(), I have added the line
>>>  MPIR_Process.comm_world->c_peruse_handles = NULL;
>>> at the point where the rest of the comm_world structure is being
>>> populated.
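>>>
>>> (So far I initialize the new field only for comm_world. If other
>>> communicator objects, e.g. MPIR_Process.comm_self, also need the
>>> line
>>>
>>>    MPIR_Process.comm_self->c_peruse_handles = NULL;
>>>
>>> then any MPID_Comm that misses it would carry a garbage pointer in
>>> c_peruse_handles.)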
>>>
>>> Thanks,
>>> Krishna Chaitanya K
>>>
>>>
>>>
>>> On Wed, Mar 19, 2008 at 8:19 AM, Krishna Chaitanya
>>> <kris.c1986 at gmail.com> wrote:
>>> Thanks for the help. I am facing a weird problem right now. To
>>> incorporate the PERUSE component, I have modified the communicator
>>> data structure to include the PERUSE handles. The program executes
>>> as expected when compiled without the "-mpe=mpilog" flag. When I
>>> compile it with the MPE component, the program gives this output:
>>>
>>> Fatal error in MPI_Bcast: Invalid communicator, error stack:
>>> MPI_Bcast(784): MPI_Bcast(buf=0x9260f98, count=1, MPI_INT, root=0,
>>> MPI_COMM_WORLD) failed
>>> MPI_Bcast(717): Invalid communicator
>>>
>>> On tracing further, I understood this:
>>> MPI_Init()  (log_mpi_core.c)
>>>   --> PMPI_Init()  (the communicator object is created here)
>>>   --> MPE_Init_log()
>>>        --> CLOG_Local_init()
>>>             --> CLOG_Buffer_init4write()
>>>                  --> CLOG_Preamble_env_init()
>>>                       --> MPI_Bcast()  (bcast.c)
>>>                            --> MPIR_Bcast()
>>>                                 --> MPIC_Recv() / MPIC_Send()
>>>                                 --> MPIC_Wait()
>>>                                      < program crashes >
>>>      The MPIC_Wait() function invokes the progress engine, which
>>> works properly without the MPE component.
>>>      Even within the progress engine, MPIDU_Sock_wait() and
>>> MPIDI_CH3I_Progress_handle_sock_event() execute a couple of times
>>> before the program crashes in MPIDU_Socki_handle_read() or
>>> MPIDU_Socki_handle_write(). (The read() and write() calls succeed
>>> two times, I think.)
>>>      I am finding it very hard to reason about why the program
>>> crashes with MPE. Could you please suggest where I should look to
>>> sort this issue out?
>>>
>>> Thanks,
>>> Krishna Chaitanya K
>>>
>>> On Wed, Mar 19, 2008 at 2:20 AM, Anthony Chan <chan at mcs.anl.gov>
>>> wrote:
>>>
>>>
>>> On Wed, 19 Mar 2008, Krishna Chaitanya wrote:
>>>
>>>> Hi,
>>>>          I tried configuring MPICH2 by doing:
>>>> ./configure --prefix=/home/kc/mpich-install/ --enable-mpe
>>>> --with-logging=SLOG  CC=gcc CFLAGS=-g   && make && make install
>>>>          It flashed an error message saying:
>>>> configure: error: ./src/util/logging/SLOG does not exist.
>>>> Configure aborted
>>>
>>> The --with-logging option selects MPICH2's internal logging, not
>>> MPE's logging (the valid values correspond to the subdirectories of
>>> src/util/logging).  What you did below is fine.
>>>>
>>>>          After that, I tried :
>>>> ./configure --prefix=/home/kc/mpich-install/ --enable-mpe CC=gcc
>>>> CFLAGS=-g && make && make install
>>>>         The installation was normal, but when I tried compiling
>>>> an example program by doing:
>>>> mpicc -mpilog -o sample  sample.c
>>>> cc1: error: unrecognized command line option "-mpilog"
>>>
>>> Do "mpicc -mpe=mpilog -o sample sample.c" instead.  For more details,
>>> see "mpicc -mpe=help" and see mpich2/src/mpe2/README.
>>>
>>> A.Chan
>>>
>>>>
>>>>          Can anyone please tell me what needs to be done to use
>>>> the SLOG logging format?
>>>>
>>>> Thanks,
>>>> Krishna Chaitanya K
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> In the middle of difficulty, lies opportunity
>>>>
>>>
>>
>>
>
>
> -- 
> In the middle of difficulty, lies opportunity
>



