[mpich-discuss] MPE logging with OpenMPI

Anthony Chan chan at mcs.anl.gov
Thu Apr 1 16:21:27 CDT 2010


Brian,

Do you have a small/simple program that shows the problem?
My guess is that this MPI_Pcontrol bug in MPE may have something
to do with MPE's internal buffer.  BTW, did you try your program
with MPICH2's MPE, i.e. does the same problem occur with
MPICH2's MPE?

A.Chan

----- "Brian Wainscott" <brian at lstc.com> wrote:

> I have a Fortran application running with OpenMPI 1.4 and am trying to use
> mpe2-1.1.1 and jumpshot to do some program analysis.
> 
> My issue is with MPI_Pcontrol.  I REALLY don't want logging on the whole
> time -- there is just too much stuff.  I want to start with it off, run for
> a while, turn it on briefly, then terminate.
> 
> The problem is, if I call MPI_Pcontrol very early on, then I end up with an
> error like this after I call MPI_Pcontrol(1,ierr):
> 
> clog_commset.c:CLOG_CommSet_get_IDs() -
>         PMPI_Comm_get_attr() fails!
> Backtrace of the callstack at rank 3:
>       At [0]: program(CLOG_Util_abort+0x92)[0x4006a06]
>       At [1]: program(CLOG_CommSet_get_IDs+0x5f)[0x4002ad3]
>       At [2]: program(MPI_Isend+0x279)[0x3fd9e0e]
>       At [3]: program(mpi_isend_+0x6f)[0x3fbfbe0]
> 
> Strangely, if I wait until a much later point in the program and call
> MPI_Pcontrol(0,ierr), then it does seem to turn logging off, and I don't
> have problems.  But if I call it too soon, I get this error.  If I don't
> call it at all, of course things work fine too.
> 
> The functions I'm calling (and the order I'm calling them in) are:
> 
> MPI_INIT
> (MPI_Pcontrol -- turning it off here causes errors later)
> MPE_Log_get_state_eventIDs
> MPE_Describe_state
> MPE_Log_get_solo_eventID
> MPE_Describe_event
> (MPI_Pcontrol -- turning it off here causes errors later)
> < let the program run through some of the initialization>
> (MPI_Pcontrol -- turning it off HERE causes errors later)
> < let the program finish initialization and start cycling>
> MPI_Pcontrol -- turning it off here WORKS
> < let the program run for a while>
> MPI_Pcontrol(1,ierr) to turn on logging
> ..... execution, including calls to MPE_Log_event
> MPI_Pcontrol(0,ierr) to turn logging off
> <end program>
> 
> This LOOKS like a bug in MPE to me -- as if something is not being properly
> initialized or processed while logging is off, but is later assumed to have
> been done.
> 
> I also tried going into the MPE source code and changing some of the initial
> flags so that logging was off by default, but that caused the same error as
> calling MPI_Pcontrol very early.
> 
> So, am I doing something wrong (and what)?  Or who can help fix this issue?
> 
> Thanks!
> 
> _______________________________________________
> mpich-discuss mailing list
> mpich-discuss at mcs.anl.gov
> https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss

