[mpich-discuss] MPE logging with OpenMPI
Anthony Chan
chan at mcs.anl.gov
Thu Apr 1 16:21:27 CDT 2010
Brian,
Do you have a small/simple program that shows the problem?
My guess is that this MPI_Pcontrol problem in MPE may have something
to do with MPE's internal buffer. BTW, did you try your program
with MPICH2's MPE, i.e. does the same problem occur there?
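
In case it helps when putting a test case together, the pattern described
in the quoted message (logging off during start-up, on for a short window,
then off again) could be sketched in C roughly as follows. This is only a
sketch under assumptions: log_level_for_step, set_logging, run_cycles and
the USE_MPI macro are made-up names for illustration, not part of MPE or
MPI, and logging is assumed to start enabled. Without -DUSE_MPI the code
only tracks the level, so it compiles without an MPI installation.

```c
#include <assert.h>
#ifdef USE_MPI
#include <mpi.h>   /* assumption: built with an MPI compiler wrapper and linked against MPE */
#endif

/* Hypothetical helper: returns 1 while step is inside [start, stop), else 0. */
static int log_level_for_step(int step, int start, int stop)
{
    return (step >= start && step < stop) ? 1 : 0;
}

static int current_level = 1;  /* assumption: logging is on until told otherwise */
static int toggle_count = 0;   /* how many times the level actually changed */

/* Hypothetical wrapper: forwards to MPI_Pcontrol only when the level changes. */
static void set_logging(int level)
{
    if (level == current_level)
        return;
    current_level = level;
    toggle_count++;
#ifdef USE_MPI
    MPI_Pcontrol(level);
#endif
}

/* Runs nsteps cycles with logging enabled only for steps in [start, stop).
   Returns the number of level changes that were requested. */
static int run_cycles(int nsteps, int start, int stop)
{
    assert(start <= stop);
    current_level = 1;
    toggle_count = 0;
    for (int step = 0; step < nsteps; step++) {
        set_logging(log_level_for_step(step, start, stop));
        /* ... one cycle of the application, including any MPE_Log_event calls ... */
    }
    set_logging(0);  /* make sure logging is off before shutdown */
    return toggle_count;
}
```

Called from a main() between MPI_Init and MPI_Finalize, something like
run_cycles(10, 6, 8) would switch logging off at the first cycle, on for
cycles 6 and 7, and off again afterwards. Note that this turns logging off
at the very first cycle, which is one of the early points where the quoted
message reports trouble, so it may double as a starting point for a
reproducer.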
A.Chan
----- "Brian Wainscott" <brian at lstc.com> wrote:
> I have a Fortran application running with OpenMPI 1.4 and am trying to use
> mpe2-1.1.1 and Jumpshot to do some program analysis.
>
> My issue is with MPI_Pcontrol. I REALLY don't want logging on the whole
> time; there is just too much stuff. I want to start with it off, run for
> a while, turn it on briefly, then terminate.
>
> The problem is, if I call MPI_Pcontrol very early on, then I end up with
> an error like this after I call MPI_Pcontrol(1,ierr):
>
> clog_commset.c:CLOG_CommSet_get_IDs() -
> PMPI_Comm_get_attr() fails!
> Backtrace of the callstack at rank 3:
>   At [0]: program(CLOG_Util_abort+0x92)[0x4006a06]
>   At [1]: program(CLOG_CommSet_get_IDs+0x5f)[0x4002ad3]
>   At [2]: program(MPI_Isend+0x279)[0x3fd9e0e]
>   At [3]: program(mpi_isend_+0x6f)[0x3fbfbe0]
>
> Strangely, if I wait until a much later point in the program and call
> MPI_Pcontrol(0,ierr), then it does seem to turn logging off, and I don't
> have problems. But if I call it too soon, I get this error. If I don't
> call it at all, of course things work fine too.
>
> The functions I'm calling (and the order I'm calling them in) are:
>
> MPI_INIT
> (MPI_Pcontrol -- turning it off here causes errors later)
> MPE_Log_get_state_eventIDs
> MPE_Describe_state
> MPE_Log_get_solo_eventID
> MPE_Describe_event
> (MPI_Pcontrol -- turning it off here causes errors later)
> < let the program run through some of initialization>
> (MPI_Pcontrol -- turning it off HERE causes errors later)
> < let the program finish initialization and start cycling>
> MPI_Pcontrol -- turning it off here WORKS
> < let the program run for a while>
> MPI_Pcontrol(1,ierr) to turn on logging
> ..... execution, including calls to MPE_Log_event
> MPI_Pcontrol(0,ierr) to turn logging off
> <end program>
>
> This LOOKS like a bug in MPE to me, as if something is not being properly
> initialized or processed while logging is off, but is later assumed to
> have been done.
>
> I also tried going into the MPE source code and changing some of the
> initial flags so that logging was off by default, but that caused the
> same error as calling MPI_Pcontrol very early.
>
> So, am I doing something wrong (and if so, what)? Or who can help fix
> this issue?
>
> Thanks!