[mpich-discuss] MPE logging with OpenMPI

Brian Wainscott brian at lstc.com
Thu Apr 1 15:39:31 CDT 2010


I have a Fortran application running with OpenMPI 1.4 and am trying to use
mpe2-1.1.1 and jumpshot to do some program analysis.

My issue is with MPI_Pcontrol.  I REALLY don't want logging on the whole time --
there is just too much stuff.  I want to start with it off, run for a while, turn
it on briefly, then terminate.

The problem is, if I call MPI_Pcontrol(0,ierr) very early on to turn logging off,
then I end up with an error like this after I later call MPI_Pcontrol(1,ierr):

clog_commset.c:CLOG_CommSet_get_IDs() -
        PMPI_Comm_get_attr() fails!
Backtrace of the callstack at rank 3:
      At [0]: program(CLOG_Util_abort+0x92)[0x4006a06]
      At [1]: program(CLOG_CommSet_get_IDs+0x5f)[0x4002ad3]
      At [2]: program(MPI_Isend+0x279)[0x3fd9e0e]
      At [3]: program(mpi_isend_+0x6f)[0x3fbfbe0]

Strangely, if I wait until a much later point in the program and call
MPI_Pcontrol(0,ierr), then it does seem to turn logging off, and I don't have
problems.  But if I call it too soon, I get this error.  If I don't call it at
all, of course things work fine too.

The functions I'm calling (and the order I'm calling them in) are:

MPI_INIT
(MPI_Pcontrol -- turning it off here causes errors later)
MPE_Log_get_state_eventIDs
MPE_Describe_state
MPE_Log_get_solo_eventID
MPE_Describe_event
(MPI_Pcontrol -- turning it off here causes errors later)
<let the program run through some of initialization>
(MPI_Pcontrol -- turning it off HERE causes errors later)
<let the program finish initialization and start cycling>
MPI_Pcontrol -- turning it off here WORKS
< let the program run for a while>
MPI_Pcontrol(1,ierr) to turn on logging
..... execution, including calls to MPE_Log_event
MPI_Pcontrol(0,ierr) to turn logging off
<end program>
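
A minimal C sketch of the working call order above, for anyone who wants to reproduce it outside the Fortran application. The driver name run_cycles, the state and event names, the colors, and the USE_MPE macro are all hypothetical. Without USE_MPE the block uses stand-in stubs (not the real libraries) that assume MPI_Pcontrol behaves as a plain on/off switch, so it compiles without MPI installed.

```c
#include <stdio.h>

#ifdef USE_MPE                 /* real build, e.g. linked with -llmpe -lmpe */
#include <mpi.h>
#include <mpe.h>
#else
/* Stand-ins, NOT the real libraries: they let the sketch compile without
   MPI and assume MPI_Pcontrol acts as a plain on/off switch. */
static int logging_on = 1;     /* assume logging is on after MPI_Init */
static int events_logged = 0;  /* events recorded while logging was on */
static int next_id = 1;
static int MPI_Init(int *argc, char ***argv) { (void)argc; (void)argv; return 0; }
static int MPI_Finalize(void) { return 0; }
static int MPI_Pcontrol(int level) { logging_on = (level != 0); return 0; }
static int MPE_Log_get_state_eventIDs(int *start, int *end)
{ *start = next_id++; *end = next_id++; return 0; }
static int MPE_Log_get_solo_eventID(int *id) { *id = next_id++; return 0; }
static int MPE_Describe_state(int start, int end, const char *name, const char *color)
{ (void)start; (void)end; (void)name; (void)color; return 0; }
static int MPE_Describe_event(int id, const char *name, const char *color)
{ (void)id; (void)name; (void)color; return 0; }
static int MPE_Log_event(int id, int data, const char *bytes)
{ (void)id; (void)data; (void)bytes; if (logging_on) events_logged++; return 0; }
#endif

/* Hypothetical driver: returns the number of MPE_Log_event calls issued
   inside the logged window. Call it only once per process. */
int run_cycles(int warmup, int logged)
{
    int ev_begin, ev_end, ev_solo, i, calls = 0;

    MPI_Init(NULL, NULL);
    MPE_Log_get_state_eventIDs(&ev_begin, &ev_end);
    MPE_Describe_state(ev_begin, ev_end, "cycle", "red");
    MPE_Log_get_solo_eventID(&ev_solo);
    MPE_Describe_event(ev_solo, "window start", "yellow");

    /* ... initialization finishes and cycling starts, logging still on ... */

    MPI_Pcontrol(0);                   /* the one place where "off" worked */
    for (i = 0; i < warmup; i++) {     /* warm-up cycles with logging off */
        MPE_Log_event(ev_begin, 0, NULL);
        MPE_Log_event(ev_end, 0, NULL);
    }

    MPI_Pcontrol(1);                   /* brief logged window */
    MPE_Log_event(ev_solo, 0, NULL);
    calls++;
    for (i = 0; i < logged; i++) {
        MPE_Log_event(ev_begin, 0, NULL);
        /* ... one cycle of real work would go here ... */
        MPE_Log_event(ev_end, 0, NULL);
        calls += 2;
    }
    MPI_Pcontrol(0);                   /* logging off again before exit */

    MPI_Finalize();
    return calls;
}
```

Calling run_cycles(100, 5) from main issues 11 logging calls in the window (one solo event plus a begin/end pair per cycle). Moving the first MPI_Pcontrol(0) up to just after MPI_Init gives the ordering that triggers the error described here when built against the real MPE library; the stubs do not reproduce that failure.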

This LOOKS like a bug in MPE to me -- as if something is not being properly
initialized or processed while logging is off, yet is later assumed to have
been done.

I also tried going into the MPE source code and changing some of the initial
flags so that logging was off by default, but that caused the same error as
calling MPI_Pcontrol very early.

So, am I doing something wrong (and what)?  Or who can help fix this issue?

Thanks!


