[mpich-discuss] Problem with MPE and MPICH2
Manuel Holtgrewe
holtgrewe at ira.uka.de
Sun Mar 1 10:32:00 CST 2009
Okay, my problem seems to be that I am linking the program with
-mpe=mpilog. Maybe this causes MPI_Finalize() to finalize the logs a
second time.
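
To illustrate the suspicion (not the actual MPE internals, which I have not checked), here is a small self-contained sketch. ToyLog and run_suspected_sequence() are hypothetical names made up for this example; they are not part of MPE or MPICH2. It models a log that is finished once explicitly and once more by a later finalize step, and shows how a simple "already finished" guard would turn the second call into a no-op:

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical stand-in for a logging layer. This is NOT the MPE API.
class ToyLog {
public:
    // Returns true if this call actually wrote the log,
    // false if the log had already been finished.
    bool finish(const std::string &prefix) {
        if (finished_) {
            return false;  // second call is a harmless no-op
        }
        std::printf("writing log with prefix '%s'\n", prefix.c_str());
        finished_ = true;
        ++writes_;
        return true;
    }

    // Number of times the log was actually written.
    int writes() const { return writes_; }

private:
    bool finished_ = false;
    int writes_ = 0;
};

// Mimics the suspected call sequence: the program finishes the log
// explicitly, then a finalize step tries to finish it again.
int run_suspected_sequence() {
    ToyLog log;
    bool first = log.finish("foo");       // explicit call from the program
    bool second = log.finish("default");  // assumed implicit call at finalize
    assert(first && !second);             // only the first call does any work
    return log.writes();
}
```

Without a guard like finished_, the second call would operate on state that has already been torn down, which is the kind of thing that could plausibly end in a signal.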
The workaround is to drop the MPE_Init_log() and MPE_Finish_log()
calls. The log file name can then be selected with an environment
variable like this:
$ MPE_LOGFILE_PREFIX=myfile mpiexec -n 4 ./mpe
I know that this is the MPICH list and not the MPE list. However, if I
am right and this is caused by finalizing the logging twice, then I
would say it is unexpected behaviour for a program to work when it is
not linked with -mpe=mpilog and to fail when it is. Some might
consider this a bug.
Bests,
-- Manuel
2009/3/1 Manuel Holtgrewe <holtgrewe at ira.uka.de>:
> Hi,
>
> I have a problem using MPE with MPICH2. The following C++ program:
>
> --8<--------
> #include <cstdio>
>
> #include <mpi.h>
> #include <mpe.h>
>
> int main(int argc, char **argv)
> {
>     MPI_Init(&argc, &argv);
>     int rank;
>     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>     printf("Rank: %d\n", rank);
>
>     int res = MPE_Init_log();
>     printf("R%02d, MPE_Init_log() == %d\n", rank, res);
>
>     res = MPE_Finish_log("foo");
>     printf("R%02d, MPE_Finish_log(\"foo\") == %d\n", rank, res);
>     MPI_Finalize();
>     return 0;
> }
> --8<--------
>
> Crashes as follows:
>
> $ mpiexec -n 1 ./mpe
> Rank: 0
> R00, MPE_Init_log() == 0
> Enabling the Default clock synchronization...
> R00, MPE_Finish_log("foo") == 0
> rank 0 in job 51 **HOST**_53437 caused collective abort of all ranks
> exit status of rank 0: killed by signal 10
>
> $ mpiexec -n 2 ./mpe
> Rank: 0
> R00, MPE_Init_log() == 0
> Rank: 1
> R01, MPE_Init_log() == 0
> Enabling the Default clock synchronization...
> R00, MPE_Finish_log("foo") == 0
> R01, MPE_Finish_log("foo") == 0
> rank 0 in job 52 **HOST**_53437 caused collective abort of all ranks
> exit status of rank 0: killed by signal 10
>
> If I remove the "MPE_Finish_log()" line, the program does not crash.
>
> The problem occurs on Mac OS X and Linux using MPICH2 1.0.8 with g++ 4.3.
>
> Bests,
> -- Manuel
>