[mpich-discuss] Clog2 conversion to Slog2 fails

Duro, Joao j.a.duro at cranfield.ac.uk
Wed Apr 21 11:50:47 CDT 2010


So I am now using the mpe2 wrapper for compilation:
./mpe2/mpe2-1.1.1/bin/mpecc -mpilog -o matrix_matrix_log matrix_matrix_log.c

I also changed the code so that MPE generates the eventIDs for me:
http://pastebin.com/ZWsq7gFr

I ran it using the following command:
bsub -n5 -I mpirun -srun matrix_matrix_log

I got a segmentation fault :(

<<Starting on lsfhost.localdomain>>
Enabling the Default clock synchronization...
srun: error: comp48: task[0,2]: Segmentation fault (core dumped)
srun: Terminating job
srun: error: comp48: task0: Exited with exit code 1

GDB --> http://pastebin.com/Zqru7gUt

So I decided to try cpilog.c; here is the code:
http://pastebin.com/FtTXtTfN

I compiled it:
./mpe2/mpe2-1.1.1/bin/mpecc -mpilog -o cpilog cpilog.c
and ran it:
bsub -n5 -I mpirun -srun cpilog

Surprisingly, I got the same segmentation fault:

<<Starting on lsfhost.localdomain>>
Process 4 running on comp48
Process 0 running on comp20
Process 3 running on comp48
Process 1 running on comp20
Process 2 running on comp20
Enabling the Default clock synchronization...
pi is approximately 3.1415926535899197, Error is 0.0000000000001266
wall clock time = 0.036734
srun: error: comp20: task2: Segmentation fault (core dumped)
srun: Terminating job
srun: error: comp20: task0: Exited with exit code 1

GDB --> http://pastebin.com/kFaDji37

The segmentation fault is related to CLOG_Buffer_save_bareevt.
Any help would be welcome!

Thanks,
Joao

________________________________________
From: chan at mcs.anl.gov [chan at mcs.anl.gov]
Sent: Wednesday, 21 April 2010 16:17
To: Duro, Joao
Cc: mpich-discuss at mcs.anl.gov
Subject: Re: [mpich-discuss] Clog2 conversion to Slog2 fails

----- "Joao Duro" <j.a.duro at cranfield.ac.uk> wrote:


> Right now it is:
> mpicc -I/lustre/scratch/c119470/mpe2/mpe2-1.1.1/include/
> matrix_matrix_log.c  -o matrix_matrix_log -lpthread
> -L/lustre/scratch/c119470/mpe2/mpe2-1.1.1/lib/ -lmpe

Since you only link with -lmpe, your link command is essentially "mpecc -log".
To enable MPI logging, do

mpecc -mpilog -o matrix_matrix_log matrix_matrix_log.c

> 2) http://pastebin.com/8VNGgTTZ

Your C program hardwires the MPE eventID in MPE_Describe_event().
That is the reason for the corrupted clog2 file.  In the README file
in mpe2, under section VI) CUSTOMIZING LOGFILES, you should
see a description of the correct way to use the MPE logging functions.
BTW, you also need to understand when MPE_Init_log()/MPE_Finish_log()
should be used in your program (that depends on whether -mpilog or -log
is used with mpecc).

Basically, you need to use MPE_Log_get_state_eventIDs() to get valid
MPE eventIDs for MPE_Describe_event() and MPE_Log_event()...  For a
complete C example, you can look at
<mpe2_install_dir>/share/examples_logging/cpilog.c
for details.
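
For readers of the archive, the pattern described above (ask MPE for the IDs instead of hardwiring them) might look roughly like the sketch below. It is only an illustration, not the code from cpilog.c or from the pastebin links. The names logged_state, make_state and logged_sum are hypothetical helpers, and HAVE_MPE is an assumed macro that you would define yourself when building against a real MPE install. Without it, the block falls back to small stand-in functions that are not part of MPE; they only mimic the call shapes so the sketch compiles on its own, and the IDs they hand out are arbitrary.

```c
#include <assert.h>
#include <stddef.h>

#ifdef HAVE_MPE   /* assumed macro; e.g. mpecc -mpilog -DHAVE_MPE ... */
#include "mpi.h"
#include "mpe.h"
#else
/* Stand-ins, NOT part of MPE: they mimic the call shapes only,
   and the ID values they return are arbitrary. */
static int fake_next_id = 1;
static int MPE_Log_get_state_eventIDs(int *start, int *end)
{
    *start = fake_next_id++;
    *end   = fake_next_id++;
    return 0;
}
static int MPE_Describe_state(int start, int end,
                              const char *name, const char *color)
{
    (void)start; (void)end; (void)name; (void)color;
    return 0;
}
static int MPE_Log_event(int event, int data, const char *bytebuf)
{
    (void)event; (void)data; (void)bytebuf;
    return 0;
}
#endif

/* Hypothetical helper type: one logged state is a start/end ID pair. */
typedef struct {
    int start_id;
    int end_id;
} logged_state;

/* Ask MPE for a fresh pair of event IDs (never pick the numbers
   yourself), then let rank 0 attach a name and color to the state. */
static logged_state make_state(int rank, const char *name, const char *color)
{
    logged_state s;
    MPE_Log_get_state_eventIDs(&s.start_id, &s.end_id);
    assert(s.start_id != s.end_id);
    if (rank == 0)
        MPE_Describe_state(s.start_id, s.end_id, name, color);
    return s;
}

/* Bracket a unit of work with the state's start and end events. */
static double logged_sum(logged_state s, const double *v, int n)
{
    double total = 0.0;
    MPE_Log_event(s.start_id, 0, NULL);
    for (int i = 0; i < n; i++)
        total += v[i];
    MPE_Log_event(s.end_id, 0, NULL);
    return total;
}
```

In a real program, make_state() would be called once per state on every rank after MPI_Init(), and logged_sum() (or any other bracketed work) before MPI_Finalize(). Whether you also have to call MPE_Init_log()/MPE_Finish_log() yourself depends on whether the program is built with -mpilog or -log, as noted above.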

A.Chan

