[mpich-discuss] Clog2 conversion to Slog2 fails
Duro, Joao
j.a.duro at cranfield.ac.uk
Wed Apr 21 11:50:47 CDT 2010
So I am now using mpecc from mpe2 for compilation:
./mpe2/mpe2-1.1.1/bin/mpecc -mpilog -o matrix_matrix_log matrix_matrix_log.c
I also changed the code so that MPE generates the eventIDs for me:
http://pastebin.com/ZWsq7gFr
I ran it using the following command:
bsub -n5 -I mpirun -srun matrix_matrix_log
I got a segmentation fault :(
<<Starting on lsfhost.localdomain>>
Enabling the Default clock synchronization...
srun: error: comp48: task[0,2]: Segmentation fault (core dumped)
srun: Terminating job
srun: error: comp48: task0: Exited with exit code 1
GDB --> http://pastebin.com/Zqru7gUt
So I decided to try cpilog.c. Here is the code:
http://pastebin.com/FtTXtTfN
I compiled it:
./mpe2/mpe2-1.1.1/bin/mpecc -mpilog -o cpilog cpilog.c
and ran it:
bsub -n5 -I mpirun -srun cpilog
And, amazingly, I got the same segmentation fault:
<<Starting on lsfhost.localdomain>>
Process 4 running on comp48
Process 0 running on comp20
Process 3 running on comp48
Process 1 running on comp20
Process 2 running on comp20
Enabling the Default clock synchronization...
pi is approximately 3.1415926535899197, Error is 0.0000000000001266
wall clock time = 0.036734
srun: error: comp20: task2: Segmentation fault (core dumped)
srun: Terminating job
srun: error: comp20: task0: Exited with exit code 1
GDB --> http://pastebin.com/kFaDji37
The segmentation fault is related to CLOG_Buffer_save_bareevt.
Any help would be welcome!
Thanks,
Joao
________________________________________
From: chan at mcs.anl.gov [chan at mcs.anl.gov]
Sent: Wednesday, 21 April 2010 16:17
To: Duro, Joao
Cc: mpich-discuss at mcs.anl.gov
Subject: Re: [mpich-discuss] Clog2 conversion to Slog2 fails
----- "Joao Duro" <j.a.duro at cranfield.ac.uk> wrote:
> Right now it is:
> mpicc -I/lustre/scratch/c119470/mpe2/mpe2-1.1.1/include/
> matrix_matrix_log.c -o matrix_matrix_log -lpthread
> -L/lustre/scratch/c119470/mpe2/mpe2-1.1.1/lib/ -lmpe
Since you only link with -lmpe, your link command is essentially "mpecc -log".
To enable MPI logging, do
mpecc -mpilog -o matrix_matrix_log matrix_matrix_log.c
> 2) http://pastebin.com/8VNGgTTZ
Your C program hardwires the MPE eventIDs in MPE_Describe_event().
That is the cause of the corrupted clog2 file. In the README file
in mpe2, under section VI) CUSTOMIZING LOGFILES, you should
see a description of the correct way to use the MPE logging functions.
BTW, you also need to understand when MPE_Init_log()/MPE_Finish_log()
should be used in your program (that depends on whether -mpilog or -log
is used with mpecc).
Basically, you need to use MPE_Log_get_state_eventIDs() to get valid
MPE eventIDs for MPE_Describe_event() and MPE_Log_event()... For a
complete C example, you can look at
<mpe2_install_dir>/share/examples_logging/cpilog.c
for details.
A.Chan
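
A minimal sketch of the pattern described above, not copied from cpilog.c.
It asks MPE for a start/end eventID pair instead of hardwiring the numbers,
and guards MPE_Init_log()/MPE_Finish_log() so they are called only when the
MPI logging wrappers are not linked in. Assumptions: the state is described
with MPE_Describe_state(), which pairs with MPE_Log_get_state_eventIDs();
the macro NO_MPI_LOGGING, the state name "Compute", the logfile name
"sketch" and the dummy loop are illustrative choices; and with -mpilog the
wrapped MPI_Init()/MPI_Finalize() are assumed to start and finish the log.

```c
/* Sketch: let MPE assign the eventIDs instead of hardwiring them.
 * Build as in this thread:  mpecc -mpilog -o sketch sketch.c
 * With "mpecc -log", add -DNO_MPI_LOGGING (illustrative macro name). */
#include <stdio.h>
#include "mpi.h"
#include "mpe.h"

int main(int argc, char *argv[])
{
    int rank, i;
    int compute_start, compute_end;   /* eventIDs filled in by MPE */
    double local = 0.0, total = 0.0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

#if defined(NO_MPI_LOGGING)
    /* Needed only when the MPI logging wrappers are not linked (-log). */
    MPE_Init_log();
#endif

    /* Ask MPE for a valid start/end pair; do not pick the numbers yourself. */
    MPE_Log_get_state_eventIDs(&compute_start, &compute_end);

    /* One process describes the state for the whole logfile. */
    if (rank == 0)
        MPE_Describe_state(compute_start, compute_end, "Compute", "red");

    /* Bracket the region of interest with the two events. */
    MPE_Log_event(compute_start, 0, NULL);
    for (i = 0; i < 1000000; i++)
        local += 1.0 / (1.0 + i);
    MPE_Log_event(compute_end, 0, NULL);

    MPI_Reduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0)
        printf("total = %f\n", total);

#if defined(NO_MPI_LOGGING)
    /* Writes the logfile; the name "sketch" is an arbitrary choice. */
    MPE_Finish_log("sketch");
#endif

    MPI_Finalize();
    return 0;
}
```

It would be run the same way as the programs above, for example
bsub -n5 -I mpirun -srun sketch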