[mpich-discuss] IMB 3.1 with TOL 0 crashes on Allreduce
Jayesh Krishna
jayesh at mcs.anl.gov
Tue May 27 12:23:24 CDT 2008
Hi,
I ran the IMB 3.1 suite for Allreduce on a single machine with
up to 8 procs and did not get any errors.
1) Make sure that both node-1 & node-2 have the same data model (data type
representation). Please note that MPICH2 currently does not support
heterogeneous systems with respect to the data models used by the machines
(e.g., you cannot run MPI procs across x86 and x64 machines). If you need to
run your program across a heterogeneous system, please use MPICH1 instead.
2) Try running the benchmark on a single node/host (mpiexec -n 2
imb-mpi1.exe allreduce) and let us know the results.
3) Are you able to run other tests in the IMB 3.1 suite?
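
For point 1, one rough way to compare the data models of the two hosts is to print the basic type sizes and byte order on each and compare the output. The sketch below is only an illustration: the function name data_model_summary is hypothetical, and the values describe the Python interpreter build on each host (32-bit vs 64-bit), not the MPICH2 or IMB binaries themselves.

```python
import struct
import sys


def data_model_summary():
    """Return native C type sizes (in bytes) and byte order for this host.

    Assumption: the sizes reflect the Python build running this script,
    so use the same kind of build on every node being compared.
    """
    return {
        "byte_order": sys.byteorder,
        "pointer": struct.calcsize("P"),
        "int": struct.calcsize("i"),
        "long": struct.calcsize("l"),
        "float": struct.calcsize("f"),
        "double": struct.calcsize("d"),
    }


if __name__ == "__main__":
    # Run on node-1 and node-2 and compare the printed lines.
    for name, value in data_model_summary().items():
        print(f"{name}: {value}")
```

If the pointer or long sizes differ between the two hosts, the nodes are likely using different data models.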
Regards,
Jayesh
-----Original Message-----
From: owner-mpich-discuss at mcs.anl.gov
[mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Calin Iaru
Sent: Monday, May 26, 2008 5:50 AM
To: mpich-discuss at mcs.anl.gov
Subject: [mpich-discuss] IMB 3.1 with TOL 0 crashes on Allreduce
The problem is that the latest mpich2 in combination with IMB 3.1
generates a data corruption error when running on 2 nodes. IMB was
compiled with the CHECK flag and TOL set to 0 inside IMB_declare.h. I am
not sure if this is a transport error or a verification error; it could be
that the problem lies in the application code.
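
To illustrate how a zero tolerance can turn ordinary rounding into reported defects, here is a minimal sketch. It is not IMB's actual check: the helper names (to_float32, count_defects), the sample values, and the comparison rule are all assumptions made for illustration. It simulates an MPI_SUM over MPI_FLOAT for two ranks and compares the result against reference values computed in double precision.

```python
import struct


def to_float32(x):
    """Round a Python float (double precision) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def count_defects(buffer, expected, tol):
    """Count entries whose absolute difference from the expected value exceeds tol."""
    return sum(1 for got, want in zip(buffer, expected) if abs(got - want) > tol)


# Two hypothetical ranks contribute the same single-precision values.
rank0 = [to_float32(v) for v in (1.0, 1.15)]
rank1 = [to_float32(v) for v in (1.0, 1.15)]

# Simulated sum reduction: add element-wise, then round to single precision.
reduced = [to_float32(a + b) for a, b in zip(rank0, rank1)]

# Reference values held in double precision.
expected = [2.0, 2.3]

strict = count_defects(reduced, expected, tol=0.0)    # zero tolerance
relaxed = count_defects(reduced, expected, tol=1e-6)  # small tolerance

print("defects with tol=0:", strict)
print("defects with tol=1e-6:", relaxed)
```

In this sketch the strict comparison flags the second entry because 2.3 has no exact single-precision representation, while the relaxed comparison accepts it. Whether anything similar happens in the run below is not established here.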
E:\Program Files\MPICH2\bin>mpiexec.exe -hosts 2 node-1 node-2 \\node-1\e$\imb-mpi1.exe allreduce
#---------------------------------------------------
# Intel (R) MPI Benchmark Suite V3.1, MPI-1 part
#---------------------------------------------------
# Date : Fri May 23 14:44:12 2008
# Machine : x86 Family 15 Model 4 Stepping 1, GenuineIntel
# System : Windows 2003
# Release : 5.2.3790
# Version : Service Pack 1
# MPI Version : 2.0
# MPI Thread Environment: MPI_THREAD_SINGLE
# Calling sequence was:
# \\node-1\e$\imb-mpi1.exe allreduce
# Minimum message length in bytes: 0
# Maximum message length in bytes: 4194304
#
# MPI_Datatype : MPI_BYTE
# MPI_Datatype for reductions : MPI_FLOAT
# MPI_Op : MPI_SUM
#
#
# List of Benchmarks to run:
# Allreduce
#-----------------------------------------------------------------------------
# Benchmarking Allreduce
# #processes = 2
#-----------------------------------------------------------------------------
   #bytes #repetitions  t_min[usec]  t_max[usec]  t_avg[usec]  defects
        0         1000         0.51         0.52         0.51     0.00
        4         1000        80.30        80.35        80.33     0.00
1: Error Allreduce, size = 8, sample #0
Process 1: Got invalid buffer:
Buffer entry: 2.300000
0: Error Allreduce, size = 8, sample #0
Process 0: Got invalid buffer:
Buffer entry: 2.300000