[mpich-discuss] MPI_REDUCE and complex numbers

Brian Dushaw dushaw at apl.washington.edu
Fri Aug 19 15:14:24 CDT 2011


Below is a slightly revised version of Jeff's test case; it more
closely mimics the situation in my code.  I find it behaves strangely
on a 2-node, 8-processor system.  (My cluster is shut down at the
moment, so I can't fuss with this any more...)

This throws an error message (why?) and produces some strange results.
Note that 28800000/64000 = 450, which is the number of rows of the arrays.
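
The two byte counts in the error message can also be reproduced from
the array dimensions.  The short program below is only a sketch of that
arithmetic, assuming 16 bytes per double-precision complex element
(bytes_per_elem is a hypothetical name, not part of the test case):
28800000 is half the size of a buffer of n*m elements, and 64000 is
half the size of a buffer of n elements.  Whether that means some rank
entered the reduce with a count of n rather than n*m is a guess, not
something established here.

```fortran
program msgsizes
implicit none
! Assumption: one double-precision complex element occupies 16 bytes.
integer, parameter :: bytes_per_elem = 16
! Array dimensions taken from the test case.
integer, parameter :: m = 450, n = 8000
integer :: full_nm, full_n

full_nm = n*m*bytes_per_elem   ! buffer of n*m elements: 57600000 bytes
full_n  = n*bytes_per_elem     ! buffer of n elements:     128000 bytes

! Half of each buffer size matches a figure in the error message.
print *, 'count n*m: bytes =', full_nm, ' half =', full_nm/2
print *, 'count n  : bytes =', full_n,  ' half =', full_n/2
! The ratio of the two halves is m, the number of rows.
print *, 'ratio of the halves =', (full_nm/2)/(full_n/2)
end program msgsizes
```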

I am using the Sun Studio compiler, btw.

> time mpiexec testreduce
Fatal error in PMPI_Reduce: Message truncated, error stack:
MPIDI_CH3U_Receive_data_found(129): Message from rank 0 and tag 11 truncated; 28800000 bytes received but buffer size is 64000
  i, x(i), y(i), z(i)
  1 (0.0E+0,0.0E+0) (0.0E+0,0.0E+0) (0.0E+0,0.0E+0) (0.0E+0,0.0E+0) (0.0E+0,0.0E+0)
  (0.0E+0,0.0E+0) (0.0E+0,0.0E+0) (0.0E+0,0.0E+0) (0.0E+0,0.0E+0) (0.0E+0,0.0E+0)
  (0.0E+0,0.0E+0) (0.0E+0,0.0E+0) (0.0E+0,0.0E+0) (0.0E+0,0.0E+0) (0.0E+0,0.0E+0)
  1 (4.0,4.0) (4.0,4.0) (4.0,4.0) (4.0,4.0) (4.0,4.0) (4.0,4.0) (4.0,4.0) (4.0,4.0)
  (4.0,4.0) (0.0E+0,0.0E+0) (0.0E+0,0.0E+0) (0.0E+0,0.0E+0) (0.0E+0,0.0E+0)
  (0.0E+0,0.0E+0) (0.0E+0,0.0E+0)
  2 (0.0E+0,0.0E+0) (0.0E+0,0.0E+0) (0.0E+0,0.0E+0) (0.0E+0,0.0E+0) (0.0E+0,0.0E+0)
  (0.0E+0,0.0E+0) (0.0E+0,0.0E+0) (0.0E+0,0.0E+0) (0.0E+0,0.0E+0) (0.0E+0,0.0E+0)
  (0.0E+0,0.0E+0) (0.0E+0,0.0E+0) (0.0E+0,0.0E+0) (0.0E+0,0.0E+0) (0.0E+0,0.0E+0)
  2 (4.0,4.0) (4.0,4.0) (4.0,4.0) (4.0,4.0) (4.0,4.0) (4.0,4.0) (4.0,4.0) (4.0,4.0)
  (4.0,4.0) (0.0E+0,0.0E+0) (0.0E+0,0.0E+0) (0.0E+0,0.0E+0) (0.0E+0,0.0E+0)
  (0.0E+0,0.0E+0) (0.0E+0,0.0E+0)
  3 (0.0E+0,0.0E+0) (0.0E+0,0.0E+0) (0.0E+0,0.0E+0) (0.0E+0,0.0E+0) (0.0E+0,0.0E+0)
  (0.0E+0,0.0E+0) (0.0E+0,0.0E+0) (0.0E+0,0.0E+0) (0.0E+0,0.0E+0) (0.0E+0,0.0E+0)
  (0.0E+0,0.0E+0) (0.0E+0,0.0E+0) (0.0E+0,0.0E+0) (0.0E+0,0.0E+0) (0.0E+0,0.0E+0)
  3 (4.0,4.0) (4.0,4.0) (4.0,4.0) (4.0,4.0) (4.0,4.0) (4.0,4.0) (4.0,4.0) (4.0,4.0)
  (4.0,4.0) (0.0E+0,0.0E+0) (0.0E+0,0.0E+0) (0.0E+0,0.0E+0) (0.0E+0,0.0E+0)
  (0.0E+0,0.0E+0) (0.0E+0,0.0E+0)
  4 (0.0E+0,0.0E+0) (0.0E+0,0.0E+0) (0.0E+0,0.0E+0) (0.0E+0,0.0E+0) (0.0E+0,0.0E+0)
  (0.0E+0,0.0E+0) (0.0E+0,0.0E+0) (0.0E+0,0.0E+0) (0.0E+0,0.0E+0) (0.0E+0,0.0E+0)
  (0.0E+0,0.0E+0) (0.0E+0,0.0E+0) (0.0E+0,0.0E+0) (0.0E+0,0.0E+0) (0.0E+0,0.0E+0)
  4 (4.0,4.0) (4.0,4.0) (4.0,4.0) (4.0,4.0) (4.0,4.0) (4.0,4.0) (4.0,4.0) (4.0,4.0)
  (4.0,4.0) (0.0E+0,0.0E+0) (0.0E+0,0.0E+0) (0.0E+0,0.0E+0) (0.0E+0,0.0E+0)
  (0.0E+0,0.0E+0) (0.0E+0,0.0E+0)
  5 (0.0E+0,0.0E+0) (0.0E+0,0.0E+0) (0.0E+0,0.0E+0) (0.0E+0,0.0E+0) (0.0E+0,0.0E+0)
  (0.0E+0,0.0E+0) (0.0E+0,0.0E+0) (0.0E+0,0.0E+0) (0.0E+0,0.0E+0) (0.0E+0,0.0E+0)
  (0.0E+0,0.0E+0) (0.0E+0,0.0E+0) (0.0E+0,0.0E+0) (0.0E+0,0.0E+0) (0.0E+0,0.0E+0)
  5 (4.0,4.0) (4.0,4.0) (4.0,4.0) (4.0,4.0) (4.0,4.0) (4.0,4.0) (4.0,4.0) (4.0,4.0)
  (4.0,4.0) (0.0E+0,0.0E+0) (0.0E+0,0.0E+0) (0.0E+0,0.0E+0) (0.0E+0,0.0E+0)
  (0.0E+0,0.0E+0) (0.0E+0,0.0E+0)
  6 (1.0,1.0) (1.0,1.0) (1.0,1.0) (1.0,1.0) (1.0,1.0) (1.0,1.0) (1.0,1.0) (1.0,1.0)
  (1.0,1.0) (1.0,1.0) (1.0,1.0) (1.0,1.0) (1.0,1.0) (1.0,1.0) (1.0,1.0)
  6 (8.0,8.0) (8.0,8.0) (8.0,8.0) (8.0,8.0) (8.0,8.0) (8.0,8.0) (8.0,8.0) (8.0,8.0)
  (8.0,8.0) (4.0,4.0) (4.0,4.0) (4.0,4.0) (4.0,4.0) (4.0,4.0) (4.0,4.0)
  7 (1.0,1.0) (1.0,1.0) (1.0,1.0) (1.0,1.0) (1.0,1.0) (1.0,1.0) (1.0,1.0) (1.0,1.0)
  (1.0,1.0) (1.0,1.0) (1.0,1.0) (1.0,1.0) (1.0,1.0) (1.0,1.0) (1.0,1.0)
  7 (8.0,8.0) (8.0,8.0) (8.0,8.0) (8.0,8.0) (8.0,8.0) (8.0,8.0) (8.0,8.0) (8.0,8.0)
  (8.0,8.0) (4.0,4.0) (4.0,4.0) (4.0,4.0) (4.0,4.0) (4.0,4.0) (4.0,4.0)
  8 (1.0,1.0) (1.0,1.0) (1.0,1.0) (1.0,1.0) (1.0,1.0) (1.0,1.0) (1.0,1.0) (1.0,1.0)
  (1.0,1.0) (1.0,1.0) (1.0,1.0) (1.0,1.0) (1.0,1.0) (1.0,1.0) (1.0,1.0)
  8 (8.0,8.0) (8.0,8.0) (8.0,8.0) (8.0,8.0) (8.0,8.0) (8.0,8.0) (8.0,8.0) (8.0,8.0)
  (8.0,8.0) (4.0,4.0) (4.0,4.0) (4.0,4.0) (4.0,4.0) (4.0,4.0) (4.0,4.0)

[and so on, until:]

Fatal error in PMPI_Barrier: Other MPI error

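The reduced values (the second line printed for each row) should be
zero for the first 5 rows, since x is set to (1,1) only in rows 6
through 25.  In those rows the sum should be worldsize*(1,1) in every
column, not just the first nine.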

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
program main 
use mpi 
implicit none 
integer,parameter :: wp = kind(1.d0)
integer rc 
integer rank, worldsize 
integer i,j,n,m,itemp 
complex(kind=wp),dimension(:,:),allocatable :: x,y,z

call mpi_init(rc) 
call mpi_comm_rank(MPI_COMM_WORLD,rank,rc) 
call mpi_comm_size(MPI_COMM_WORLD,worldsize,rc)

m =   450
n =  8000 
allocate(x(m,n),y(m,n),z(m,n))
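! Zero the arrays by direct assignment; multiplying freshly allocated
! (undefined) storage by zero is not guaranteed to give zero.
x = (0.0_wp,0.0_wp)
y = (0.0_wp,0.0_wp)
z = (0.0_wp,0.0_wp)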

do i = 6, 25
    do j = 1, n
       x(i,j) = (1.0_wp,1.0_wp)
    end do
end do

itemp=n*m

call MPI_REDUCE(x,y,itemp,MPI_DOUBLE_COMPLEX,MPI_SUM,0,MPI_COMM_WORLD,rc) 
!call mpi_allreduce(x,z,n,MPI_DOUBLE_COMPLEX,MPI_SUM,MPI_COMM_WORLD,rc)

if (rank .eq. 0) then
     print*,'i, x(i), y(i), z(i)'
     do i = 1, 30
        print *, i, (x(i,j), j=1,15)
        print *, i, (y(i,j), j=1,15)
!       print *, i, (z(i,j), j=1,15)
     end do
     print *,'the right answer is ',worldsize*(1.0,1.0) 
end if

call MPI_BARRIER(MPI_COMM_WORLD,rc) 
call MPI_FINALIZE(rc) 
stop
end program main 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
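
A "Message truncated" error inside a collective is commonly associated
with ranks disagreeing on the count they pass.  I have not established
that this is what happens here, but a minimal sketch of a check is to
reduce the count itself with MPI_MIN and MPI_MAX before the real
reduce.  The program name checkcount and the variables cmin and cmax
are hypothetical; m, n and itemp are as in the test case above.

```fortran
program checkcount
use mpi
implicit none
integer :: rc, rank
integer :: m, n, itemp
integer :: cmin, cmax      ! hypothetical names for the reduced counts

call mpi_init(rc)
call mpi_comm_rank(MPI_COMM_WORLD,rank,rc)

m = 450
n = 8000
itemp = n*m

! Every rank contributes the count it is about to use; afterwards all
! ranks hold the smallest and largest count seen across the job.
call mpi_allreduce(itemp,cmin,1,MPI_INTEGER,MPI_MIN,MPI_COMM_WORLD,rc)
call mpi_allreduce(itemp,cmax,1,MPI_INTEGER,MPI_MAX,MPI_COMM_WORLD,rc)

if (rank .eq. 0) then
   if (cmin .ne. cmax) then
      print *, 'count mismatch across ranks: min =', cmin, ' max =', cmax
   else
      print *, 'all ranks agree on count =', cmin
   end if
end if

call mpi_finalize(rc)
end program checkcount
```

This only helps if every rank is running the same executable; it says
nothing about a rank that was started from a different binary.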


