[mpich-discuss] MPI_Allreduce fail. Please help.

Rajeev Thakur thakur at mcs.anl.gov
Tue Sep 4 15:27:45 CDT 2012


Your program works at least on my Mac laptop.

Rajeev

On Sep 4, 2012, at 2:12 PM, Yonghui wrote:

> Dear MPICH2 users and developers,
>  
> I have recently been learning MPI and reading the source code of an open-source program implemented with MPI. One of the functions that behaves strangely is MPI_Allreduce.
>  
> I am working on a Windows 7 Pro 64-bit machine with MinGW (gcc version 4.6.2, 32-bit), using the 32-bit version of MPICH2 1.4.1-p1 downloaded from the MPICH2 site. The code compiles without any problem, but it fails at run time (maybe an invalid memory access?). There must be some problem with the Windows version of MPI_Allreduce, since the program works fine if I remove that line. It also works if I make the matrix smaller. I also tried it on an Ubuntu machine with the same version of MPI and had no problem on Linux.
>  
> To make the question clear, I added MPI_Allreduce to a hello world program. The code is written in F90. I haven't tested the C version, but I think it should be very similar (differing only in the function names and the error parameter).
>  
> Here is the command that I used to compile:
> gfortran hello1.f90 -g -o hello.exe -IC:\MPICH2_x86\include -LC:\MPICH2_x86\lib -lfmpich2g
>  
> Here is the source code:
> !------------------code begin----------------------
> program main
>   include 'mpif.h'
>   character * (MPI_MAX_PROCESSOR_NAME) processor_name
>   integer myid, numprocs, namelen, rc,ierr
>   integer, allocatable :: mat1(:, :, :)
>  
>   call MPI_INIT( ierr )
>   call MPI_COMM_RANK( MPI_COMM_WORLD, myid, ierr )
>   call MPI_COMM_SIZE( MPI_COMM_WORLD, numprocs, ierr )
>   call MPI_GET_PROCESSOR_NAME(processor_name, namelen, ierr)
>  
>   allocate(mat1(-36:36, -36:36, -36:36))
>   mat1(:,:,:) = 0
>   call MPI_Bcast( mat1(-36, -36, -36), 389017, MPI_INT, 0, MPI_COMM_WORLD, ierr )
>   call MPI_Allreduce(MPI_IN_PLACE, mat1(-36, -36, -36), 389017, MPI_INTEGER, MPI_BOR, MPI_COMM_WORLD, ierr)
>   print *,"MPI_Allreduce done!!!"
>   print *,"Hello World! Process ", myid, " of ", numprocs, " on ", processor_name
>   call MPI_FINALIZE(rc)
>   end
> !------------------code end----------------------
>  
> When I check with gdb (which comes with MinGW) by running gdb hello.exe and then backtrace, I get output that seems meaningless, at least to me:
>  
> Program received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 16316.0x4fd0]
> 0x01c03100 in mpich2nemesis!PMPI_Wtime ()
>    from C:\Windows\system32\mpich2nemesis.dll
> (gdb) backtrace
> #0  0x01c03100 in mpich2nemesis!PMPI_Wtime ()
>    from C:\Windows\system32\mpich2nemesis.dll
> #1  0x0017be00 in ?? ()
> #2  0x00000000 in ?? ()
>  
> Does this actually mean there is something wrong with the Windows version of the MPI library?
> What would be the solution to make it work?
>  
> Thanks.


