[mpich-discuss] MPI_Scatter

Rajeev Thakur thakur at mcs.anl.gov
Thu Apr 29 09:27:39 CDT 2010


One issue I see in the code you sent yesterday is that the arrays are
declared as real, whereas MPI_Scatter uses MPI_REAL8. Make sure you are
using the right compiler flags to promote reals to 8 bytes. It would be
safer to build MPICH2 with the same compiler flags.
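With Intel Fortran, for example, -r8 (or -real-size 64) promotes default
reals to 8 bytes. As a sketch, reusing the build line from your earlier
message:

    mpif90 -r8 -c messenger.mpi8.f90 -o obj/messenger.mpi8.o

and, when rebuilding MPICH2 itself, pass the same flag through the
Fortran flags at configure time, e.g.

    ./configure FFLAGS=-r8 F90FLAGS=-r8 ...

(the exact configure variable names may differ for your MPICH2 version).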

Rajeev


> -----Original Message-----
> From: mpich-discuss-bounces at mcs.anl.gov 
> [mailto:mpich-discuss-bounces at mcs.anl.gov] On Behalf Of 
> Steenhauer, Kate
> Sent: Thursday, April 29, 2010 4:48 AM
> To: Anthony Chan
> Cc: mpich-discuss at mcs.anl.gov
> Subject: Re: [mpich-discuss] MPI_Scatter
> 
> 
> Hi,
> I still get the following error message and I do not know what the
> solution could be. The scatter routine uses distinct arrays for
> sending and receiving:
> 
> mpiexec -np 8 /home/eng923/LES2/job/TRY
> Assertion failed in file segment_ops.c at line 49: 0
> memcpy argument memory ranges overlap, dst_=0x2b95c65c0300
> src_=0x2b95c65bfa10 len_=32768
> 
> internal ABORT - process 0
>         3200          80           2
>         3202          82           4
>         3202          82           4
>  ierr           1
>  nx,ny,nz for R        3202          82           4
> rank 0 in job 19  cops-021026_40597   caused collective abort 
> of all ranks
>   exit status of rank 0: return code 1
> 
> 
> Further, I don't understand why mpich2 would suddenly have a
> problem with the allocated space of sendbuf, when mpich1 can deal
> with this and produces sensible output?
> 
> subroutine scatter(A, B)
>     use messenger
>     include "mpif.inc"
>     real A(nx,ny,nz), B(nx_1,ny_1,nz_1)
>     real, allocatable :: sendbuf(:)
>     integer i1(nproc), i2(nproc), &
>             j1(nproc), j2(nproc), &
>             k1(nproc), k2(nproc)
> 
>     ! Scatter an array among the processors
>     ! including overlapping borders
> 
>     if (myid == idroot) then
>         do ip=1,nproc
>             i1(ip) = ibmino_1(icoord(1,ip))
>             j1(ip) = jbmino_1(icoord(2,ip))
>             k1(ip) = kbmino_1(icoord(3,ip))
>         end do
>         i2 = i1 + nx_1 - 1
>         j2 = j1 + ny_1 - 1
>         k2 = k1 + nz_1 - 1
> 
>         allocate (sendbuf(nproc*nxyz_1))
>         L = 0
>         do ip=1,nproc
>             do k=k1(ip),k2(ip)
>             do j=j1(ip),j2(ip)
>             do i=i1(ip),i2(ip)
>                 L = L + 1
>                 sendbuf(L) = A(i,j,k)
>             end do
>             end do
>             end do
>         end do
>     end if
> 
> Can anyone tell me anything else that I should be checking? Again,
> bear in mind that this is a well-established code used with mpich1,
> meaning the programming of my messenger file should really be in
> order.
> thanks kate
> ________________________________________
> From: chan at mcs.anl.gov [chan at mcs.anl.gov]
> Sent: 28 April 2010 16:23
> To: Steenhauer, Kate
> Cc: mpich-discuss at mcs.anl.gov
> Subject: Re: [mpich-discuss] MPI_IN_PLACE argument
> 
> ----- "Kate Steenhauer" <k.steenhauer at abdn.ac.uk> wrote:
> 
> > As the include "mpif.inc" statement is repeated so many times, I
> > copied the mpif.h file from the mpich2 include directory into the
> > user's location and renamed it "mpif.inc". This then gets past the
> > MPI initialisation problem.
> 
> Most editors, e.g. vi, support a search/replace operation, so you
> can replace mpif.inc with mpif.h.
> 
> Just to be safe, you can run the following script to replace all
> occurrences of mpif.inc with mpif.h in every .f90 file in the
> working directory:
> 
> **************************
> #!/bin/sh
> 
> for file in *.f90 ; do
>     echo "Replacing mpif.inc in $file by mpif.h"
>     mv ${file} ${file}.old
>     sed -e 's|mpif\.inc|mpif\.h|g' ${file}.old > ${file}
> done
> **************************
> 
> The script will save the original copy of each .f90 file with an
> extra suffix .old. You can run "diff ${file}.old ${file}" if you
> are worried whether the script did the right thing...
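> 
> With GNU sed you could also do the replacement in place, keeping
> backups with an .old suffix (a one-liner sketch equivalent to the
> script above):
> 
>     sed -i.old 's|mpif\.inc|mpif\.h|g' *.f90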
> 
> 
> >
> > The compiler is the Intel Fortran compiler, version 11.1. I did not
> > configure it myself, but I can do it again myself. I imagine the
> > installation procedure was the most straightforward one, with
> > nothing specifically specified. kate
> 
> I used the Intel 11 Fortran compiler to compile your
> messenger.mpi8.f90; I don't see any error message about "IF-clause"...
> 
> A.Chan
> 
> >
> > -----Original Message-----
> > From: mpich-discuss-bounces at mcs.anl.gov 
> > [mailto:mpich-discuss-bounces at mcs.anl.gov] On Behalf Of Anthony Chan
> > Sent: 28 April 2010 15:41
> > To: mpich-discuss at mcs.anl.gov
> > Subject: Re: [mpich-discuss] MPI_IN_PLACE argument
> >
> >
> > Your enclosed messenger.mpi8.f90 still contains the invalid MPI
> > header file
> >
> > include "mpif.inc"
> >
> > should be
> >
> > include "mpif.h"
> >
> > Otherwise the compiler will locate the mpif.inc that you have in
> > your source tree.
> > "CPPFLAGS=-DNDEBUG" should be used as part of the configure command.
> > How do you configure mpich2?  What f90 compiler are you using?
> >
> > A.Chan
> >
> > ----- "Kate Steenhauer" <k.steenhauer at abdn.ac.uk> wrote:
> >
> > > For a quick start I would like to disable the error message
> > > first. Does that mean I have to reinstall mpich2? When I configure
> > > it with ./configure, I don't understand where or how I have to
> > > disable the check using "CPPFLAGS=-DNDEBUG".
> > >
> > > Further, trying to resolve the problem with my very limited
> > > knowledge, I tried what you suggested, updating only the scatter
> > > routine, which gives me the following compiler errors (I have
> > > attached the messenger file):
> > > /opt/mpich2/bin/mpif90 -c messenger.mpi8.f90 -o obj/messenger.mpi8.o
> > > In file messenger.mpi8.f90:238
> > >
> > > if (rank == idroot)
> > >                   1
> > > Error: Unclassifiable statement in IF-clause at (1)
> > >
> > > In file messenger.mpi8.f90:242
> > >
> > > else
> > >    1
> > > Error: Unexpected ELSE statement at (1)
> > >
> > > In file messenger.mpi8.f90:252
> > >
> > >     if (myid == idroot) deallocate (sendbuf)
> > >                                           1
> > > Error: Expression in DEALLOCATE statement at (1) must be
> > > ALLOCATABLE or a POINTER
> > > make[2]: *** [messenger.mpi8.o] Error 1
> > > make[2]: Leaving directory `/home/eng923/LES/jet/usr'
> > > make[1]: *** [options] Error 2
> > > make[1]: Leaving directory `/home/eng923/LES/jet/usr'
> > > make: *** [opt] Error 2
> > >
> > > The entire subroutine is as follows (see also the attached file).
> > > Do I not need to change 'call updateBorder' then as well?
> > >
> > > subroutine scatter(A, B)
> > >     use messenger
> > >     include "mpif.inc"
> > >     real A(nx,ny,nz), B(nx_1,ny_1,nz_1)
> > >     real, allocatable :: sendbuf(:)
> > >     integer i1(nproc), i2(nproc), &
> > >             j1(nproc), j2(nproc), &
> > >             k1(nproc), k2(nproc)
> > >
> > >     ! Scatter an array among the processors
> > >     ! including overlapping borders
> > >
> > >     if (myid == idroot) then
> > >         do ip=1,nproc
> > >             i1(ip) = ibmino_1(icoord(1,ip))
> > >             j1(ip) = jbmino_1(icoord(2,ip))
> > >             k1(ip) = kbmino_1(icoord(3,ip))
> > >         end do
> > >         i2 = i1 + nx_1 - 1
> > >         j2 = j1 + ny_1 - 1
> > >         k2 = k1 + nz_1 - 1
> > >
> > >         allocate (sendbuf(nproc*nxyz_1))
> > >         L = 0
> > >         do ip=1,nproc
> > >             do k=k1(ip),k2(ip)
> > >             do j=j1(ip),j2(ip)
> > >             do i=i1(ip),i2(ip)
> > >                 L = L + 1
> > >                 sendbuf(L) = A(i,j,k)
> > >             end do
> > >             end do
> > >             end do
> > >         end do
> > >     end if
> > >
> > >     call MPI_scatter (sendbuf, nxyz_1, MPI_REAL8, &
> > >                       B, nxyz_1, MPI_REAL8, &
> > >                       idroot, icomm_grid, ierr)
> > >
> > >     call updateBorder(B, nx_1,ny_1,nz_1, 1, 1)
> > >     call updateBorder(B, nx_1,ny_1,nz_1, 2, 2)
> > >     call updateBorder(B, nx_1,ny_1,nz_1, 3, 3)
> > >
> > >     if (myid == idroot) deallocate (sendbuf)
> > >
> > >     return
> > > end
> > >
> > > Thanks
> > > Kate
> > >
> > >
> > >
> > > -----Original Message-----
> > > From: mpich-discuss-bounces at mcs.anl.gov 
> > > [mailto:mpich-discuss-bounces at mcs.anl.gov] On Behalf Of Dave
> > Goodell
> > > Sent: 28 April 2010 14:42
> > > To: mpich-discuss at mcs.anl.gov
> > > Subject: Re: [mpich-discuss] MPI_IN_PLACE argument
> > >
> > > Please see the following mpich-discuss@ threads:
> > >
> > >
> > > https://lists.mcs.anl.gov/mailman/htdig/mpich-discuss/2010-April/006974.html
> > > https://lists.mcs.anl.gov/mailman/htdig/mpich-discuss/2010-March/006658.html
> > >
> > > Basically, in a pinch you can disable this error check by passing
> > > "CPPFLAGS=-DNDEBUG" to configure.  But your code is still invalid
> > > MPI code and has some chance of erroneous behavior.  The same
> > > buffer will still be passed for both arguments of memcpy, which
> > > memcpy does not permit and may result in undefined behavior.
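> > >
> > > As a sketch, the rebuild would look like this (the install prefix
> > > is assumed from your /opt/mpich2/bin/mpif90 path):
> > >
> > >     ./configure --prefix=/opt/mpich2 CPPFLAGS=-DNDEBUG
> > >     make
> > >     make install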
> > >
> > > If you want to convert your code to use MPI_IN_PLACE, you will
> > > need to replace either the sendbuf or the recvbuf with
> > > MPI_IN_PLACE, but *only* at the root process.
> > >
> > > So your MPI_Scatter becomes something like:
> > >
> > > if (rank == idroot)
> > >      MPI_Scatter(sendbuf, nxyz_1, MPI_REAL8, MPI_IN_PLACE, nxyz_1,
> > >                  MPI_REAL8, idroot, icomm_grid, ierr);
> > > else
> > >      MPI_Scatter(sendbuf, nxyz_1, MPI_REAL8, B, nxyz_1,
> > >                  MPI_REAL8, idroot, icomm_grid, ierr);
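> > >
> > > In Fortran that would read something like this (a sketch using
> > > your variable myid; note the then/else/end if and the call
> > > keyword):
> > >
> > >     if (myid == idroot) then
> > >         call MPI_Scatter(sendbuf, nxyz_1, MPI_REAL8, MPI_IN_PLACE, &
> > >                          nxyz_1, MPI_REAL8, idroot, icomm_grid, ierr)
> > >     else
> > >         call MPI_Scatter(sendbuf, nxyz_1, MPI_REAL8, B, nxyz_1, &
> > >                          MPI_REAL8, idroot, icomm_grid, ierr)
> > >     end if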
> > >
> > >
> > > For scatter/scatterv, MPI_IN_PLACE should be passed as the recvbuf.
> > > For gather and most other collectives, MPI_IN_PLACE should be
> > > passed as the sendbuf.
> > >
> > > The MPI standard provides more information about MPI_IN_PLACE:
> > > http://www.mpi-forum.org/docs/mpi-2.2/mpi22-report.pdf
> > >
> > > -Dave
> > >
> > > On Apr 28, 2010, at 5:09 AM, Steenhauer, Kate wrote:
> > >
> > > > Hi,
> > > >
> > > > I am trying to sort out the bug in the MPI program (see
> > > > attached). It seems to me, with my very limited knowledge, that
> > > > everything is related and the whole program needs to be
> > > > restructured when upgrading from mpich1 to mpich2? If the
> > > > MPI_Scatter routine flags a bug (the error message is 'memcpy
> > > > argument memory ranges overlap, dst_=0xafd74f8 src_=0xafd750c
> > > > len_=16200, internal ABORT'), then it is, as far as I can see,
> > > > most likely that the other routines, such as gather(A, B),
> > > > allGather(A, B), scatterXY(A, B, nk), gatherXY(A, B, nk),
> > > > allGatherXY(A, B, nk), etc. (see attached), will bring up a
> > > > similar bug as well. So I don't really know where to start,
> > > > considering this is a well-established code, run successfully
> > > > for many years with mpich1 with sensible output.
> > > >
> > > > By using the MPI_IN_PLACE argument, do I have to replace the
> > > > recvbuf, B(nx_1,ny_1,nz_1)?
> > > >
> > > >    call MPI_scatter (sendbuf, nxyz_1, MPI_REAL8, &
> > > >                      B, nxyz_1, MPI_REAL8, &
> > > >                      idroot, icomm_grid, ierr)
> > > >
> > > > I hope you will be able to help me.
> > > > kate
> > > >
> > > > -----Original Message-----
> > > > From: mpich-discuss-bounces at mcs.anl.gov
> > > [mailto:mpich-discuss-bounces at mcs.anl.gov
> > > > ] On Behalf Of Rajeev Thakur
> > > > Sent: 27 April 2010 17:32
> > > > To: mpich-discuss at mcs.anl.gov; 'Anthony Chan'
> > > > Subject: Re: [mpich-discuss] configuration problem
> > > >
> > > > Kate,
> > > >     You have to use the mpif.h file that comes with the MPI
> > > > implementation, not from some other MPI implementation. With
> > > > MPICH2, you have to use MPICH2's include file.
> > > >
> > > > The error with MPI_Scatter indicates there is a bug in your MPI
> > > > program. You are using the same buffer as sendbuf and recvbuf,
> > > > which is not allowed in MPI. You can use the MPI_IN_PLACE
> > > > argument instead, as described in the MPI standard.
> > > >
> > > > I would recommend using MPICH2 instead of trying to get MPICH-1
> > > > to work.
> > > >
> > > > Rajeev
> > > >
> > > >
> > > >> -----Original Message-----
> > > >> From: mpich-discuss-bounces at mcs.anl.gov 
> > > >> [mailto:mpich-discuss-bounces at mcs.anl.gov] On Behalf Of 
> > > >> Steenhauer, Kate
> > > >> Sent: Tuesday, April 27, 2010 9:50 AM
> > > >> To: Anthony Chan; mpich-discuss at mcs.anl.gov
> > > >> Subject: Re: [mpich-discuss] configuration problem
> > > >>
> > > >> Initially, I tried to run my code with mpich2. The following
> > > >> happens.
> > > >>
> > > >> We have 1 workstation with 8 processors and the following
> > > >> software: Linux Centos, Fortran Intel 11.1.
> > > >>
> > > >> Mpich2 with Intel Fortran on Linux is not working on our PC.
> > > >> It does not like the mpif.inc file; it immediately gets into
> > > >> problems at the initialisation of the MPI (subroutine MPI_INI).
> > > >> mpirun -np 8 RUN02
> > > >>> Fatal error in MPI_Comm_size: Invalid communicator, error stack:
> > > >>> MPI_Comm_size(111): MPI_Comm_size(comm=0x5b, size=0x7fffce629784)
> > > >>> failed
> > > >>> MPI_Comm_size(69).: Invalid communicator MPISTART
> > > >>> rank 7 in job 2  cops-021026_40378   caused collective abort
> > > >>> of all ranks
> > > >>> exit status of rank 7: killed by signal 9
> > > >>
> > > >> Then when I change the mpif.inc file (see attachment) and
> > > >> direct it to the mpif.h file that came with the mpich2 library,
> > > >> it gets past this problem but then runs into the next problem
> > > >> further down the line at MPI_SCATTER, where it is trying to
> > > >> distribute data to the different processors. The error message
> > > >> is 'memcpy argument memory ranges overlap, dst_=0xafd74f8
> > > >> src_=0xafd750c len_=16200, internal ABORT'.
> > > >>
> > > >> There is something in the parameterisation within the MPI that
> > > >> is possibly different from when the code is successfully run on
> > > >> another cluster (that cluster uses Redhat, various mpich
> > > >> versions, e.g. mpich-1.2.5..12, and various versions of the
> > > >> Intel Fortran compiler (ifort), e.g. 7, 9 and 12).
> > > >>
> > > >> I have attached the mpif.inc file.
> > > >>
> > > >> Please let me know if you have any ideas.
> > > >>
> > > >> Thanks
> > > >>
> > > >> Kate
> > > >>
> > > >> -----Original Message-----
> > > >> From: mpich-discuss-bounces at mcs.anl.gov 
> > > >> [mailto:mpich-discuss-bounces at mcs.anl.gov] On Behalf Of 
> > > >> chan at mcs.anl.gov
> > > >> Sent: 27 April 2010 15:42
> > > >> To: mpich-discuss at mcs.anl.gov
> > > >> Subject: Re: [mpich-discuss] configuration problem
> > > >>
> > > >>
> > > >> Is there any reason you can't use mpich2 ?  The latest stable 
> > > >> release of mpich2 is 1.2.1p1.
> > > >>
> > > >> mpich-1 is no longer officially supported.  The latest Fortran
> > > >> compilers are much better supported in mpich2.
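> > > >>
> > > >> For example, configuring mpich2 with the Intel compilers would
> > > >> look something like this (a sketch; the compiler names and the
> > > >> install prefix are just examples):
> > > >>
> > > >>     ./configure CC=icc F77=ifort F90=ifort --prefix=/opt/mpich2
> > > >>     make
> > > >>     make install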
> > > >>
> > > >> A.Chan
> > > >>
> > > >> ----- "Kate Steenhauer" <k.steenhauer at abdn.ac.uk> wrote:
> > > >>
> > > >>> Hello,
> > > >>>
> > > >>> We are trying to install mpich-1.2.7p1. I downloaded and
> > > >>> unzipped this version from
> > > >>> http://www.mcs.anl.gov/research/projects/mpi/mpich1/
> > > >>>
> > > >>> We have 1 workstation with 8 processors, and the following
> > > >>> software: Linux Centos and Fortran Intel 11.1.
> > > >>>
> > > >>> When the documentation guidelines with regard to configuring
> > > >>> and making are followed (see the attached files), it all seems
> > > >>> OK, e.g. mpif90 is generated. However, when a simple parallel
> > > >>> job is tested we get the following error:
> > > >>>
> > > >>> mpif90 -o testA MPITEST.f90
> > > >>> No Fortran 90 compiler specified when mpif90 was created, or 
> > > >>> configuration file does not specify a compiler.
> > > >>>
> > > >>> Is there a specific prefix I need to give for the ifort
> > > >>> Fortran compiler when I configure mpich?
> > > >>>
> > > >>> I would like to thank you in advance for your help. Please
> > > >>> let me know if you need any further details.
> > > >>>
> > > >>> Regards
> > > >>>
> > > >>> Kate Steenhauer
> > > >>>
> > > >>> University of Aberdeen
> > > >>>
> > > >>> 01224-272806
> > > >>>
> > > >>>
> > > >>>
> > > >>>
> > > >>>
> > > >>
> > > >>
> > > >>
> > > >
> > > >
> > > >
> > > >
> > > > <messenger.mpi8.f90>
> > >
> > >
> > >
> > >
> >
> >
> 
> 
> The University of Aberdeen is a charity registered in 
> Scotland, No SC013683.
> _______________________________________________
> mpich-discuss mailing list
> mpich-discuss at mcs.anl.gov
> https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss
> 



More information about the mpich-discuss mailing list