[mpich-discuss] MPI_Scatter

Steenhauer, Kate k.steenhauer at abdn.ac.uk
Thu Apr 29 10:00:52 CDT 2010


Hi,
So is the following configuration the right one to solve this potential problem?
1) F90=ifort (to set F90 variable)
2) ./configure --enable-f90modules --prefix=/opt/mpich2 F90FLAGS=-i4
3) make
4) make install
kate
-----Original Message-----
From: mpich-discuss-bounces at mcs.anl.gov [mailto:mpich-discuss-bounces at mcs.anl.gov] On Behalf Of Rajeev Thakur
Sent: 29 April 2010 15:28
To: mpich-discuss at mcs.anl.gov
Subject: Re: [mpich-discuss] MPI_Scatter

One issue I see in the code you sent yesterday is that the arrays are
declared as real, whereas MPI_Scatter uses MPI_REAL8. Make sure you are
using the right compiler flags to promote reals to 8 bytes. It would be
safer to build MPICH2 with the same compiler flags.
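
For example, building both MPICH2 and the application with the same
real-promotion flag might look like this (a sketch only; -r8 is the
ifort flag that promotes default reals to 8 bytes, check your compiler
version's documentation):

    F90=ifort ./configure --enable-f90modules --prefix=/opt/mpich2 F90FLAGS=-r8
    make
    make install
    # compile the application with the same promotion flag
    /opt/mpich2/bin/mpif90 -r8 -c messenger.mpi8.f90 -o obj/messenger.mpi8.o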

Rajeev


> -----Original Message-----
> From: mpich-discuss-bounces at mcs.anl.gov
> [mailto:mpich-discuss-bounces at mcs.anl.gov] On Behalf Of
> Steenhauer, Kate
> Sent: Thursday, April 29, 2010 4:48 AM
> To: Anthony Chan
> Cc: mpich-discuss at mcs.anl.gov
> Subject: Re: [mpich-discuss] MPI_Scatter
>
>
> Hi,
> I still get the following error message and I do not know what the
> solution could be. The scatter routine uses distinct arrays for
> sending and receiving:
>
> mpiexec -np 8 /home/eng923/LES2/job/TRY
> Assertion failed in file segment_ops.c at line 49: 0
> memcpy argument memory ranges overlap, dst_=0x2b95c65c0300
> src_=0x2b95c65bfa10 len_=32768
>
> internal ABORT - process 0
>         3200          80           2
>         3202          82           4
>         3202          82           4
>  ierr           1
>  nx,ny,nz for R        3202          82           4
> rank 0 in job 19  cops-021026_40597   caused collective abort of all ranks
>   exit status of rank 0: return code 1
>
>
> Further, I don't understand why mpich2 would suddenly have a problem
> with the allocated space of sendbuf, when mpich1 can deal with this
> and produces sensible output.
>
> subroutine scatter(A, B)
>     use messenger
>     include "mpif.inc"
>     real A(nx,ny,nz), B(nx_1,ny_1,nz_1)
>     real, allocatable :: sendbuf(:)
>     integer i1(nproc), i2(nproc), &
>             j1(nproc), j2(nproc), &
>             k1(nproc), k2(nproc)
>
>     ! Scatter an array among the processors
>     ! including overlapping borders
>
>     if (myid == idroot) then
>         do ip=1,nproc
>             i1(ip) = ibmino_1(icoord(1,ip))
>             j1(ip) = jbmino_1(icoord(2,ip))
>             k1(ip) = kbmino_1(icoord(3,ip))
>         end do
>         i2 = i1 + nx_1 - 1
>         j2 = j1 + ny_1 - 1
>         k2 = k1 + nz_1 - 1
>
>         allocate (sendbuf(nproc*nxyz_1))
>         L = 0
>         do ip=1,nproc
>             do k=k1(ip),k2(ip)
>             do j=j1(ip),j2(ip)
>             do i=i1(ip),i2(ip)
>                 L = L + 1
>                 sendbuf(L) = A(i,j,k)
>             end do
>             end do
>             end do
>         end do
>     end if
>
> Can anyone tell me anything else that I should be checking? Again,
> bear in mind that this is a well-established code used with mpich1,
> so the programming of my messenger file should really be in order.
> thanks kate
> ________________________________________
> From: chan at mcs.anl.gov [chan at mcs.anl.gov]
> Sent: 28 April 2010 16:23
> To: Steenhauer, Kate
> Cc: mpich-discuss at mcs.anl.gov
> Subject: Re: [mpich-discuss] MPI_IN_PLACE argument
>
> ----- "Kate Steenhauer" <k.steenhauer at abdn.ac.uk> wrote:
>
> > As the include "mpif.inc" statement is repeated so many times, I
> > copied the mpif.h file from the mpich2 include directory in the
> > user's location and renamed it "mpif.inc". This gets past the
> > MPI initialisation problem.
>
> Most editors, e.g. vi, support search/replace operations, so you can
> replace mpif.inc with mpif.h.
>
> Just to be safe, you can run the following script to replace all
> occurrences of mpif.inc with mpif.h in all .f90 files in the working
> directory:
>
> **************************
> #!/bin/sh
>
> for file in *.f90 ; do
>     echo "Replacing mpif.inc in $file by mpif.h"
>     mv ${file} ${file}.old
>     sed -e 's|mpif\.inc|mpif\.h|g' ${file}.old > ${file}
> done
> **************************
>
> The script saves the original copy of each .f90 file with an extra
> .old suffix. You can run "diff ${file}.old ${file}" if you want to
> check that the script did the right thing...
>
>
> >
> > The compiler is Intel Fortran version 11.1. I did not configure it
> > myself, but I can redo the installation. I imagine the procedure
> > used was the most straightforward one, with nothing specifically
> > specified. kate
>
> I used the Intel 11 Fortran compiler to compile your
> messenger.mpi8.f90; I don't see any error message about an "IF-clause"...
>
> A.Chan
>
> >
> > -----Original Message-----
> > From: mpich-discuss-bounces at mcs.anl.gov
> > [mailto:mpich-discuss-bounces at mcs.anl.gov] On Behalf Of Anthony Chan
> > Sent: 28 April 2010 15:41
> > To: mpich-discuss at mcs.anl.gov
> > Subject: Re: [mpich-discuss] MPI_IN_PLACE argument
> >
> >
> > Your enclosed messenger.mpi8.f90 still contains the invalid MPI
> > header file
> >
> > include "mpif.inc"
> >
> > which should be
> >
> > include "mpif.h"
> >
> > Otherwise the compiler will locate the mpif.inc that you have in
> > your source.
> > "CPPFLAGS=-DNDEBUG" should be used as part of the configure command.
> > How did you configure mpich2?  What f90 compiler are you using?
> >
> > A.Chan
> >
> > ----- "Kate Steenhauer" <k.steenhauer at abdn.ac.uk> wrote:
> >
> > > For a quick start I would like to disable the error message first.
> > > Does that mean I have to reinstall mpich2? When I configure it
> > > with ./configure, I don't understand where or how I disable the
> > > check using "CPPFLAGS=-DNDEBUG".
> > >
> > > Further, trying to resolve the problem with my very limited
> > > knowledge, I tried what you suggested, updating only the scatter
> > > routine, which gives me the following compiler errors (I have
> > > attached the messenger file):
> > > /opt/mpich2/bin/mpif90  -c messenger.mpi8.f90 -o obj/messenger.mpi8.o
> > > In file messenger.mpi8.f90:238
> > >
> > > if (rank == idroot)
> > >                   1
> > > Error: Unclassifiable statement in IF-clause at (1)
> > > In file messenger.mpi8.f90:242
> > >
> > > else
> > >    1
> > > Error: Unexpected ELSE statement at (1)
> > > In file messenger.mpi8.f90:252
> > >
> > >     if (myid == idroot) deallocate (sendbuf)
> > >                                           1
> > > Error: Expression in DEALLOCATE statement at (1) must be ALLOCATABLE
> > > or a POINTER
> > > make[2]: *** [messenger.mpi8.o] Error 1
> > > make[2]: Leaving directory `/home/eng923/LES/jet/usr'
> > > make[1]: *** [options] Error 2
> > > make[1]: Leaving directory `/home/eng923/LES/jet/usr'
> > > make: *** [opt] Error 2
> > >
> > > The entire subroutine is as follows (see also the attached file).
> > > Do I not need to change 'call updateBorder' as well?
> > >
> > > subroutine scatter(A, B)
> > >     use messenger
> > >     include "mpif.inc"
> > >     real A(nx,ny,nz), B(nx_1,ny_1,nz_1)
> > >     real, allocatable :: sendbuf(:)
> > >     integer i1(nproc), i2(nproc), &
> > >             j1(nproc), j2(nproc), &
> > >             k1(nproc), k2(nproc)
> > >
> > >     ! Scatter an array among the processors
> > >     ! including overlapping borders
> > >
> > >     if (myid == idroot) then
> > >         do ip=1,nproc
> > >             i1(ip) = ibmino_1(icoord(1,ip))
> > >             j1(ip) = jbmino_1(icoord(2,ip))
> > >             k1(ip) = kbmino_1(icoord(3,ip))
> > >         end do
> > >         i2 = i1 + nx_1 - 1
> > >         j2 = j1 + ny_1 - 1
> > >         k2 = k1 + nz_1 - 1
> > >
> > >         allocate (sendbuf(nproc*nxyz_1))
> > >         L = 0
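> > >         ! pack each process's sub-block of A contiguously into
> > >         ! sendbuf; MPI_scatter sends block ip to rank ip-1 of icomm_grid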
> > >         do ip=1,nproc
> > >             do k=k1(ip),k2(ip)
> > >             do j=j1(ip),j2(ip)
> > >             do i=i1(ip),i2(ip)
> > >                 L = L + 1
> > >                 sendbuf(L) = A(i,j,k)
> > >             end do
> > >             end do
> > >             end do
> > >         end do
> > >     end if
> > >
> > >     call MPI_scatter (sendbuf, nxyz_1, MPI_REAL8, &
> > >                       B, nxyz_1, MPI_REAL8, &
> > >                       idroot, icomm_grid, ierr)
> > >
> > >     call updateBorder(B, nx_1,ny_1,nz_1, 1, 1)
> > >     call updateBorder(B, nx_1,ny_1,nz_1, 2, 2)
> > >     call updateBorder(B, nx_1,ny_1,nz_1, 3, 3)
> > >
> > >     if (myid == idroot) deallocate (sendbuf)
> > >
> > >     return
> > > end
> > >
> > > Thanks
> > > Kate
> > >
> > >
> > >
> > > -----Original Message-----
> > > From: mpich-discuss-bounces at mcs.anl.gov
> > > [mailto:mpich-discuss-bounces at mcs.anl.gov] On Behalf Of Dave Goodell
> > > Sent: 28 April 2010 14:42
> > > To: mpich-discuss at mcs.anl.gov
> > > Subject: Re: [mpich-discuss] MPI_IN_PLACE argument
> > >
> > > Please see the following mpich-discuss@ threads:
> > >
> > > https://lists.mcs.anl.gov/mailman/htdig/mpich-discuss/2010-April/006974.html
> > > https://lists.mcs.anl.gov/mailman/htdig/mpich-discuss/2010-March/006658.html
> > >
> > > Basically, in a pinch you can disable this error check by passing
> > > "CPPFLAGS=-DNDEBUG" to configure.  But your code is still invalid
> > > MPI code and has some chance of erroneous behavior.  The same
> > > buffer will still be passed for both arguments of memcpy, which
> > > memcpy does not permit and may result in undefined behavior.
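> > >
> > > For example, something like this when building MPICH2 (a sketch;
> > > keep whatever other options you normally pass to configure):
> > >
> > >     ./configure CPPFLAGS=-DNDEBUG --prefix=/opt/mpich2
> > >     make
> > >     make install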
> > >
> > > If you want to convert your code to use MPI_IN_PLACE, you will
> > > need to replace either the sendbuf or the recvbuf with
> > > MPI_IN_PLACE, but *only* at the root process.
> > >
> > > So your MPI_Scatter becomes something like:
> > >
> > > if (rank == idroot)
> > >     MPI_Scatter(sendbuf, nxyz_1, MPI_REAL8, MPI_IN_PLACE, nxyz_1,
> > >                 MPI_REAL8, idroot, icomm_grid, ierr);
> > > else
> > >     MPI_Scatter(sendbuf, nxyz_1, MPI_REAL8, B, nxyz_1, MPI_REAL8,
> > >                 idroot, icomm_grid, ierr);
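> > >
> > > In Fortran, that sketch would read roughly as follows (untested,
> > > reusing the names from your scatter subroutine; note that at the
> > > root the local chunk then stays in sendbuf rather than landing in B):
> > >
> > >     if (myid == idroot) then
> > >         ! root: its own segment of sendbuf is simply left in place
> > >         call MPI_scatter (sendbuf, nxyz_1, MPI_REAL8, &
> > >                           MPI_IN_PLACE, nxyz_1, MPI_REAL8, &
> > >                           idroot, icomm_grid, ierr)
> > >     else
> > >         ! non-root: receive the segment into B as before
> > >         call MPI_scatter (sendbuf, nxyz_1, MPI_REAL8, &
> > >                           B, nxyz_1, MPI_REAL8, &
> > >                           idroot, icomm_grid, ierr)
> > >     end if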
> > >
> > >
> > > For scatter/scatterv, MPI_IN_PLACE should be passed as the recvbuf.
> > > For gather and most other collectives, MPI_IN_PLACE should be
> > > passed as the sendbuf.
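> > >
> > > For gather the pattern is the mirror image, roughly like this (a
> > > sketch; recvbuf here stands for whatever root-side buffer your
> > > gather routine allocates):
> > >
> > >     if (myid == idroot) then
> > >         ! root: its contribution is assumed to already sit in its
> > >         ! own slot of recvbuf
> > >         call MPI_gather (MPI_IN_PLACE, nxyz_1, MPI_REAL8, &
> > >                          recvbuf, nxyz_1, MPI_REAL8, &
> > >                          idroot, icomm_grid, ierr)
> > >     else
> > >         call MPI_gather (B, nxyz_1, MPI_REAL8, &
> > >                          recvbuf, nxyz_1, MPI_REAL8, &
> > >                          idroot, icomm_grid, ierr)
> > >     end if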
> > >
> > > The MPI standard provides more information about MPI_IN_PLACE:
> > > http://www.mpi-forum.org/docs/mpi-2.2/mpi22-report.pdf
> > >
> > > -Dave
> > >
> > > On Apr 28, 2010, at 5:09 AM, Steenhauer, Kate wrote:
> > >
> > > > Hi,
> > > >
> > > > I am trying to sort out the bug in the MPI program (see
> > > > attached). It seems to me, with my very limited knowledge, that
> > > > everything is related and the whole program needs to be
> > > > restructured when upgrading from mpich1 to mpich2. If the
> > > > MPI_Scatter routine flags a bug (the error message is 'memcpy
> > > > argument memory ranges overlap, dst_=0xafd74f8 src_=0xafd750c
> > > > len_=16200, internal ABORT'), then it is, as far as I can see,
> > > > most likely that the other routines, such as gather(A, B),
> > > > allGather(A, B), scatterXY(A, B, nk), gatherXY(A, B, nk),
> > > > allGatherXY(A, B, nk), etc. (see attached), will bring up a
> > > > similar bug as well. So I don't really know where to start,
> > > > considering this is a well-established code, run successfully
> > > > for many years with mpich1 with sensible output.
> > > >
> > > > By using the MPI_IN_PLACE argument, do I have to replace the
> > > > recvbuf, B(nx_1,ny_1,nz_1)?
> > > >    call MPI_scatter (sendbuf, nxyz_1, MPI_REAL8, &
> > > >                      B, nxyz_1, MPI_REAL8, &
> > > >                      idroot, icomm_grid, ierr)
> > > > I hope you will be able to help me.
> > > > kate
> > > >
> > > > -----Original Message-----
> > > > From: mpich-discuss-bounces at mcs.anl.gov
> > > > [mailto:mpich-discuss-bounces at mcs.anl.gov] On Behalf Of Rajeev Thakur
> > > > Sent: 27 April 2010 17:32
> > > > To: mpich-discuss at mcs.anl.gov; 'Anthony Chan'
> > > > Subject: Re: [mpich-discuss] configuration problem
> > > >
> > > > Kate,
> > > >     You have to use the mpif.h file that comes with the MPI
> > > > implementation, not from some other MPI implementation. With
> > > > MPICH2, you have to use MPICH2's include file.
> > > >
> > > > The error with MPI_Scatter indicates there is a bug in your MPI
> > > > program. You are using the same buffer as sendbuf and recvbuf,
> > > > which is not allowed in MPI. You can use the MPI_IN_PLACE
> > > > argument instead, as described in the MPI standard.
> > > >
> > > > I would recommend using MPICH2 instead of trying to get MPICH-1
> > > > to work.
> > > >
> > > > Rajeev
> > > >
> > > >
> > > >> -----Original Message-----
> > > >> From: mpich-discuss-bounces at mcs.anl.gov
> > > >> [mailto:mpich-discuss-bounces at mcs.anl.gov] On Behalf Of
> > > >> Steenhauer, Kate
> > > >> Sent: Tuesday, April 27, 2010 9:50 AM
> > > >> To: Anthony Chan; mpich-discuss at mcs.anl.gov
> > > >> Subject: Re: [mpich-discuss] configuration problem
> > > >>
> > > >> Initially, I tried to run my code with mpich2. The following
> > > >> happens.
> > > >>
> > > >> We have one workstation with 8 processors and the following
> > > >> software: Linux CentOS and Intel Fortran 11.1.
> > > >>
> > > >> Mpich2, Intel Fortran and Linux are not working together on our
> > > >> PC. It does not like the mpif.inc file; it immediately gets into
> > > >> problems at the initialisation of MPI (subroutine MPI_INI).
> > > >> mpirun -np 8 RUN02
> > > >>> Fatal error in MPI_Comm_size: Invalid communicator, error stack:
> > > >>> MPI_Comm_size(111): MPI_Comm_size(comm=0x5b, size=0x7fffce629784) failed
> > > >>> MPI_Comm_size(69).: Invalid communicator MPISTART
> > > >>> rank 7 in job 2  cops-021026_40378   caused collective abort of all ranks
> > > >>> exit status of rank 7: killed by signal 9
> > > >>
> > > >> Then, when I change the mpif.inc file (see attachment) and
> > > >> direct it to the mpif.h file that came with the mpich2 library,
> > > >> it gets past this problem but then runs into the next problem
> > > >> further down the line at MPI_SCATTER, where it is trying to
> > > >> distribute data to the different processors. The error message
> > > >> is 'memcpy argument memory ranges overlap, dst_=0xafd74f8
> > > >> src_=0xafd750c len_=16200, internal ABORT'.
> > > >>
> > > >> There is something in the parameterisation within the MPI that
> > > >> is possibly different from when the code is successfully run on
> > > >> another cluster (that cluster uses Redhat, various mpich
> > > >> versions, e.g. mpich-1.2.5..12, and various versions of the
> > > >> Intel FORTRAN compiler (ifort), e.g. 7, 9 and 12).
> > > >>
> > > >> I have attached the mpif.inc file.
> > > >>
> > > >> Please let me know if you have any ideas.
> > > >>
> > > >> Thanks
> > > >>
> > > >> Kate
> > > >>
> > > >> -----Original Message-----
> > > >> From: mpich-discuss-bounces at mcs.anl.gov
> > > >> [mailto:mpich-discuss-bounces at mcs.anl.gov] On Behalf Of
> > > >> chan at mcs.anl.gov
> > > >> Sent: 27 April 2010 15:42
> > > >> To: mpich-discuss at mcs.anl.gov
> > > >> Subject: Re: [mpich-discuss] configuration problem
> > > >>
> > > >>
> > > >> Is there any reason you can't use mpich2?  The latest stable
> > > >> release of mpich2 is 1.2.1p1.
> > > >>
> > > >> mpich-1 is no longer officially supported.  The latest Fortran
> > > >> compilers are much better supported in mpich2.
> > > >>
> > > >> A.Chan
> > > >>
> > > >> ----- "Kate Steenhauer" <k.steenhauer at abdn.ac.uk> wrote:
> > > >>
> > > >>> Hello,
> > > >>>
> > > >>> We are trying to install mpich-1.2.7p1. I downloaded and
> > > >>> unzipped this version from
> > > >>> http://www.mcs.anl.gov/research/projects/mpi/mpich1/
> > > >>>
> > > >>> We have one workstation with 8 processors, and the following
> > > >>> software: Linux CentOS and Intel Fortran 11.1.
> > > >>>
> > > >>> When the documentation guidelines with regard to configuring
> > > >>> and making are followed (see the files attached) it all seems
> > > >>> OK, e.g. mpif90 is generated. However, when a simple parallel
> > > >>> job is tested we get the following error:
> > > >>>
> > > >>> mpif90 -o testA MPITEST.f90
> > > >>> No Fortran 90 compiler specified when mpif90 was created, or
> > > >>> configuration file does not specify a compiler.
> > > >>>
> > > >>> Is there a specific prefix I need to give with the ifort
> > > >>> Fortran compiler when I configure mpich?
> > > >>>
> > > >>> I would like to thank you in advance for your help. Please let
> > > >>> me know if you need any further details.
> > > >>>
> > > >>> Regards
> > > >>>
> > > >>> Kate Steenhauer
> > > >>>
> > > >>> University of Aberdeen
> > > >>>
> > > >>> 01224-272806