[mpich-discuss] MPI_Scatter

Steenhauer, Kate k.steenhauer at abdn.ac.uk
Thu Apr 29 04:47:51 CDT 2010


Hi,
I still get the following error message and I do not know what the solution could be. The scatter routine uses distinct arrays for sending and receiving:

mpiexec -np 8 /home/eng923/LES2/job/TRY
Assertion failed in file segment_ops.c at line 49: 0
memcpy argument memory ranges overlap, dst_=0x2b95c65c0300 src_=0x2b95c65bfa10 len_=32768

internal ABORT - process 0
        3200          80           2
        3202          82           4
        3202          82           4
 ierr           1
 nx,ny,nz for R        3202          82           4
rank 0 in job 19  cops-021026_40597   caused collective abort of all ranks
  exit status of rank 0: return code 1


Further, I don't understand why mpich2 would suddenly have a problem with the allocated space of sendbuf, when mpich1 can deal with this and produces sensible output.

subroutine scatter(A, B)
    use messenger
    include "mpif.inc"
    real A(nx,ny,nz), B(nx_1,ny_1,nz_1)
    real, allocatable :: sendbuf(:)
    integer i1(nproc), i2(nproc), &
            j1(nproc), j2(nproc), &
            k1(nproc), k2(nproc)

    ! Scatter an array among the processors
    ! including overlapping borders

    if (myid == idroot) then
        do ip=1,nproc
            i1(ip) = ibmino_1(icoord(1,ip))
            j1(ip) = jbmino_1(icoord(2,ip))
            k1(ip) = kbmino_1(icoord(3,ip))
        end do
        i2 = i1 + nx_1 - 1
        j2 = j1 + ny_1 - 1
        k2 = k1 + nz_1 - 1

        allocate (sendbuf(nproc*nxyz_1))
        L = 0
        do ip=1,nproc
            do k=k1(ip),k2(ip)
            do j=j1(ip),j2(ip)
            do i=i1(ip),i2(ip)
                L = L + 1
                sendbuf(L) = A(i,j,k)
            end do
            end do
            end do
        end do
    end if
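(For reference, the MPI_IN_PLACE variant suggested further down the thread by Dave Goodell is given there in C-like pseudocode; written out in Fortran syntax, reusing the names from the code above, it would look something like the sketch below. This is untested and only illustrates the syntax — note the "then"/"end if" and "call" that Fortran requires:)

```fortran
    ! At the root, pass MPI_IN_PLACE as the receive buffer; all other
    ! ranks receive into B as before.  (Sketch only, following Dave
    ! Goodell's suggestion; for scatter, MPI_IN_PLACE replaces the
    ! recvbuf, and only on the root process.)
    if (myid == idroot) then
        call MPI_scatter (sendbuf, nxyz_1, MPI_REAL8, &
                          MPI_IN_PLACE, nxyz_1, MPI_REAL8, &
                          idroot, icomm_grid, ierr)
    else
        call MPI_scatter (sendbuf, nxyz_1, MPI_REAL8, &
                          B, nxyz_1, MPI_REAL8, &
                          idroot, icomm_grid, ierr)
    end if
```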

Can anyone tell me anything else that I should be checking? Again, bear in mind that this is a well-established code used with mpich1, so the programming of my messenger file should really be in order.
thanks kate
________________________________________
From: chan at mcs.anl.gov [chan at mcs.anl.gov]
Sent: 28 April 2010 16:23
To: Steenhauer, Kate
Cc: mpich-discuss at mcs.anl.gov
Subject: Re: [mpich-discuss] MPI_IN_PLACE argument

----- "Kate Steenhauer" <k.steenhauer at abdn.ac.uk> wrote:

> As the include "mpif.inc" statement is repeated so many times, I
> copied the mpif.h file from the mpich2 include directory in the
> user's location and renamed it "mpif.inc". This then gets past the
> MPI initialisation problem.

Most editors, e.g. vi, support search/replace operations, so
you can replace mpif.inc with mpif.h.

Just to be safe, you can run the following script to replace mpif.inc
with mpif.h in every .f90 file in the working directory:

**************************
#!/bin/sh

for file in *.f90 ; do
    echo "Replacing mpif.inc in $file by mpif.h"
    mv ${file} ${file}.old
    sed -e 's|mpif\.inc|mpif\.h|g' ${file}.old > ${file}
done
**************************

The script saves the original copy of each .f90 file with the extra suffix .old.
You can do a "diff ${file}.old ${file}" if you worry whether the script did
the right thing...
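For example, the substitution can be sanity-checked on a one-line dummy file before running the script over real sources (demo.f90 is a hypothetical name used only for this check):

```shell
# Create a one-line dummy source file and run the same sed substitution;
# the output should read: include "mpif.h"
printf 'include "mpif.inc"\n' > demo.f90
sed -e 's|mpif\.inc|mpif\.h|g' demo.f90
rm -f demo.f90
```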


>
> The compiler is an Intel fortran compiler version 11.1. I did not
> configure it myself. But I can do this again myself. I imagine the
> installation procedure was the most straightforward, nothing
> specifically specified. kate

I used the Intel 11 Fortran compiler to compile your messenger.mpi8.f90;
I don't see any error message about an "IF-clause"...

A.Chan

>
> -----Original Message-----
> From: mpich-discuss-bounces at mcs.anl.gov
> [mailto:mpich-discuss-bounces at mcs.anl.gov] On Behalf Of Anthony Chan
> Sent: 28 April 2010 15:41
> To: mpich-discuss at mcs.anl.gov
> Subject: Re: [mpich-discuss] MPI_IN_PLACE argument
>
>
> Your enclosed messenger.mpi8.f90 still contains the invalid MPI header
> file
>
> include "mpif.inc"
>
> should be
>
> include "mpif.h"
>
> Otherwise the compiler will locate the mpif.inc that you have in your
> source.
> "CPPFLAGS=-DNDEBUG" should be used as part of configure command.
> How do you configure mpich2 ?  What f90 compiler you are using ?
>
> A.Chan
>
> ----- "Kate Steenhauer" <k.steenhauer at abdn.ac.uk> wrote:
>
> > For a quick start I would like to disable the error message first.
> > Does that mean I have to reinstall mpich2? Then, when I configure
> > it with ./configure, I don't understand where or how to disable
> > the bug using "CPPFLAGS=-DNDEBUG"?
> >
> > Further, trying to resolve the problem with my very limited
> > knowledge, I tried what you suggested, only updating the scatter
> > routine, which gives me the following compiler error (I have
> > attached the messenger file):
> > /opt/mpich2/bin/mpif90  -c messenger.mpi8.f90 -o obj/messenger.mpi8.o
> > In file messenger.mpi8.f90:238
> >
> > if (rank == idroot)
> >                   1
> > Error: Unclassifiable statement in IF-clause at (1)  In file
> > messenger.mpi8.f90:242
> >
> > else
> >    1
> > Error: Unexpected ELSE statement at (1)
> >  In file messenger.mpi8.f90:252
> >
> >     if (myid == idroot) deallocate (sendbuf)
> >                                           1
> > Error: Expression in DEALLOCATE statement at (1) must be
> ALLOCATABLE
> > or a POINTER
> > make[2]: *** [messenger.mpi8.o] Error 1
> > make[2]: Leaving directory `/home/eng923/LES/jet/usr'
> > make[1]: *** [options] Error 2
> > make[1]: Leaving directory `/home/eng923/LES/jet/usr'
> > make: *** [opt] Error 2
> >
> > The entire subroutine is as follows (see also attached file); do I
> > not need to change 'call updateBorder' then as well?
> >
> > subroutine scatter(A, B)
> >     use messenger
> >     include "mpif.inc"
> >     real A(nx,ny,nz), B(nx_1,ny_1,nz_1)
> >     real, allocatable :: sendbuf(:)
> >     integer i1(nproc), i2(nproc), &
> >             j1(nproc), j2(nproc), &
> >             k1(nproc), k2(nproc)
> >
> >     ! Scatter an array among the processors
> >     ! including overlapping borders
> >
> >     if (myid == idroot) then
> >         do ip=1,nproc
> >             i1(ip) = ibmino_1(icoord(1,ip))
> >             j1(ip) = jbmino_1(icoord(2,ip))
> >             k1(ip) = kbmino_1(icoord(3,ip))
> >         end do
> >         i2 = i1 + nx_1 - 1
> >         j2 = j1 + ny_1 - 1
> >         k2 = k1 + nz_1 - 1
> >
> >         allocate (sendbuf(nproc*nxyz_1))
> >         L = 0
> >         do ip=1,nproc
> >             do k=k1(ip),k2(ip)
> >             do j=j1(ip),j2(ip)
> >             do i=i1(ip),i2(ip)
> >                 L = L + 1
> >                 sendbuf(L) = A(i,j,k)
> >             end do
> >             end do
> >             end do
> >         end do
> >     end if
> >
> >     call MPI_scatter (sendbuf, nxyz_1, MPI_REAL8, &
> >                       B, nxyz_1, MPI_REAL8, &
> >                       idroot, icomm_grid, ierr)
> >
> >     call updateBorder(B, nx_1,ny_1,nz_1, 1, 1)
> >     call updateBorder(B, nx_1,ny_1,nz_1, 2, 2)
> >     call updateBorder(B, nx_1,ny_1,nz_1, 3, 3)
> >
> >     if (myid == idroot) deallocate (sendbuf)
> >
> >     return
> > end
> >
> > Thanks
> > Kate
> >
> >
> >
> > -----Original Message-----
> > From: mpich-discuss-bounces at mcs.anl.gov
> > [mailto:mpich-discuss-bounces at mcs.anl.gov] On Behalf Of Dave
> Goodell
> > Sent: 28 April 2010 14:42
> > To: mpich-discuss at mcs.anl.gov
> > Subject: Re: [mpich-discuss] MPI_IN_PLACE argument
> >
> > Please see the following mpich-discuss@ threads:
> >
> >
> > https://lists.mcs.anl.gov/mailman/htdig/mpich-discuss/2010-April/006974.html
> > https://lists.mcs.anl.gov/mailman/htdig/mpich-discuss/2010-March/006658.html
> >
> > Basically, in a pinch you can disable this error check by passing
> > "CPPFLAGS=-DNDEBUG" to configure.  But your code is still invalid
> > MPI code and has some chance of erroneous behavior.  The same
> > buffer will still be passed for both arguments of memcpy, which
> > memcpy does not permit and may result in undefined behavior.
> >
> > If you want to convert your code to use MPI_IN_PLACE, you will
> > need to replace either the sendbuf or the recvbuf with
> > MPI_IN_PLACE, but *only* at the root process.
> >
> > So your MPI_Scatter becomes something like:
> >
> > if (rank == idroot)
> >      MPI_Scatter(sendbuf, nxyz_1, MPI_REAL8, MPI_IN_PLACE, nxyz_1,
> > MPI_REAL8, idroot, icomm_grid, ierr);
> > else
> >      MPI_Scatter(sendbuf, nxyz_1, MPI_REAL8, B, nxyz_1, MPI_REAL8,
> > idroot, icomm_grid, ierr);
> >
> >
> > For scatter/scatterv, MPI_IN_PLACE should be passed as the recvbuf.
> > For gather and most other collectives, MPI_IN_PLACE should be
> > passed as the sendbuf.
> >
> > The MPI standard provides more information about MPI_IN_PLACE:
> > http://www.mpi-forum.org/docs/mpi-2.2/mpi22-report.pdf
> >
> > -Dave
> >
> > On Apr 28, 2010, at 5:09 AM, Steenhauer, Kate wrote:
> >
> > > Hi,
> > >
> > > I am trying to sort out the bug in the mpi program (see
> > > attached). It seems to me, with my very limited knowledge, that
> > > everything is related and the whole program needs to be
> > > restructured when upgrading from mpich1 to mpich2. If the
> > > MPI_Scatter routine flags a bug (the error message is 'memcpy
> > > argument memory ranges overlap, dst_=0xafd74f8 src_=0xafd750c
> > > len_=16200, internal ABORT'), then it is, as far as I can see,
> > > most likely that the other routines, such as gather(A, B),
> > > allGather(A, B), scatterXY(A, B, nk), gatherXY(A, B, nk),
> > > allGatherXY(A, B, nk) etc. (see attached), will bring up a
> > > similar bug as well. So I don't really know where to start,
> > > considering this is a well-established code, run for many years
> > > successfully with mpich1 with sensible output.
> > >
> > > When using the MPI_IN_PLACE argument, do I have to replace the
> > > recvbuf, B(nx_1,ny_1,nz_1)?
> > >    call MPI_scatter (sendbuf, nxyz_1, MPI_REAL8, &
> > >                      B, nxyz_1, MPI_REAL8, &
> > >                      idroot, icomm_grid, ierr)
> > > I hope you will be able to help me.
> > > kate
> > >
> > > -----Original Message-----
> > > From: mpich-discuss-bounces at mcs.anl.gov
> > [mailto:mpich-discuss-bounces at mcs.anl.gov
> > > ] On Behalf Of Rajeev Thakur
> > > Sent: 27 April 2010 17:32
> > > To: mpich-discuss at mcs.anl.gov; 'Anthony Chan'
> > > Subject: Re: [mpich-discuss] configuration problem
> > >
> > > Kate,
> > >     You have to use the mpif.h file that comes with the MPI
> > > implementation, not one from some other MPI implementation.
> > > With MPICH2, you have to use MPICH2's include file.
> > >
> > > The error with MPI_Scatter indicates there is a bug in your MPI
> > > program. You are using the same buffer as sendbuf and recvbuf,
> > > which is not allowed in MPI. You can use the MPI_IN_PLACE
> > > argument instead, as described in the MPI standard.
> > >
> > > I would recommend using MPICH2 instead of trying to get MPICH-1
> > > to work.
> > >
> > > Rajeev
> > >
> > >
> > >> -----Original Message-----
> > >> From: mpich-discuss-bounces at mcs.anl.gov
> > >> [mailto:mpich-discuss-bounces at mcs.anl.gov] On Behalf Of
> > >> Steenhauer, Kate
> > >> Sent: Tuesday, April 27, 2010 9:50 AM
> > >> To: Anthony Chan; mpich-discuss at mcs.anl.gov
> > >> Subject: Re: [mpich-discuss] configuration problem
> > >>
> > >> Initially, I tried to run my code with mpich2. The following
> > >> happens.
> > >>
> > >> We have 1 workstation with 8 processors, the following
> > >> software: Linux Centos, Fortran Intel 11.1.
> > >>
> > >> Mpich2, Intel Fortran and Linux are not working on our pc. It
> > >> does not like the mpif.inc file; it immediately gets into
> > >> problems at the initialisation of the MPI (subroutine MPI_INI).
> > >> mpirun -np 8 RUN02
> > >>> Fatal error in MPI_Comm_size: Invalid communicator, error stack:
> > >>> MPI_Comm_size(111): MPI_Comm_size(comm=0x5b, size=0x7fffce629784) failed
> > >>> MPI_Comm_size(69).: Invalid communicator MPISTART
> > >>> rank 7 in job 2  cops-021026_40378   caused collective abort of all ranks
> > >>> exit status of rank 7: killed by signal 9
> > >>
> > >> Then, when I change the mpif.inc file (see attachment) and
> > >> direct it to the mpif.h file that came with the mpich2
> > >> library, it gets past this problem but runs into the next
> > >> problem further down the line at MPI_SCATTER, where it is
> > >> trying to distribute data to the different processors. The
> > >> error message is 'memcpy argument memory ranges overlap,
> > >> dst_=0xafd74f8 src_=0xafd750c len_=16200, internal ABORT'.
> > >>
> > >> There is something in the parameterisation within the mpi
> > >> that is possibly different from when the code is successfully
> > >> run on another cluster (this cluster uses Redhat, various
> > >> mpich versions, e.g. mpich-1.2.5..12, and various versions of
> > >> the Intel FORTRAN compiler (ifort), e.g. 7, 9 and 12).
> > >>
> > >> I have attached the mpif.inc file.
> > >>
> > >> Please let me know if you have any ideas?
> > >>
> > >> Thanks
> > >>
> > >> Kate
> > >>
> > >> -----Original Message-----
> > >> From: mpich-discuss-bounces at mcs.anl.gov
> > >> [mailto:mpich-discuss-bounces at mcs.anl.gov] On Behalf Of
> > >> chan at mcs.anl.gov
> > >> Sent: 27 April 2010 15:42
> > >> To: mpich-discuss at mcs.anl.gov
> > >> Subject: Re: [mpich-discuss] configuration problem
> > >>
> > >>
> > >> Is there any reason you can't use mpich2 ?  The latest stable
> > >> release of mpich2 is 1.2.1p1.
> > >>
> > >> mpich-1 is no longer officially supported.  The latest fortran
> > >> compilers are much better supported in mpich2.
> > >>
> > >> A.Chan
> > >>
> > >> ----- "Kate Steenhauer" <k.steenhauer at abdn.ac.uk> wrote:
> > >>
> > >>> Hello,
> > >>>
> > >>> We are trying to install mpich-1.2.7p1. I downloaded and
> > >>> unzipped this version from
> > >>> http://www.mcs.anl.gov/research/projects/mpi/mpich1/
> > >>>
> > >>> We have 1 workstation with 8 processors, and the following
> > >>> software: Linux Centos and Fortran Intel 11.1.
> > >>>
> > >>> When the documentation guidelines with regard to configuring
> > >>> and making are followed (see the files attached) it all seems
> > >>> ok, e.g. mpif90 is generated. However, when a simple parallel
> > >>> job is tested we get the following error:
> > >>>
> > >>> mpif90 -o testA MPITEST.f90
> > >>> No Fortran 90 compiler specified when mpif90 was created, or
> > >>> configuration file does not specify a compiler.
> > >>>
> > >>> Is there a specific prefix I need to give with an ifort fortran
> > >>> compiler when I configure mpich?
> > >>>
> > >>> I would like to thank you in advance for your help. Please let
> > >>> me know if you need any further details.
> > >>>
> > >>> Regards
> > >>>
> > >>> Kate Steenhauer
> > >>>
> > >>> University of Aberdeen
> > >>>
> > >>> 01224-272806
> > >>>
> > >>>
> > >>>
> > >>>
> > >>>
> > >>> The University of Aberdeen is a charity registered in Scotland,
> > No
> > >>> SC013683.
> > >>>
> > >>> _______________________________________________
> > >>> mpich-discuss mailing list
> > >>> mpich-discuss at mcs.anl.gov
> > >>> https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss
> > >
> > >
> > >
> > >
> > > <messenger.mpi8.f90>
> >
> >
> >
> >
>
>


The University of Aberdeen is a charity registered in Scotland, No SC013683.


More information about the mpich-discuss mailing list