[mpich-discuss] Problem sometimes when running on winxp on >=2 processes and MPE_IBCAST

Ben Tay zonexo at gmail.com
Wed May 7 19:45:27 CDT 2008


Hi,

I've removed the MPI_Barrier calls and enabled bounds checking. 
However, the same error still occurs: randomly with 2 processes and 
always with 4. I did not encounter this error when I ran it on my 
school's servers.

I also just compiled it with MPICH. Interestingly, there was no problem 
at all, so I guess this problem is due to MPICH2.

Thank you very much.

Jayesh Krishna wrote:
>
>  Hi,
>   Please find my observations below,
>
> 1) As Anthony pointed out, you don't have to call MPI_Barrier() in a 
> loop over all processes (see the usage of MPI collectives).
> 2) When running the program with more than 4 procs, some array 
> accesses are out of bounds. Try recompiling your program with runtime 
> checking for "Array and String bounds". If you are using VS, check 
> "Configuration Properties" --> Fortran --> Runtime --> * to set the 
> runtime checking.
>
> Regards,
> Jayesh
>
> -----Original Message-----
> From: owner-mpich-discuss at mcs.anl.gov 
> [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Anthony Chan
> Sent: Wednesday, May 07, 2008 11:13 AM
> To: mpich-discuss at mcs.anl.gov
> Subject: Re: [mpich-discuss] Problem sometimes when running on winxp 
> on >=2 processes and MPE_IBCAST
>
>
> This may not be related to the error that you saw, but you shouldn't 
> call MPI_Barrier and MPI_Bcast in a do loop over processes.
>
> A.Chan
> ----- "Ben Tay" <zonexo at gmail.com> wrote:
>
> > Hi Rajeev,
> >
> > I've attached the code. Thank you very much.
> >
> > Regards.
> >
> > Rajeev Thakur wrote:
> > > Can you send us the code?
> > >
> > > MPE_IBCAST is not a part of the MPI standard. There is no equivalent
> > > for it in MPICH2. You could spawn a thread that calls MPI_Bcast though
> > > (after following all the caveats of MPI and threads as defined in the
> > > standard).
> > >
> > > Rajeev
> > >
> > >  
> > >> -----Original Message-----
> > >> From: owner-mpich-discuss at mcs.anl.gov
> > >> [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Ben Tay
> > >> Sent: Wednesday, May 07, 2008 10:25 AM
> > >> To: mpich-discuss at mcs.anl.gov
> > >> Subject: [mpich-discuss] Problem sometimes when running on winxp on
> > >> >=2 processes and MPE_IBCAST
> > >>
> > >> Hi,
> > >>
> > >> I tried to run an MPI code copied from an example in the RS
> > >> 6000 book. It is supposed to broadcast and synchronize all values.
> > >> When I ran it on my school's Linux servers, there was no problem.
> > >> However, if I run it on my own WinXP machine on >=2 processes, it
> > >> sometimes works, and other times I get the error:
> > >>
> > >> [01:3216].....ERROR:result command received but the wait_list is empty.
> > >> [01:3216]...ERROR:unable to handle the command: "cmd=result src=1 dest=1 tag=7 cmd_tag=3 cmd_orig=dbget ctx_key=1 value="port=1518 description=gotchama-16e5ed ifname=192.168.1.105 " result=DBS_SUCCESS "
> > >> [01:3216].ERROR:error closing the unknown context socket: generic socket failure , error stack:
> > >> MPIDU_Sock_wait(2603): The I/O operation has been aborted because of either a thread exit or an application request. (errno 995)
> > >> [01:3216]..ERROR:sock_op_close returned while unknown context is in state: SMPD_IDLE
> > >>
> > >> Or
> > >>
> > >> [01:3308].....ERROR:result command received but the wait_list is empty.
> > >> [01:3308]...ERROR:unable to handle the command: "cmd=result src=1 dest=1 tag=15 cmd_tag=5 cmd_orig=barrier ctx_key=0 result=DBS_SUCCESS "
> > >> [01:3308]..ERROR:sock_op_close returned while unknown context is in state: SMPD_IDLE
> > >>
> > >> There is no problem if I run on 1 process. If it's >=4, then the
> > >> error happens all the time. Moreover, it's a rather simple code, so
> > >> there shouldn't be anything wrong with it.
> > >> Why is this so?
> > >>
> > >> Btw, the RS 6000 book also mentions a routine called MPE_IBCAST,
> > >> which is a non-blocking version of MPI_BCAST. Is there a similar
> > >> routine in MPICH2?
> > >>
> > >> Thank you very much
> > >>
> > >> Regards.
> > >>
> > >
> > >
> > >  
> >
> >
> > program mpi_test2
> >
> > !     test to show updating for i,j double loop (partial continuous data) for specific req data only
> >
> > !     ie update u(2:6,2:6) values instead of all u values, also for struct data
> >
> > !     FVM use
> >
> > implicit none
> >
> > include "mpif.h"     
> >
> > integer, parameter :: size_x=8,size_y=8
> >
> > integer :: i,j,k,ierr,rank,nprocs,u(size_x,size_y)
> >
> > integer :: jsta,jend,jsta2,jend1,inext,iprev,isend1,irecv1,isend2
> >
> > integer :: irecv2,is,ie,js,je
> >
> > integer, allocatable :: jjsta(:), jjlen(:),jjreq(:),u_tmp(:,:)
> >
> > INTEGER istatus(MPI_STATUS_SIZE)
> >
> >
> >
> > call MPI_Init(ierr)
> >
> > call MPI_Comm_rank(MPI_COMM_WORLD,rank,ierr)
> >  
> > call MPI_Comm_size(MPI_COMM_WORLD,nprocs,ierr)
> >
> > allocate (jjsta(0:nprocs-1),jjlen(0:nprocs-1),jjreq(0:nprocs-1))
> >
> > is=3; ie=6;   js=3;   je=6
> >
> > allocate (u_tmp(is:ie,js:je))
> >
> >
> >
> > do k = 0, nprocs - 1
> >
> >       call para_range(js,je, nprocs, k, jsta, jend)
> >
> >       jjsta(k) = jsta
> >      
> >       jjlen(k) = (ie-is+1) * (jend - jsta + 1)
> >
> > end do
> >
> > call para_range(js, je, nprocs, rank , jsta, jend)
> >
> > do j=jsta,jend
> >
> >       do i=is,ie
> >
> >               u(i,j)=(j-1)*size_x+i
> >
> >              
> >
> >       end do
> >
> > end do
> >
> > do j=jsta,jend
> >
> >       do i=is,ie
> >
> >               u_tmp(i,j)=u(i,j)
> >
> >              
> >
> >       end do
> >
> > end do
> >
> > do k=0,nprocs-1
> >
> >       call MPI_Barrier(MPI_COMM_WORLD,ierr)
> >
> >       if (k==rank) then
> >
> >               print *, rank
> >
> >               write (*,'(8i5)') u
> >
> >              
> >
> >       end if
> >
> > end do
> >
> > do k = 0, nprocs - 1
> >
> >      
> >
> >       call MPI_BCAST(u_tmp(is,jjsta(k)), jjlen(k), MPI_Integer,k, MPI_COMM_WORLD, ierr)
> >
> > end do
> >
> >
> >
> >
> > deallocate (jjsta, jjlen, jjreq)
> >
> > u(is:ie,js:je)=u_tmp(is:ie,js:je)
> >
> >
> >
> > do k=0,nprocs-1
> >
> >       call MPI_Barrier(MPI_COMM_WORLD,ierr)
> >
> >       if (k==rank) then
> >
> >               print *, rank
> >
> >               write (*,'(8i5)') u
> >
> >              
> >
> >       end if
> >
> > end do
> >
> >
> >
> >
> > call MPI_Finalize(ierr)
> >
> > contains
> >
> > subroutine para_range(n1, n2, nprocs, irank, ista, iend)
> > !     block distribution
> >
> > integer n1 !The lowest value of the iteration variable (IN)
> >
> > integer n2 !The highest value of the iteration variable (IN)
> >
> > integer nprocs !The number of processes (IN)
> >
> > integer irank !The rank for which you want to know the range of iterations (IN)
> >
> > integer ista !The lowest value of the iteration variable that process irank executes (OUT)
> >
> > integer iend !The highest value of the iteration variable that process irank executes (OUT)
> >
> > integer iwork1,iwork2
> >
> > iwork1 = (n2 - n1 + 1) / nprocs
> >
> > iwork2 = mod(n2 - n1 + 1, nprocs)
> >
> > ista = irank * iwork1 + n1 + min(irank, iwork2)
> >
> > iend = ista + iwork1 - 1
> >
> > if (iwork2 > irank) iend = iend + 1
> >
> > end subroutine para_range
> >
> > end program mpi_test2
>
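
Jayesh's remark about out-of-bounds accesses with more than 4 processes can be checked without MPI at all. The following is a minimal sketch, assuming the bounds is=3, ie=6, js=3, je=6 from the attached program; the program name para_range_demo is made up for illustration, and para_range is copied from the attachment. It only illustrates the bounds issue and does not by itself explain the smpd errors reported with 2 or 4 processes.

```fortran
program para_range_demo
  ! Sketch: tabulate the block distribution from the attached program
  ! for the column range js..je = 3..6 and flag ranks with no columns.
  implicit none
  integer, parameter :: js = 3, je = 6   ! same bounds as the attachment
  integer :: nprocs, k, jsta, jend

  do nprocs = 4, 5
     print '(a,i0)', 'nprocs = ', nprocs
     do k = 0, nprocs - 1
        call para_range(js, je, nprocs, k, jsta, jend)
        if (jend < jsta) then
           ! Empty range: column jsta lies outside u_tmp(is:ie,js:je),
           ! so u_tmp(is,jsta) is an out-of-bounds reference.
           print '(a,i0,a,i0,a,i0,a)', '  rank ', k, ': jsta=', jsta, &
                ' jend=', jend, ' (empty)'
        else
           print '(a,i0,a,i0,a,i0)', '  rank ', k, ': jsta=', jsta, &
                ' jend=', jend
        end if
     end do
  end do

contains

  subroutine para_range(n1, n2, nprocs, irank, ista, iend)
    ! Block distribution, copied from the attached program.
    integer, intent(in)  :: n1, n2, nprocs, irank
    integer, intent(out) :: ista, iend
    integer :: iwork1, iwork2
    iwork1 = (n2 - n1 + 1) / nprocs
    iwork2 = mod(n2 - n1 + 1, nprocs)
    ista = irank * iwork1 + n1 + min(irank, iwork2)
    iend = ista + iwork1 - 1
    if (iwork2 > irank) iend = iend + 1
  end subroutine para_range

end program para_range_demo
```

With 4 processes each rank owns exactly one column (3, 4, 5, 6). With 5 processes, rank 4 gets jsta=7 and jend=6, so the broadcast loop in the attachment would pass u_tmp(is,7) with a count of 0, which is the kind of reference that runtime bounds checking flags.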
>
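
One possible way to keep the broadcast loop within bounds when there are more processes than columns is to skip roots that own no data. This is only a sketch under that assumption, not a fix confirmed anywhere in the thread; the program name bcast_guard_sketch and the nbad check are made up for illustration, while para_range and the array bounds come from the attachment. It needs an MPI installation to build and run.

```fortran
program bcast_guard_sketch
  ! Sketch: the broadcast loop from the attachment, with ranks that own
  ! no columns skipped so u_tmp is never indexed out of bounds.
  implicit none
  include "mpif.h"
  integer, parameter :: size_x = 8
  integer, parameter :: is = 3, ie = 6, js = 3, je = 6
  integer :: u_tmp(is:ie, js:je)
  integer :: i, j, k, ierr, rank, nprocs, jsta, jend, nbad
  integer, allocatable :: jjsta(:), jjlen(:)

  call MPI_Init(ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
  call MPI_Comm_size(MPI_COMM_WORLD, nprocs, ierr)
  allocate (jjsta(0:nprocs-1), jjlen(0:nprocs-1))

  ! Every process computes the same table of start columns and lengths.
  do k = 0, nprocs - 1
     call para_range(js, je, nprocs, k, jsta, jend)
     jjsta(k) = jsta
     jjlen(k) = (ie - is + 1) * (jend - jsta + 1)
  end do

  ! Fill only the columns owned by this rank.
  u_tmp = 0
  call para_range(js, je, nprocs, rank, jsta, jend)
  do j = jsta, jend
     do i = is, ie
        u_tmp(i, j) = (j - 1) * size_x + i
     end do
  end do

  ! jjlen is identical on all ranks, so all ranks skip the same roots
  ! and the sequence of collective calls still matches everywhere.
  do k = 0, nprocs - 1
     if (jjlen(k) > 0) then
        call MPI_Bcast(u_tmp(is, jjsta(k)), jjlen(k), MPI_INTEGER, k, &
                       MPI_COMM_WORLD, ierr)
     end if
  end do

  ! Each rank checks that it now holds the complete block.
  nbad = 0
  do j = js, je
     do i = is, ie
        if (u_tmp(i, j) /= (j - 1) * size_x + i) nbad = nbad + 1
     end do
  end do
  if (nbad > 0) then
     print *, 'rank', rank, ': mismatched entries =', nbad
     call MPI_Abort(MPI_COMM_WORLD, 1, ierr)
  end if
  if (rank == 0) print *, 'rank 0 holds the full u_tmp block'

  deallocate (jjsta, jjlen)
  call MPI_Finalize(ierr)

contains

  subroutine para_range(n1, n2, nprocs, irank, ista, iend)
    ! Block distribution, copied from the attached program.
    integer, intent(in)  :: n1, n2, nprocs, irank
    integer, intent(out) :: ista, iend
    integer :: iwork1, iwork2
    iwork1 = (n2 - n1 + 1) / nprocs
    iwork2 = mod(n2 - n1 + 1, nprocs)
    ista = irank * iwork1 + n1 + min(irank, iwork2)
    iend = ista + iwork1 - 1
    if (iwork2 > irank) iend = iend + 1
  end subroutine para_range

end program bcast_guard_sketch
```

The guard relies on jjlen being computed identically on every process. If the lengths could differ between ranks, skipping calls this way would leave the collectives mismatched.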



