[mpich-discuss] Problem sometimes when running on winxp on >=2 processes and MPE_IBCAST

Jayesh Krishna jayesh at mcs.anl.gov
Wed May 7 11:24:43 CDT 2008


 Hi,
  Please find my observations below:

1) As Anthony pointed out, you don't need to call MPI_Barrier() in a loop
over all processes; an MPI collective is already called by every rank of
the communicator (a sketch of an alternative follows below).
2) When running the program with more than 4 processes, some array
accesses go out of bounds. Try recompiling your program with run-time
checking for "Array and String bounds" enabled (if you are using VS, see
"Configuration Properties" --> Fortran --> Run-time for the run-time
checking settings).
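
For point 1, here is a minimal sketch of how the do loop of MPI_Bcast
calls can collapse into a single collective. It assumes MPI_IN_PLACE
(MPI-2) is available in your MPICH2 build; the names (counts, displs,
buf) are illustrative, not taken from your attached code:

program allgatherv_sketch
implicit none
include "mpif.h"
integer, parameter :: n = 8
integer :: rank, nprocs, ierr, k
integer, allocatable :: counts(:), displs(:), buf(:)

call MPI_Init(ierr)
call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
call MPI_Comm_size(MPI_COMM_WORLD, nprocs, ierr)

allocate (counts(0:nprocs-1), displs(0:nprocs-1), buf(n))

! block distribution of the n entries over the ranks
do k = 0, nprocs - 1
   counts(k) = n/nprocs + merge(1, 0, k < mod(n, nprocs))
end do
displs(0) = 0
do k = 1, nprocs - 1
   displs(k) = displs(k-1) + counts(k-1)
end do

! each rank fills only its own block
buf = 0
buf(displs(rank)+1 : displs(rank)+counts(rank)) = rank

! one collective replaces the do loop of MPI_Bcast calls over every root;
! with MPI_IN_PLACE each rank contributes its block of buf directly
call MPI_Allgatherv(MPI_IN_PLACE, 0, MPI_INTEGER, &
                    buf, counts, displs, MPI_INTEGER, MPI_COMM_WORLD, ierr)

if (rank == 0) write (*,'(8i5)') buf

deallocate (counts, displs, buf)
call MPI_Finalize(ierr)
end program allgatherv_sketch

For point 2, the command-line equivalents of the VS setting should be
/check:bounds for Intel Visual Fortran and -fbounds-check for gfortran
(check your compiler docs for the exact flag).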

Regards,
Jayesh

-----Original Message-----
From: owner-mpich-discuss at mcs.anl.gov
[mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Anthony Chan
Sent: Wednesday, May 07, 2008 11:13 AM
To: mpich-discuss at mcs.anl.gov
Subject: Re: [mpich-discuss] Problem sometimes when running on winxp on
>=2 processes and MPE_IBCAST


This may not be related to the error you saw, but you shouldn't call
MPI_Barrier and MPI_Bcast in a do loop over processes (a sketch of an
alternative follows).
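
If the barrier loop is only there to order the printed output, a minimal
sketch of one alternative is to pass a token from rank to rank. Even this
does not strictly guarantee stdout ordering across processes; gathering
the data to one rank and printing there is the robust approach.

program ordered_print_sketch
implicit none
include "mpif.h"
integer :: rank, nprocs, ierr, token
integer :: istatus(MPI_STATUS_SIZE)

call MPI_Init(ierr)
call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
call MPI_Comm_size(MPI_COMM_WORLD, nprocs, ierr)

token = 0
! wait for the token from the previous rank, print, then pass it on
if (rank > 0) call MPI_Recv(token, 1, MPI_INTEGER, rank-1, 0, &
                            MPI_COMM_WORLD, istatus, ierr)
print *, 'output from rank', rank
if (rank < nprocs-1) call MPI_Send(token, 1, MPI_INTEGER, rank+1, 0, &
                                   MPI_COMM_WORLD, ierr)

call MPI_Finalize(ierr)
end program ordered_print_sketch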

A.Chan
----- "Ben Tay" <zonexo at gmail.com> wrote:

> Hi Rajeev,
> 
> I've attached the code. Thank you very much.
> 
> Regards.
> 
> Rajeev Thakur wrote:
> > Can you send us the code? 
> >
> > MPE_IBCAST is not a part of the MPI standard. There is no equivalent
> > for it in MPICH2. You could spawn a thread that calls MPI_Bcast,
> > though (after following all the caveats of MPI and threads as defined
> > in the standard).
> >
> > Rajeev
> >
> >> -----Original Message-----
> >> From: owner-mpich-discuss at mcs.anl.gov 
> >> [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Ben Tay
> >> Sent: Wednesday, May 07, 2008 10:25 AM
> >> To: mpich-discuss at mcs.anl.gov
> >> Subject: [mpich-discuss] Problem sometimes when running on winxp on 
> >> >=2 processes and MPE_IBCAST
> >>
> >> Hi,
> >>
> >> I tried to run an MPI code copied from an example in the RS 6000
> >> book. It is supposed to broadcast and synchronize all values. When I
> >> ran it on my school's Linux servers, there was no problem. However,
> >> when I run it on my own WinXP machine on >=2 processes, sometimes it
> >> works; other times I get the error:
> >>
> >> [01:3216].....ERROR:result command received but the wait_list is empty.
> >> [01:3216]...ERROR:unable to handle the command: "cmd=result src=1
> >> dest=1 tag=7 cmd_tag=3 cmd_orig=dbget ctx_key=1 value="port=1518
> >> description=gotchama-16e5ed ifname=192.168.1.105 " result=DBS_SUCCESS "
> >> [01:3216].ERROR:error closing the unknown context socket: generic
> >> socket failure, error stack:
> >> MPIDU_Sock_wait(2603): The I/O operation has been aborted because of
> >> either a thread exit or an application request. (errno 995)
> >> [01:3216]..ERROR:sock_op_close returned while unknown context is in
> >> state: SMPD_IDLE
> >>
> >> Or
> >>
> >> [01:3308].....ERROR:result command received but the wait_list is empty.
> >> [01:3308]...ERROR:unable to handle the command: "cmd=result src=1
> >> dest=1 tag=15 cmd_tag=5 cmd_orig=barrier ctx_key=0 result=DBS_SUCCESS "
> >> [01:3308]..ERROR:sock_op_close returned while unknown context is in
> >> state: SMPD_IDLE
> >>
> >> There is no problem if I run on 1 process. If it's >=4, the error
> >> happens all the time. Moreover, it's a rather simple code, so there
> >> shouldn't be anything wrong with it. Why is this so?
> >>
> >> Btw, the RS 6000 book also mentions a routine called MPE_IBCAST,
> >> which is a non-blocking version of MPI_BCAST. Is there a similar
> >> routine in MPICH2?
> >>
> >> Thank you very much
> >>
> >> Regards.
> 
> 
> program mpi_test2
>
> !	test to show updating for i,j double loop (partial continuous data) for specific req data only
> !	ie update u(2:6,2:6) values instead of all u values, also for struct data
> !	FVM use
>
> implicit none
>
> include "mpif.h"
>
> integer, parameter :: size_x=8,size_y=8
> integer :: i,j,k,ierr,rank,nprocs,u(size_x,size_y)
> integer :: jsta,jend,jsta2,jend1,inext,iprev,isend1,irecv1,isend2
> integer :: irecv2,is,ie,js,je
> integer, allocatable :: jjsta(:), jjlen(:),jjreq(:),u_tmp(:,:)
> INTEGER istatus(MPI_STATUS_SIZE)
>
> call MPI_Init(ierr)
> call MPI_Comm_rank(MPI_COMM_WORLD,rank,ierr)
> call MPI_Comm_size(MPI_COMM_WORLD,nprocs,ierr)
>
> allocate (jjsta(0:nprocs-1),jjlen(0:nprocs-1),jjreq(0:nprocs-1))
>
> is=3;	ie=6;	js=3;	je=6
>
> allocate (u_tmp(is:ie,js:je))
>
> do k = 0, nprocs - 1
> 	call para_range(js,je, nprocs, k, jsta, jend)
> 	jjsta(k) = jsta
> 	jjlen(k) = (ie-is+1) * (jend - jsta + 1)
> end do
>
> call para_range(js, je, nprocs, rank , jsta, jend)
>
> do j=jsta,jend
> 	do i=is,ie
> 		u(i,j)=(j-1)*size_x+i
> 	end do
> end do
>
> do j=jsta,jend
> 	do i=is,ie
> 		u_tmp(i,j)=u(i,j)
> 	end do
> end do
>
> do k=0,nprocs-1
> 	call MPI_Barrier(MPI_COMM_WORLD,ierr)
> 	if (k==rank) then
> 		print *, rank
> 		write (*,'(8i5)') u
> 	end if
> end do
>
> do k = 0, nprocs - 1
> 	call MPI_BCAST(u_tmp(is,jjsta(k)), jjlen(k), MPI_Integer, k, MPI_COMM_WORLD, ierr)
> end do
>
> deallocate (jjsta, jjlen, jjreq)
>
> u(is:ie,js:je)=u_tmp(is:ie,js:je)
>
> do k=0,nprocs-1
> 	call MPI_Barrier(MPI_COMM_WORLD,ierr)
> 	if (k==rank) then
> 		print *, rank
> 		write (*,'(8i5)') u
> 	end if
> end do
>
> call MPI_Finalize(ierr)
>
> contains
>
> subroutine para_range(n1, n2, nprocs, irank, ista, iend)
> !	block distribution
> integer n1 !The lowest value of the iteration variable (IN)
> integer n2 !The highest value of the iteration variable (IN)
> integer nprocs !The number of processes (IN)
> integer irank !The rank for which you want to know the range of iterations (IN)
> integer ista !The lowest value of the iteration variable that process irank executes (OUT)
> integer iend !The highest value of the iteration variable that process irank executes (OUT)
> integer iwork1,iwork2
>
> iwork1 = (n2 - n1 + 1) / nprocs
> iwork2 = mod(n2 - n1 + 1, nprocs)
> ista = irank * iwork1 + n1 + min(irank, iwork2)
> iend = ista + iwork1 - 1
> if (iwork2 > irank) iend = iend + 1
>
> end subroutine para_range
>
> end program mpi_test2
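
On the MPE_IBCAST question: MPICH2 has no non-blocking broadcast, but
here is a minimal sketch of the thread-based workaround Rajeev describes
above. It assumes your MPI library provides MPI_THREAD_MULTIPLE (check
the value returned in "provided") and that your compiler supports
OpenMP; the program itself is illustrative:

program ibcast_sketch
implicit none
include "mpif.h"
integer :: rank, ierr, provided
integer :: buf(8)

call MPI_Init_thread(MPI_THREAD_MULTIPLE, provided, ierr)
call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)

if (provided < MPI_THREAD_MULTIPLE) then
   if (rank == 0) print *, 'MPI_THREAD_MULTIPLE not provided'
   call MPI_Abort(MPI_COMM_WORLD, 1, ierr)
end if

buf = rank

! one thread runs the broadcast while the other computes; the computing
! thread must not touch buf until the broadcast completes
!$omp parallel sections num_threads(2)
!$omp section
call MPI_Bcast(buf, 8, MPI_INTEGER, 0, MPI_COMM_WORLD, ierr)
!$omp section
print *, 'rank', rank, 'working while the broadcast proceeds'
!$omp end parallel sections

if (rank /= 0) write (*,'(8i5)') buf

call MPI_Finalize(ierr)
end program ibcast_sketch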

