[mpich-discuss] Problem with MPI_GATHER on multiple machines (F90)

Rajeev Thakur thakur at mcs.anl.gov
Mon Nov 21 16:57:24 CST 2011


See if other MPI programs run across multiple machines. For example, the cpi example in the examples directory.
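For instance, a typical way to build and launch cpi across two machines with MPICH's mpiexec (a sketch; "node1"/"node2" are placeholder hostnames, and paths may differ on your installation):

```shell
# Build the cpi example from the MPICH examples directory.
cd examples
make cpi

# List one hostname per line in a hostfile; these names are placeholders.
printf 'node1\nnode2\n' > hosts

# Launch two processes spread across the machines in the hostfile.
mpiexec -f hosts -n 2 ./cpi
```

If cpi also fails across machines, the problem is in the network/launcher setup rather than in your program.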

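To isolate MPI_GATHER itself, you could also try a minimal standalone gather (a sketch, not your actual code; the count of 512 double-complex values per rank is taken from the error stack, and note the receive buffer must be allocated on the root before the call):

```fortran
! Minimal MPI_GATHER check (sketch): each rank sends n double-complex
! values; root 0 gathers them. Run across the same machines that fail.
program gather_check
  use mpi
  implicit none
  integer, parameter :: n = 512
  complex(kind=kind(0.0d0)) :: sbuf(n)
  complex(kind=kind(0.0d0)), allocatable :: rbuf(:)
  integer :: ierr, rank, ntasks

  call MPI_INIT(ierr)
  call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
  call MPI_COMM_SIZE(MPI_COMM_WORLD, ntasks, ierr)

  sbuf = cmplx(rank, 0, kind=kind(0.0d0))
  ! The receive buffer only needs to be valid on the root.
  if (rank == 0) allocate(rbuf(n*ntasks))

  call MPI_GATHER(sbuf, n, MPI_DOUBLE_COMPLEX, rbuf, n, &
                  MPI_DOUBLE_COMPLEX, 0, MPI_COMM_WORLD, ierr)

  if (rank == 0) print *, 'gather ok, received ', n*ntasks, ' values'
  call MPI_FINALIZE(ierr)
end program gather_check
```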

On Nov 21, 2011, at 3:51 PM, Chavez, Andres wrote:

> When restricted to running on one machine, my F90 program works perfectly, but when I try to run it on multiple machines the problem below occurs. I can't figure out what is going wrong; any help will be greatly appreciated. Thank you.
> 
> Fatal error in PMPI_Gather: Other MPI error, error stack:
> PMPI_Gather(863)..................: MPI_Gather(sbuf=0xeb59a0, scount=512, MPI_DOUBLE_COMPLEX, rbuf=(nil), rcount=512, MPI_DOUBLE_COMPLEX, root=0, MPI_COMM_WORLD) failed
> MPIR_Gather_impl(693).............: 
> MPIR_Gather(655)..................: 
> MPIR_Gather_intra(283)............: 
> MPIC_Send(66).....................: 
> MPIC_Wait(540)....................: 
> MPIDI_CH3I_Progress(402)..........: 
> MPID_nem_mpich2_blocking_recv(905): 
> MPID_nem_tcp_connpoll(1838).......: 
> state_listening_handler(1908).....: accept of socket fd failed - Invalid argument
> Fatal error in PMPI_Gather: Other MPI error, error stack:
> PMPI_Gather(863)..........: MPI_Gather(sbuf=0x25d39e0, scount=512, MPI_DOUBLE_COMPLEX, rbuf=0x25bd9b0, rcount=512, MPI_DOUBLE_COMPLEX, root=0, MPI_COMM_WORLD) failed
> MPIR_Gather_impl(693).....: 
> MPIR_Gather(655)..........: 
> MPIR_Gather_intra(202)....: 
> dequeue_and_set_error(596): Communication error with rank 1
> 
> These are all the instances of MPI_GATHER:
> call MPI_GATHER(xi_dot_matrix_transp,na*n_elements*nsd/numtasks,MPI_DOUBLE_COMPLEX,xi_dot_matrix_gath,&
>      na*n_elements*nsd/numtasks,MPI_DOUBLE_COMPLEX,0,MPI_COMM_WORLD,ierr)
> call MPI_GATHER(Matrix_A_hat_3d_transp,5*na*size_matrix*nsd/numtasks,MPI_DOUBLE_COMPLEX,&
>      Matrix_A_hat_3d_gath,5*na*size_matrix*nsd/numtasks,MPI_DOUBLE_COMPLEX,0,MPI_COMM_WORLD,ierr)
> call MPI_GATHER(JR_matrix_transp,5*na*size_matrix*nsd/numtasks,MPI_INTEGER,JR_matrix_gath,&
>      5*na*size_matrix*nsd/numtasks,MPI_INTEGER,0,MPI_COMM_WORLD,ierr)
> call MPI_GATHER(JC_matrix_transp,5*na*size_matrix*nsd/numtasks,MPI_INTEGER,JC_matrix_gath,&
>      5*na*size_matrix*nsd/numtasks,MPI_INTEGER,0,MPI_COMM_WORLD,ierr)
> 
> _______________________________________________
> mpich-discuss mailing list     mpich-discuss at mcs.anl.gov
> To manage subscription options or unsubscribe:
> https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss


