[mpich-discuss] Problem running MPI code on multiple nodes

Pavan Balaji balaji at mcs.anl.gov
Wed Nov 23 02:06:58 CST 2011


n13 and n02 are not able to communicate with each other.
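
One way to check this independently of the application is a two-rank
round trip across the two hosts. The following is only a minimal sketch,
not something taken from the original code: the program name "ping", the
token value, and the message tags are made up for illustration, and it
assumes mpiexec places one rank on each host when given -np 2.

```fortran
program ping
  use mpi
  implicit none
  integer :: ierr, rank, nprocs, token, namelen
  integer :: status(MPI_STATUS_SIZE)
  character(len=MPI_MAX_PROCESSOR_NAME) :: name

  call MPI_INIT(ierr)
  call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
  call MPI_COMM_SIZE(MPI_COMM_WORLD, nprocs, ierr)
  call MPI_GET_PROCESSOR_NAME(name, namelen, ierr)

  ! Show where each rank landed, to confirm both hosts are in use.
  print *, 'rank ', rank, ' of ', nprocs, ' on ', name(1:namelen)

  if (nprocs >= 2) then
    if (rank == 0) then
      ! Rank 0 sends a token to rank 1 and waits for it to come back.
      token = 42   ! arbitrary illustrative value
      call MPI_SEND(token, 1, MPI_INTEGER, 1, 0, MPI_COMM_WORLD, ierr)
      call MPI_RECV(token, 1, MPI_INTEGER, 1, 1, MPI_COMM_WORLD, status, ierr)
      print *, 'round trip between rank 0 and rank 1 succeeded'
    else if (rank == 1) then
      ! Rank 1 echoes the token back to rank 0.
      call MPI_RECV(token, 1, MPI_INTEGER, 0, 0, MPI_COMM_WORLD, status, ierr)
      call MPI_SEND(token, 1, MPI_INTEGER, 0, 1, MPI_COMM_WORLD, ierr)
    end if
  end if

  call MPI_FINALIZE(ierr)
end program ping
```

Built with the MPI Fortran compiler wrapper (for example mpif90) and
started as "mpiexec -hosts n13,n02 -np 2 ./ping", this should print the
success line if the two nodes can exchange MPI messages. If it hangs or
fails with a similar communication error, the problem is more likely in
the network setup between the nodes (for example a firewall or hostname
resolution) than in the MPI_GATHER calls.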

  -- Pavan

On 11/18/2011 03:59 PM, Chavez, Andres wrote:
> My Fortran code runs fine when restricted to one node, but when I try to
> run on multiple nodes, the following error occurs.
>
> (When restricted to one host, the code runs perfectly.)
>
> Run line:
> mpiexec -hosts n13,n02 -np 4 ./reg
>
> Error:
> Fatal error in PMPI_Gather: Other MPI error, error stack:
> PMPI_Gather(863)..........: MPI_Gather(sbuf=0x12cc3e0, scount=5120,
> MPI_DOUBLE_COMPLEX, rbuf=(nil), rcount=5120, MPI_DOUBLE_COMPLEX, root=0,
> MPI_COMM_WORLD) failed
> MPIR_Gather_impl(693).....:
> MPIR_Gather(655)..........:
> MPIR_Gather_intra(283)....:
> MPIC_Send(63).............:
> MPIDI_EagerContigSend(186): failure occurred while attempting to send an
> eager message
> MPIDI_CH3_iStartMsgv(44)..: Communication error with rank 2
>
>
> These are all the instances of MPI_GATHER:
>
> call MPI_GATHER(xi_dot_matrix_transp,na*n_elements*nsd/numtasks,MPI_DOUBLE_COMPLEX,xi_dot_matrix_gath,&
>       na*n_elements*nsd/numtasks,MPI_DOUBLE_COMPLEX,0,MPI_COMM_WORLD,ierr)
> call MPI_GATHER(Matrix_A_hat_3d_transp,5*na*size_matrix*nsd/numtasks,MPI_DOUBLE_COMPLEX,&
>       Matrix_A_hat_3d_gath,5*na*size_matrix*nsd/numtasks,MPI_DOUBLE_COMPLEX,0,MPI_COMM_WORLD,ierr)
> call MPI_GATHER(JR_matrix_transp,5*na*size_matrix*nsd/numtasks,MPI_INTEGER,JR_matrix_gath,&
>       5*na*size_matrix*nsd/numtasks,MPI_INTEGER,0,MPI_COMM_WORLD,ierr)
> call MPI_GATHER(JC_matrix_transp,5*na*size_matrix*nsd/numtasks,MPI_INTEGER,JC_matrix_gath,&
>       5*na*size_matrix*nsd/numtasks,MPI_INTEGER,0,MPI_COMM_WORLD,ierr)
>
> Any help is greatly appreciated.
>
> Thank you

-- 
Pavan Balaji
http://www.mcs.anl.gov/~balaji

