[mpich-discuss] Collective comm not working

Bibrak Qamar bibrakc at gmail.com
Wed Dec 7 07:04:12 CST 2011


Yes, ssh between the machines works.
Point-to-point communication (Send/Recv) also works, and so does Bcast.

Barrier fails with "Communication error with rank XX",

and

Scatter hangs on rank 0.


Has anyone encountered this before?

Bibrak
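
One thing that may be worth ruling out is hostname resolution. This is an assumption, not something confirmed in this thread: if a node's own hostname resolves to a loopback address (for example through an /etc/hosts entry), other ranks may be told to connect to an address they cannot reach, and that can show up only when a collective makes two ranks talk that had not exchanged messages before. A minimal sketch of such a check follows; the function names and the script name check_hosts.py are hypothetical.

```python
import ipaddress
import socket
import sys


def is_loopback(address: str) -> bool:
    """Return True if the textual IP address lies in a loopback range."""
    return ipaddress.ip_address(address).is_loopback


def check_host(hostname: str) -> str:
    """Resolve a hostname and describe the result in one line."""
    try:
        address = socket.gethostbyname(hostname)
    except OSError as exc:
        # socket.gaierror is a subclass of OSError
        return f"{hostname}: does not resolve ({exc})"
    if is_loopback(address):
        return f"{hostname}: resolves to loopback {address}"
    return f"{hostname}: resolves to {address}"


if __name__ == "__main__":
    # Hostnames come from the command line; default to the local hostname.
    hosts = sys.argv[1:] or [socket.gethostname()]
    for host in hosts:
        print(check_host(host))
```

Running it on every node with all three hostnames from the output below (ccitsuseamd1, ccitsuse05, ccitsuse07) would show whether each node sees the same non-loopback address for each name. A clean result does not rule out other causes such as a firewall blocking the ports MPICH2 uses.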



On Wed, Dec 7, 2011 at 4:10 PM, Nicolas Rosner <nrosner at gmail.com> wrote:

> Hi Bibrak,
>
> > The application starts on all the machines but cannot
> > successfully complete the Barrier().
> > What could be the problem, any guess?
>
> Do nontrivial programs that only do p2p comm work fine, without such
> problems? Subject line & example seem to suggest so, but please
> confirm.
>
> Otherwise,
>
>
> http://wiki.mcs.anl.gov/mpich2/index.php/Frequently_Asked_Questions#Q:_My_MPI_program_aborts_with_an_error_saying_it_cannot_communicate_with_other_processes
>
> tried that already?
>
> Hth, N.
>
>
>
> On Wed, Dec 7, 2011 at 5:12 AM, Bibrak Qamar <bibrakc at gmail.com> wrote:
> > Hello all,
> >
> >
> > I am using mpich2-1.4.1p1 on a cluster of machines. I use mpiexec to run
> > the application. The application starts on all the machines but cannot
> > successfully complete the Barrier().
> >
> > What could be the problem, any guess?
> >
> > Hello world from process 0 of 3 | Hostname = ccitsuseamd1
> > Hello world from process 2 of 3 | Hostname = ccitsuse05
> > Hello world from process 1 of 3 | Hostname = ccitsuse07
> >
> >
> > Fatal error in PMPI_Barrier: Other MPI error, error stack:
> > PMPI_Barrier(425)...............: MPI_Barrier(MPI_COMM_WORLD) failed
> > MPIR_Barrier_impl(331)..........: Failure during collective
> > MPIR_Barrier_impl(313)..........:
> > MPIR_Barrier_intra(83)..........:
> > MPIDI_CH3U_Recvq_FDU_or_AEP(380): Communication error with rank 0
> > Hello world After Barrier from process 1 of 3 | Hostname = ccitsuse07
> >
> >
> >
> > Thanks
> > Bibrak
> >
> > _______________________________________________
> > mpich-discuss mailing list     mpich-discuss at mcs.anl.gov
> > To manage subscription options or unsubscribe:
> > https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss
> >