[mpich-discuss] Collective comm not working

Bibrak Qamar bibrakc at gmail.com
Wed Dec 7 02:12:03 CST 2011


Hello all,


I am using mpich2-1.4.1p1 on a cluster of machines. I use mpiexec to run
the application. The application starts on all the machines but cannot
successfully complete MPI_Barrier().

What could be the problem? Any guesses?
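
For reference, here is a minimal sketch of the kind of test program involved
(a hypothetical reconstruction based on the output below; the actual source
was not attached to this message):

#include <mpi.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
    int rank, size;
    char hostname[256];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    gethostname(hostname, sizeof(hostname));

    printf("Hello world from process %d of %d | Hostname = %s\n",
           rank, size, hostname);

    /* This is the collective call that fails on the cluster. */
    MPI_Barrier(MPI_COMM_WORLD);

    printf("Hello world After Barrier from process %d of %d | Hostname = %s\n",
           rank, size, hostname);

    MPI_Finalize();
    return 0;
}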

Hello world from process 0 of 3 | Hostname = ccitsuseamd1
Hello world from process 2 of 3 | Hostname = ccitsuse05
Hello world from process 1 of 3 | Hostname = ccitsuse07


Fatal error in PMPI_Barrier: Other MPI error, error stack:
PMPI_Barrier(425)...............: MPI_Barrier(MPI_COMM_WORLD) failed
MPIR_Barrier_impl(331)..........: Failure during collective
MPIR_Barrier_impl(313)..........:
MPIR_Barrier_intra(83)..........:
MPIDI_CH3U_Recvq_FDU_or_AEP(380): Communication error with rank 0
Hello world After Barrier from process 1 of 3 | Hostname = ccitsuse07



Thanks
Bibrak