<div dir="ltr">Yes, the machines can ssh to each other.<br>Point-to-point communication (Send/Recv) also works.<br>Bcast works.<br><br>Barrier fails with the error "Communication error with rank XX",<br><br>and Scatter hangs on rank 0.<br><br>
<br>Has anyone encountered this before?<br><br clear="all"><span style="color: rgb(0, 0, 0);">Bibrak </span><span style="color: rgb(0, 0, 0);"><br></span><br>
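<br>One thing that may be worth checking: collectives such as Barrier and Scatter can open connections between pairs of ranks that a simple Send/Recv test never exercises, so every node needs to resolve every other node's hostname to a reachable address. The sketch below is only an illustration. The helper name check_hosts is made up here, and hostname resolution being the cause is an assumption, not something established in this thread. It uses only the Python standard library.<br>
<pre>
```python
import ipaddress
import socket


def check_hosts(hostnames):
    """Resolve each hostname and flag results other nodes could not reach.

    Returns a dict mapping hostname to a (status, detail) tuple, where
    status is "ok", "loopback", or "unresolved".
    """
    report = {}
    for name in hostnames:
        try:
            # IPv4 lookup through the system resolver (/etc/hosts, DNS, ...).
            addr = socket.gethostbyname(name)
        except OSError as exc:
            # socket.gaierror is a subclass of OSError.
            report[name] = ("unresolved", str(exc))
            continue
        # A 127.x.x.x result is only reachable from the node itself.
        if ipaddress.ip_address(addr).is_loopback:
            report[name] = ("loopback", addr)
        else:
            report[name] = ("ok", addr)
    return report
```
</pre>
Calling check_hosts(["ccitsuseamd1", "ccitsuse05", "ccitsuse07"]) on each of the three nodes and comparing the results would show whether any node maps a hostname to a loopback address or cannot resolve it at all. It does not test firewalls or open ports.<br>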
<br><br><div class="gmail_quote">On Wed, Dec 7, 2011 at 4:10 PM, Nicolas Rosner <span dir="ltr">&lt;<a href="mailto:nrosner@gmail.com">nrosner@gmail.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">
Hi Bibrak,<br>
<div class="im"><br>
> The application starts on all the machines but cannot<br>
> successfully complete the Barrier().<br>
> What could be the problem, any guess?<br>
<br>
</div>Do nontrivial programs that only do p2p comm work fine, without such<br>
problems? Subject line &amp; example seem to suggest so, but please<br>
confirm.<br>
<br>
Otherwise,<br>
<br>
<a href="http://wiki.mcs.anl.gov/mpich2/index.php/Frequently_Asked_Questions#Q:_My_MPI_program_aborts_with_an_error_saying_it_cannot_communicate_with_other_processes" target="_blank">http://wiki.mcs.anl.gov/mpich2/index.php/Frequently_Asked_Questions#Q:_My_MPI_program_aborts_with_an_error_saying_it_cannot_communicate_with_other_processes</a><br>
<br>
tried that already?<br>
<br>
Hth, N.<br>
<div><div class="h5"><br>
<br>
<br>
On Wed, Dec 7, 2011 at 5:12 AM, Bibrak Qamar &lt;<a href="mailto:bibrakc@gmail.com">bibrakc@gmail.com</a>&gt; wrote:<br>
> Hello all,<br>
><br>
><br>
> I am using mpich2-1.4.1p1 on a cluster of machines. I use mpiexec to run<br>
> the application. The application starts on all the machines but cannot<br>
> successfully complete the Barrier().<br>
><br>
> What could be the problem, any guess?<br>
><br>
> Hello world from process 0 of 3 | Hostname = ccitsuseamd1<br>
> Hello world from process 2 of 3 | Hostname = ccitsuse05<br>
> Hello world from process 1 of 3 | Hostname = ccitsuse07<br>
><br>
><br>
> Fatal error in PMPI_Barrier: Other MPI error, error stack:<br>
> PMPI_Barrier(425)...............: MPI_Barrier(MPI_COMM_WORLD) failed<br>
> MPIR_Barrier_impl(331)..........: Failure during collective<br>
> MPIR_Barrier_impl(313)..........:<br>
> MPIR_Barrier_intra(83)..........:<br>
> MPIDI_CH3U_Recvq_FDU_or_AEP(380): Communication error with rank 0<br>
> Hello world After Barrier from process 1 of 3 | Hostname = ccitsuse07<br>
><br>
><br>
><br>
> Thanks<br>
> Bibrak<br>
><br>
</div></div>> _______________________________________________<br>
> mpich-discuss mailing list <a href="mailto:mpich-discuss@mcs.anl.gov">mpich-discuss@mcs.anl.gov</a><br>
> To manage subscription options or unsubscribe:<br>
> <a href="https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss" target="_blank">https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss</a><br>
><br>
</blockquote></div><br></div>