[mpich-discuss] Problem while running example program Cpi with more than 1 task

Pavan Balaji balaji at mcs.anl.gov
Thu Sep 2 11:16:39 CDT 2010


The difference between the working case and the failing cases is the use 
of shared memory -- the working case uses no shared memory, since it 
never places multiple processes on the same node.
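
To make that distinction concrete, here is a minimal sketch (not part of 
MPICH; the helper names ranks_by_host and shares_a_node are hypothetical) 
that reads the "Process N of M is on HOST" lines from the mpiexec -l 
output quoted below and reports whether any host runs more than one 
rank. It only classifies rank placement; it does not diagnose the 
segfault itself.

```python
import re
from collections import defaultdict

# Matches placement lines such as "0: Process 0 of 2 is on k4",
# with or without a leading "> " quote marker.
PLACEMENT = re.compile(r"Process (\d+) of (\d+) is on (\S+)")


def ranks_by_host(output):
    """Group the rank numbers by the host name each rank reported."""
    hosts = defaultdict(list)
    for line in output.splitlines():
        match = PLACEMENT.search(line)
        if match:
            hosts[match.group(3)].append(int(match.group(1)))
    return {host: sorted(ranks) for host, ranks in hosts.items()}


def shares_a_node(output):
    """Return True if any host runs more than one rank.

    Co-located ranks are the ones that may communicate through
    shared memory instead of the network.
    """
    return any(len(ranks) > 1 for ranks in ranks_by_host(output).values())


# Placement lines copied from the quoted runs.
failing = """0: Process 0 of 2 is on k4
1: Process 1 of 2 is on k4"""
working = """0: Process 0 of 2 is on k6
1: Process 1 of 2 is on k4"""

print(shares_a_node(failing))  # True: both ranks are on k4
print(shares_a_node(working))  # False: one rank per host
```

Both failing runs place two ranks on at least one host, while the 
working run places one rank per host, which matches the pattern 
described above.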

Can you try out the 1.3b1 release of Hydra before we go digging into this?

  -- Pavan

On 09/02/2010 05:02 AM, Thejna Tharammal wrote:
> Hi,
>
> I installed mpich2-1.2.1p1 on a Linux cluster with 6 nodes (Intel
> Xeon, 3 GHz/node, 64-bit, kernel 2.6.18-128.1.6.el5), using the pgf90
> and pgcc compilers.
>
> While testing the example program cpi with more than 1 task, it shows
> the following error:
>
> ================
>
> mpiexec -l -n 2 -host k4 ./cpi
> 0: Process 0 of 2 is on k4
> 1: Process 1 of 2 is on k4
> 0: pi is approximately 3.1415926544231318, Error is 0.0000000008333387
> 0: wall clock time = 0.000232
> rank 1 in job 10 k1_37752 caused collective abort of all ranks
> exit status of rank 1: killed by signal 11
> rank 0 in job 10 k1_37752 caused collective abort of all ranks
> =================
>
> And when I try with 2 hosts:
>
> mpiexec -l -n 2 -host k6 ./cpi : -n 2 -host k4 ./cpi
> 0: Process 0 of 4 is on k6
> 1: Process 1 of 4 is on k6
> 3: Process 3 of 4 is on k4
> 2: Process 2 of 4 is on k4
> 0: pi is approximately 3.1415926544231239, Error is 0.0000000008333307
> 0: wall clock time = 0.001073
> rank 0 in job 13 k1_37752 caused collective abort of all ranks
> exit status of rank 0: killed by signal 11
> ===================
>
> The same run with 1 task per host works fine:
>
> mpiexec -l -n 1 -host k6 ./cpi : -n 1 -host k4 ./cpi
> 0: Process 0 of 2 is on k6
> 1: Process 1 of 2 is on k4
> 0: pi is approximately 3.1415926544231318, Error is 0.0000000008333387
> 0: wall clock time = 0.033167
>
> What could be the reason for this?
>
> Thank you,
>
> Thejna.

-- 
Pavan Balaji
http://www.mcs.anl.gov/~balaji

