[mpich-discuss] Problem while running example program Cpi with more than 1 task

Thejna Tharammal ttharammal at marum.de
Thu Sep 2 05:02:32 CDT 2010


Hi,

I installed mpich2-1.2.1p1 on a Linux cluster with 6 nodes (Intel Xeon,
3 GHz per node, 64-bit, kernel 2.6.18-128.1.6.el5), using the pgf90 and pgcc
compilers.

When I test the example program cpi with more than 1 task, the result is
printed but the job then aborts with this error:

================

mpiexec -l -n 2 -host k4 ./cpi
0: Process 0 of 2 is on k4
1: Process 1 of 2 is on k4
0: pi is approximately 3.1415926544231318, Error is 0.0000000008333387
0: wall clock time = 0.000232
rank 1 in job 10 k1_37752 caused collective abort of all ranks
 exit status of rank 1: killed by signal 11
rank 0 in job 10 k1_37752 caused collective abort of all ranks
=================

The same happens when I try 2 hosts with 2 tasks on each:

mpiexec -l -n 2 -host k6 ./cpi : -n 2 -host k4 ./cpi
0: Process 0 of 4 is on k6
1: Process 1 of 4 is on k6
3: Process 3 of 4 is on k4
2: Process 2 of 4 is on k4
0: pi is approximately 3.1415926544231239, Error is 0.0000000008333307
0: wall clock time = 0.001073
rank 0 in job 13 k1_37752 caused collective abort of all ranks
 exit status of rank 0: killed by signal 11
===================

The same command with 1 task on each host works fine:

mpiexec -l -n 1 -host k6 ./cpi : -n 1 -host k4 ./cpi
0: Process 0 of 2 is on k6
1: Process 1 of 2 is on k4
0: pi is approximately 3.1415926544231318, Error is 0.0000000008333387
0: wall clock time = 0.033167
 

What could be the reason for this?

Thank you,

Thejna.
