[mpich-discuss] Problem while running example program Cpi with more than 1 task
Thejna Tharammal
ttharammal at marum.de
Fri Sep 3 07:43:11 CDT 2010
OK, I tried that.
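For each run below, the host file pointed to by HYDRA_HOST_FILE lists only the
hosts being used. As a sketch (the file name hosts.2 is just an example), the
two-host case is equivalent to passing the file explicitly with -f:

-bash-3.2$ cat hosts.2
k1
k2
-bash-3.2$ mpiexec -f hosts.2 -n 7 ./cpi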
No. of hosts 1:
-bash-3.2$ mpiexec -n 7 ./cpi
Process 1 of 7 is on k1
Process 4 of 7 is on k1
Process 5 of 7 is on k1
Process 2 of 7 is on k1
Process 6 of 7 is on k1
Process 0 of 7 is on k1
Process 3 of 7 is on k1
pi is approximately 3.1415926544231239, Error is 0.0000000008333307
wall clock time = 0.000198
APPLICATION TERMINATED WITH THE EXIT STRING: Terminated (signal 15)
===========
No. of hosts 2:
-bash-3.2$ mpiexec -n 7 ./cpi
Process 0 of 7 is on k1
Process 4 of 7 is on k1
Process 2 of 7 is on k1
Process 6 of 7 is on k1
pi is approximately 3.1415926544231239, Error is 0.0000000008333307
wall clock time = 0.000522
Process 1 of 7 is on k2
Process 5 of 7 is on k2
Process 3 of 7 is on k2
APPLICATION TERMINATED WITH THE EXIT STRING: Terminated (signal 15)
========
No. of hosts 3:
-bash-3.2$ mpiexec -n 7 ./cpi
Process 4 of 7 is on k2
Process 1 of 7 is on k2
Process 2 of 7 is on k3
Process 5 of 7 is on k3
Process 0 of 7 is on k1
Process 3 of 7 is on k1
Process 6 of 7 is on k1
pi is approximately 3.1415926544231239, Error is 0.0000000008333307
wall clock time = 0.000855
APPLICATION TERMINATED WITH THE EXIT STRING: Terminated (signal 15)
====================
No. of hosts 4:
-bash-3.2$ mpiexec -n 7 ./cpi
Process 0 of 7 is on k1
Process 4 of 7 is on k1
Process 3 of 7 is on k4
Process 5 of 7 is on k2
Process 1 of 7 is on k2
Process 2 of 7 is on k3
Process 6 of 7 is on k3
Fatal error in PMPI_Reduce: Other MPI error, error stack:
PMPI_Reduce(1207).................: MPI_Reduce(sbuf=0x7fff66b3c220,
rbuf=0x7fff66b3c228, count=1, MPI_DOUBLE, MPI_SUM, root=0, MPI_COMM_WORLD)
failed
MPIR_Reduce_impl(1025)............:
MPIR_Reduce_intra(803)............:
MPIR_Reduce_impl(1025)............:
MPIR_Reduce_intra(855)............:
MPIR_Reduce_binomial(171).........:
MPIC_Recv(108)....................:
MPIC_Wait(534)....................:
MPIDI_CH3I_Progress(184)..........:
MPID_nem_mpich2_blocking_recv(895):
MPID_nem_tcp_connpoll(1760).......:
state_commrdy_handler(1588).......:
MPID_nem_tcp_recv_handler(1567)...: Communication error with rank 1
MPID_nem_tcp_recv_handler(1467)...: socket closed
Fatal error in PMPI_Bcast: Other MPI error, error stack:
PMPI_Bcast(1314)......................: MPI_Bcast(buf=0x7fffed0286f8,
count=1, MPI_INT, root=0, MPI_COMM_WORLD) failed
MPIR_Bcast_impl(1158).................:
MPIR_Bcast_intra(996).................:
MPIR_Bcast_scatter_ring_allgather(840):
MPIR_Bcast_binomial(187)..............:
MPIC_Send(66).........................:
MPIC_Wait(534)........................:
MPIDI_CH3I_Progress(184)..............:
MPID_nem_mpich2_blocking_recv(895)....:
MPID_nem_tcp_connpoll(1760)...........:
state_commrdy_handler(1588)...........:
MPID_nem_tcp_recv_handler(1567).......: Communication error with rank 0
MPID_nem_tcp_recv_handler(1467).......: socket closed
Fatal error in PMPI_Bcast: Other MPI error, error stack:
PMPI_Bcast(1314)......................: MPI_Bcast(buf=0x7fff5f6ac518,
count=1, MPI_INT, root=0, MPI_COMM_WORLD) failed
MPIR_Bcast_impl(1158).................:
MPIR_Bcast_intra(996).................:
MPIR_Bcast_scatter_ring_allgather(840):
MPIR_Bcast_binomial(157)..............:
MPIC_Recv(108)........................:
MPIC_Wait(534)........................:
MPIDI_CH3I_Progress(184)..............:
MPID_nem_mpich2_blocking_recv(895)....:
MPID_nem_tcp_connpoll(1746)...........: Communication error with rank 2:
APPLICATION TERMINATED WITH THE EXIT STRING: Terminated (signal 15)
==============
The same error messages are produced when I use the same number of hosts as
before (6), i.e.:
Fatal error in PMPI_Reduce: Other MPI error, error stack:
PMPI_Reduce(1207).................: MPI_Reduce(sbuf=0x7fffc6433c20,
rbuf=0x7fffc6433c28, count=1, MPI_DOUBLE, MPI_SUM, root=0, MPI_COMM_WORLD)
failed
MPIR_Reduce_impl(1025)............:
MPIR_Reduce_intra(803)............:
MPIR_Reduce_impl(1025)............:
MPIR_Reduce_intra(855)............:
MPIR_Reduce_binomial(171).........:
MPIC_Recv(108)....................:
MPIC_Wait(534)....................:
MPIDI_CH3I_Progress(184)..........:
MPID_nem_mpich2_blocking_recv(895):
MPID_nem_tcp_connpoll(1760).......:
state_commrdy_handler(1588).......:
MPID_nem_tcp_recv_handler(1567)...: Communication error with rank 2
MPID_nem_tcp_recv_handler(1467)...: socket closed
Fatal error in MPI_Finalize: Other MPI error, error stack:
MPI_Finalize(302).................: MPI_Finalize failed
============================================
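For reference, the MPI_Bcast and MPI_Reduce calls named in the error stacks are
the two collectives at the heart of cpi. A trimmed sketch of that core (not the
exact source shipped with MPICH2) is:

#include <stdio.h>
#include "mpi.h"

int main(int argc, char *argv[])
{
    int n = 10000, rank, size, i, namelen;
    double mypi, pi, h, sum, x;
    char name[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Get_processor_name(name, &namelen);
    printf("Process %d of %d is on %s\n", rank, size, name);

    /* root broadcasts the number of intervals: the MPI_Bcast in the stack */
    MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);

    /* each rank integrates its share of 4/(1+x^2) over [0,1] */
    h = 1.0 / (double)n;
    sum = 0.0;
    for (i = rank + 1; i <= n; i += size) {
        x = h * ((double)i - 0.5);
        sum += 4.0 / (1.0 + x * x);
    }
    mypi = h * sum;

    /* partial sums are combined on rank 0: the MPI_Reduce in the stack */
    MPI_Reduce(&mypi, &pi, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("pi is approximately %.16f\n", pi);

    MPI_Finalize();
    return 0;
}

The "socket closed" lines at the bottom of the stacks come from the nemesis TCP
module, so the failure is in the inter-host communication underneath these
collectives rather than in the computation itself.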
Thank you,
Thejna
----------------original message-----------------
From: "Pavan Balaji" balaji at mcs.anl.gov
To: "Thejna Tharammal" ttharammal at marum.de
CC: mpich-discuss at mcs.anl.gov
Date: Thu, 02 Sep 2010 17:30:53 -0500
-------------------------------------------------
>
> On 09/02/2010 05:18 PM, Thejna Tharammal wrote:
>> Yes, I run the 1.3b1 version.
>> I have set the environment variable HYDRA_HOST_FILE in my bash profile, so it's
>> running on the machines I specify (k1-k6), I think.
>
> Ah, I see. This works fine for me (though I'm using the svn trunk, and
> not 1.3b1).
>
> Can you try using a smaller host file (with fewer hosts) and see if you
> can reproduce this problem?
>
> -- Pavan
>
> --
> Pavan Balaji
> http://www.mcs.anl.gov/~balaji
>