[mpich-discuss] a question about process-core binding
Pavan Balaji
balaji at mcs.anl.gov
Tue Aug 2 12:14:29 CDT 2011
Please keep mpich-discuss cc'ed. The error below doesn't seem to be a
binding issue. Did you try removing the -binding option to see if it
works without that?
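
For reference, errno 19 is ENODEV on Linux, which fits the "No such
device" text in the quoted trace. The failing frame,
MPIDI_Get_IP_for_iface, looks (judging by its name) like a lookup of the
IP address of a network interface. Below is a minimal Python sketch that
decodes the code and checks whether a named interface exists on a node.
The interface name "eth0" is only a hypothetical placeholder, since the
trace does not show which name was requested.

```python
import errno
import os
import socket

# errno value reported in the trace, decoded with the standard library.
code = 19
print(errno.errorcode[code], "-", os.strerror(code))


def interface_exists(name):
    """Return True if this machine has a network interface called `name`."""
    return any(ifname == name for _, ifname in socket.if_nameindex())


# "eth0" is a placeholder; the trace does not say which interface was requested.
print("eth0 present:", interface_exists("eth0"))
```

Running this on a node that fails and on one that works may show whether
the two differ in their interface names.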
On 08/02/2011 12:12 PM, teng ma wrote:
> Thanks for the answer. I ran into another issue with Hydra binding. When
> the number of processes launched exceeds 408, it throws errors like the
> following:
>
> I run it like this:
> mpiexec -n 408 -binding cpu -f ~/host_mpich ./IMB-MPI1 Bcast -npmin 408
> Fatal error in PMPI_Init_thread: Other MPI error, error stack:
> MPIR_Init_thread(388)..............:
> MPID_Init(139).....................: channel initialization failed
> MPIDI_CH3_Init(38).................:
> MPID_nem_init(234).................:
> MPID_nem_tcp_init(99)..............:
> MPID_nem_tcp_get_business_card(325):
> MPIDI_Get_IP_for_iface(276)........: ioctl failed errno=19 - No such device
> [The same eight-line error stack is repeated 12 more times.]
>
>
> When the number of processes is less than 407, -binding cpu/rr works
> fine. If I remove -binding cpu/rr and use only -f ~/host_mpich, it works
> no matter how many processes I launch. My host_mpich looks like this:
>
> stremi-7.reims.grid5000.fr:24
> stremi-35.reims.grid5000.fr:24
> stremi-28.reims.grid5000.fr:24
> stremi-38.reims.grid5000.fr:24
> stremi-32.reims.grid5000.fr:24
> stremi-26.reims.grid5000.fr:24
> stremi-22.reims.grid5000.fr:24
> stremi-43.reims.grid5000.fr:24
> stremi-30.reims.grid5000.fr:24
> stremi-41.reims.grid5000.fr:24
> stremi-4.reims.grid5000.fr:24
> stremi-34.reims.grid5000.fr:24
> stremi-24.reims.grid5000.fr:24
> stremi-23.reims.grid5000.fr:24
> stremi-20.reims.grid5000.fr:24
> stremi-36.reims.grid5000.fr:24
> stremi-29.reims.grid5000.fr:24
> stremi-19.reims.grid5000.fr:24
> stremi-42.reims.grid5000.fr:24
> stremi-39.reims.grid5000.fr:24
> stremi-27.reims.grid5000.fr:24
> stremi-44.reims.grid5000.fr:24
> stremi-37.reims.grid5000.fr:24
> stremi-31.reims.grid5000.fr:24
> stremi-6.reims.grid5000.fr:24
> stremi-33.reims.grid5000.fr:24
> stremi-3.reims.grid5000.fr:24
> stremi-2.reims.grid5000.fr:24
> stremi-40.reims.grid5000.fr:24
> stremi-21.reims.grid5000.fr:24
> stremi-5.reims.grid5000.fr:24
> stremi-25.reims.grid5000.fr:24
>
> mpich2 was built with the default configure options.
>
> Thanks
> Teng
>
> On Tue, Aug 2, 2011 at 12:43 PM, Pavan Balaji <balaji at mcs.anl.gov> wrote:
>
>
> mpiexec -binding rr
>
> -- Pavan
>
>
> On 08/02/2011 11:35 AM, teng ma wrote:
>
> I want to do process-core binding like MVAPICH2's scatter mode, which
> assigns MPI ranks node by node across the host file, e.g.
> host1
> host2
> host3
>
> rank 0 host1's core 0
> rank 1 host2's core 0
> rank 2 host3's core 0
> rank 3 host1's core 1
> rank 4 host2's core 1
> rank 5 host3's core 1
>
> Is there an easy way to achieve this binding in mpich2-1.4?
>
> Teng Ma
>
>
> --
> Pavan Balaji
> http://www.mcs.anl.gov/~balaji
>
>
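
The scatter-style layout asked about in the original question quoted
above (consecutive ranks go to consecutive hosts, and the core index
advances after each full pass over the hosts) can be written out as a
small Python sketch. It only illustrates the arithmetic of the requested
layout; the function name is made up for this example, and it is not a
description of how mpiexec -binding rr is implemented.

```python
def scatter_placement(num_ranks, hosts):
    """Map each rank to (rank, host, core), cycling through the hosts first."""
    placement = []
    for rank in range(num_ranks):
        host = hosts[rank % len(hosts)]  # consecutive ranks go to consecutive hosts
        core = rank // len(hosts)        # next core after one full pass over the hosts
        placement.append((rank, host, core))
    return placement


for rank, host, core in scatter_placement(6, ["host1", "host2", "host3"]):
    print(f"rank {rank} -> {host}, core {core}")
```

With three hosts and six ranks this reproduces the table in the
question: ranks 0 to 2 land on core 0 of host1 to host3, and ranks 3 to 5
on core 1.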
--
Pavan Balaji
http://www.mcs.anl.gov/~balaji