If I remove -binding, the job scales to 768 processes (32 nodes, 24 cores per node) with no problem. Without the -binding parameter, what placement strategy does mpich2 use: does it fill all slots of one node before moving to the next, or does it go round-robin across nodes?<br>
<br>Thanks<br>Teng <br><br><div class="gmail_quote">On Tue, Aug 2, 2011 at 1:14 PM, Pavan Balaji <span dir="ltr">&lt;<a href="mailto:balaji@mcs.anl.gov">balaji@mcs.anl.gov</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">
<br>
Please keep mpich-discuss cc&#39;ed. The error below doesn&#39;t seem to be a binding issue. Did you try removing the -binding option to see whether it works without it?<div><div></div><div class="h5"><br>
<br>
On 08/02/2011 12:12 PM, teng ma wrote:<br>
</div></div><blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;"><div><div></div><div class="h5">
Thanks for the answer. I ran into another issue with hydra binding. When<br>
the number of processes launched exceeds 408, it throws errors like the following:<br>
<br>
<br>
I run it like this:<br>
mpiexec -n 408 -binding cpu -f ~/host_mpich ./IMB-MPI1 Bcast -npmin 408<br>
Fatal error in PMPI_Init_thread: Other MPI error, error stack:<br>
MPIR_Init_thread(388)..............:<br>
MPID_Init(139).....................: channel initialization failed<br>
MPIDI_CH3_Init(38).................:<br>
MPID_nem_init(234).................:<br>
MPID_nem_tcp_init(99)..............:<br>
MPID_nem_tcp_get_business_card(325):<br>
MPIDI_Get_IP_for_iface(276)........: ioctl failed errno=19 - No such device<br>
<br>
<br>
When the number of processes is less than 407, -binding cpu/rr works fine. If I<br>
remove -binding cpu/rr and use only -f ~/host_mpich, it works no<br>
matter how many processes I launch. My host_mpich looks like this:<br>
<br>
</div></div>stremi-7.reims.grid5000.fr:24<br>
stremi-35.reims.grid5000.fr:24<br>
stremi-28.reims.grid5000.fr:24<br>
stremi-38.reims.grid5000.fr:24<br>
stremi-32.reims.grid5000.fr:24<br>
stremi-26.reims.grid5000.fr:24<br>
stremi-22.reims.grid5000.fr:24<br>
stremi-43.reims.grid5000.fr:24<br>
stremi-30.reims.grid5000.fr:24<br>
stremi-41.reims.grid5000.fr:24<br>
stremi-4.reims.grid5000.fr:24<br>
stremi-34.reims.grid5000.fr:24<br>
stremi-24.reims.grid5000.fr:24<br>
stremi-23.reims.grid5000.fr:24<br>
stremi-20.reims.grid5000.fr:24<br>
stremi-36.reims.grid5000.fr:24<br>
stremi-29.reims.grid5000.fr:24<br>
stremi-19.reims.grid5000.fr:24<br>
stremi-42.reims.grid5000.fr:24<br>
stremi-39.reims.grid5000.fr:24<br>
stremi-27.reims.grid5000.fr:24<br>
stremi-44.reims.grid5000.fr:24<br>
stremi-37.reims.grid5000.fr:24<br>
stremi-31.reims.grid5000.fr:24<br>
stremi-6.reims.grid5000.fr:24<br>
stremi-33.reims.grid5000.fr:24<br>
stremi-3.reims.grid5000.fr:24<br>
stremi-2.reims.grid5000.fr:24<br>
stremi-40.reims.grid5000.fr:24<br>
stremi-21.reims.grid5000.fr:24<br>
stremi-5.reims.grid5000.fr:24<br>
stremi-25.reims.grid5000.fr:24<div class="im">
<br>
<br>
mpich2 was built with the default configure options.<br>
<br>
Thanks<br>
Teng<br>
<br>
On Tue, Aug 2, 2011 at 12:43 PM, Pavan Balaji &lt;<a href="mailto:balaji@mcs.anl.gov" target="_blank">balaji@mcs.anl.gov</a>&gt; wrote:<br></div><div class="im">
<br>
<br>
    mpiexec -binding rr<br>
<br>
      -- Pavan<br>
<br>
<br>
    On 08/02/2011 11:35 AM, teng ma wrote:<br>
<br>
        I want to do process-to-core binding in the style of MVAPICH2&#39;s scatter binding:<br>
        assign MPI ranks round-robin across the nodes in the host file, e.g.<br>
        host1<br>
        host2<br>
        host3<br>
<br>
        rank 0: host1, core 0<br>
        rank 1: host2, core 0<br>
        rank 2: host3, core 0<br>
        rank 3: host1, core 1<br>
        rank 4: host2, core 1<br>
        rank 5: host3, core 1<br>
<br>
        Is there an easy way to achieve this binding in mpich2-1.4?<br>
<br>
        Teng Ma<br>
<br>
<br>
<br>
        _________________________________________________<br>
        mpich-discuss mailing list<br></div>
        <a href="mailto:mpich-discuss@mcs.anl.gov" target="_blank">mpich-discuss@mcs.anl.gov</a><div class="im">
<br>
        <a href="https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss" target="_blank">https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss</a><br>
<br>
<br>
    --<br>
    Pavan Balaji<br></div>
    <a href="http://www.mcs.anl.gov/%7Ebalaji" target="_blank">http://www.mcs.anl.gov/~balaji</a><br>
<br>
<br>
</blockquote><div><div></div><div class="h5">
<br>
-- <br>
Pavan Balaji<br>
<a href="http://www.mcs.anl.gov/%7Ebalaji" target="_blank">http://www.mcs.anl.gov/~balaji</a><br>
</div></div></blockquote></div><br>
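For readers following the thread, the two placements being asked about (fill one node before moving on, versus round-robin across nodes) can be written out as a small sketch. This is only an illustration of the two rank-to-core mappings, not how hydra or mpich2 implements them. The function names block_placement and scatter_placement are hypothetical, and the host names are the placeholders from the original question.

```python
def block_placement(hosts, cores_per_host, nranks):
    """Fill every core of one host before moving on to the next host."""
    return [(hosts[(r // cores_per_host) % len(hosts)], r % cores_per_host)
            for r in range(nranks)]


def scatter_placement(hosts, cores_per_host, nranks):
    """Deal ranks out across hosts first; the core index advances after each full pass."""
    return [(hosts[r % len(hosts)], (r // len(hosts)) % cores_per_host)
            for r in range(nranks)]


if __name__ == "__main__":
    hosts = ["host1", "host2", "host3"]  # placeholder names from the question
    for rank, (host, core) in enumerate(scatter_placement(hosts, 2, 6)):
        print(f"rank {rank}: {host}, core {core}")
```

With three hosts and two cores each, scatter_placement reproduces the rank list in the original question, while block_placement puts ranks 0 and 1 on host1, ranks 2 and 3 on host2, and so on.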