<div dir="ltr"><div>Hi, Chris,</div><div>  Since we compute the speedup relative to the bandwidth achieved by a single MPI process, and a single process can already drive all memory channels, the maximum speedup can only be determined from experiments, not derived from the number of memory channels.</div><div><br></div><div>  --Junchao Zhang</div><br></div><br><div class="gmail_quote gmail_quote_container"><div dir="ltr" class="gmail_attr">On Mon, Oct 20, 2025 at 9:45 AM Klaij, Christiaan <<a href="mailto:C.Klaij@marin.nl">C.Klaij@marin.nl</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><div>Hi Junchao,<br><br>Thanks for your answer. Regarding the speed-up, what would you expect if not 24 out of 64, and why?<br><br>Chris<br><br>________________________________________<div dir="ltr" style="font-size:1px;direction:ltr"><table dir="ltr" cellpadding="0" cellspacing="0" border="0" style="width:100%;direction:ltr;border-collapse:collapse;font-size:1px"><tbody><tr style="font-size:1px"><td align="left" style="vertical-align:top;font-size:0px"><table cellpadding="0" cellspacing="0" border="0" style="border-collapse:collapse;font-size:0px;line-height:normal"><tbody><tr style="font-size:0px"><td align="left" style="padding:10px 0px;vertical-align:top"><img src="cid:ii_19a027041d3d825dd561" width="125" height="40" border="0" alt="" style="width: 125px; min-width: 125px; max-width: 125px; height: 40px; min-height: 40px; max-height: 40px; font-size: 0px;"></td></tr></tbody></table></td><td><span style="font-family:remialcxesans;font-size:1px;color:rgb(255,255,255);line-height:1px"><span style="font-family:template-zjzHWwipEfCqpwAiSIGong"></span><span style="font-family:zone-1"></span><span style="font-family:zones-AQ"></span></span></td></tr><tr style="font-size:0px"><td align="left" style="vertical-align:top"><table cellpadding="0" cellspacing="0" border="0" style="border-collapse:collapse;font-size:0px"><tbody><tr 
style="font-size:0px"><td align="left" style="vertical-align:top"><table cellpadding="0" cellspacing="0" border="0" style="border-collapse:collapse;font-size:0px;color:rgb(0,0,1);font-style:normal;font-weight:400;white-space:nowrap"><tbody><tr style="font-size:14.67px"><td align="left" style="vertical-align:top;font-family:Calibri,Arial,sans-serif">dr. ir. </td><td align="left" style="vertical-align:top;font-family:Calibri,Arial,sans-serif">Christiaan</td><td align="left" style="vertical-align:top;font-family:Calibri,Arial,sans-serif"> Klaij</td><td align="left" style="vertical-align:top;font-family:Calibri,Arial,sans-serif"> | </td><td align="left" style="vertical-align:top;font-family:Calibri,Arial,sans-serif">senior researcher</td><td align="left" style="vertical-align:top;font-size:0px"></td></tr></tbody></table></td></tr></tbody></table></td></tr><tr style="font-size:0px"><td align="left" style="vertical-align:top"><table cellpadding="0" cellspacing="0" border="0" style="border-collapse:collapse;font-size:0px;color:rgb(0,0,1);font-style:normal;font-weight:400;white-space:nowrap"><tbody><tr style="font-size:14.67px"><td align="left" style="vertical-align:top;font-family:Calibri,Arial,sans-serif">Research & Development</td><td align="left" style="vertical-align:top;font-family:Calibri,Arial,sans-serif"> | </td><td align="left" style="vertical-align:top;font-family:Calibri,Arial,sans-serif">CFD Development</td></tr></tbody></table></td></tr><tr style="font-size:0px"><td align="left" style="vertical-align:top"><table cellpadding="0" cellspacing="0" border="0" style="border-collapse:collapse;font-size:0px"><tbody><tr style="font-size:0px"><td align="left" style="vertical-align:top"><table cellpadding="0" cellspacing="0" border="0" style="border-collapse:collapse;font-size:0px;color:rgb(0,0,1);font-style:normal;font-weight:400;white-space:nowrap"><tbody><tr style="font-size:14.67px"><td align="left" style="vertical-align:top;font-family:Calibri,Arial,sans-serif">T 
<a href="tel:+31%20317%2049%2033%2044" id="m_7760971645588092298LPlnk689713" style="text-decoration:none;color:rgb(0,0,1)" target="_blank">+31 317 49 33 44</a></td><td align="left" style="vertical-align:top;font-family:Calibri,Arial,sans-serif"> | </td><td align="left" style="vertical-align:top;font-family:Calibri,Arial,sans-serif"><a href="https://urldefense.us/v3/__https://www.marin.nl/__;!!G_uCfscf7eWS!eWQ9V6JC_rYZNcX9e_xTrfUjf7r2ZdoUyieeSwP7mc9QEj97847bdPUYphA8CBkYwpsrwb65pvMYnmpUnOFERYwlKK6G$" id="m_7760971645588092298LPlnk689713" style="text-decoration:none;color:rgb(0,0,1)" target="_blank">www.marin.nl</a></td></tr></tbody></table></td></tr></tbody></table></td></tr><tr style="font-size:0px"><td align="left" style="vertical-align:top"><table cellpadding="0" cellspacing="0" border="0" style="border-collapse:collapse;font-size:0px"><tbody><tr style="font-size:0px"><td align="left" style="padding:5px 0px 0px;vertical-align:top"><table cellpadding="0" cellspacing="0" border="0" style="border-collapse:collapse;font-size:0px"><tbody><tr style="font-size:0px"><td align="left" style="vertical-align:top"><table cellpadding="0" cellspacing="0" border="0" style="border-collapse:collapse;font-size:0px;line-height:normal"><tbody><tr style="font-size:0px"><td align="left" style="padding:0px 3px 3px 0px;vertical-align:top"><a href="https://urldefense.us/v3/__https://www.facebook.com/marin.wageningen__;!!G_uCfscf7eWS!eWQ9V6JC_rYZNcX9e_xTrfUjf7r2ZdoUyieeSwP7mc9QEj97847bdPUYphA8CBkYwpsrwb65pvMYnmpUnOFERT9e7Q2s$" id="m_7760971645588092298LPlnk689713" style="text-decoration:none" target="_blank"><img src="cid:ii_19a027041d31deed9592" width="15" height="15" border="0" title="Facebook" alt="Facebook" style="width: 15px; min-width: 15px; max-width: 15px; height: 15px; min-height: 15px; max-height: 15px; font-size: 12px;"></a></td></tr></tbody></table></td><td align="left" style="vertical-align:top"><table cellpadding="0" cellspacing="0" border="0" 
style="border-collapse:collapse;font-size:0px;line-height:normal"><tbody><tr style="font-size:0px"><td align="left" style="padding:0px 3px 3px 0px;vertical-align:top"><a href="https://urldefense.us/v3/__https://www.linkedin.com/company/marin__;!!G_uCfscf7eWS!eWQ9V6JC_rYZNcX9e_xTrfUjf7r2ZdoUyieeSwP7mc9QEj97847bdPUYphA8CBkYwpsrwb65pvMYnmpUnOFERUf1DiSy$" id="m_7760971645588092298LPlnk689713" style="text-decoration:none" target="_blank"><img src="cid:ii_19a027041d361c588563" width="15" height="15" border="0" title="LinkedIn" alt="LinkedIn" style="width: 15px; min-width: 15px; max-width: 15px; height: 15px; min-height: 15px; max-height: 15px; font-size: 12px;"></a></td></tr></tbody></table></td><td align="left" style="vertical-align:top"><table cellpadding="0" cellspacing="0" border="0" style="border-collapse:collapse;font-size:0px;line-height:normal"><tbody><tr style="font-size:0px"><td align="left" style="padding:0px 3px 3px 0px;vertical-align:top"><a href="https://urldefense.us/v3/__https://www.youtube.com/marinmultimedia__;!!G_uCfscf7eWS!eWQ9V6JC_rYZNcX9e_xTrfUjf7r2ZdoUyieeSwP7mc9QEj97847bdPUYphA8CBkYwpsrwb65pvMYnmpUnOFERRQd8GVj$" id="m_7760971645588092298LPlnk689713" style="text-decoration:none" target="_blank"><img src="cid:ii_19a027041d3aa363cd64" width="15" height="15" border="0" title="YouTube" alt="YouTube" style="width: 15px; min-width: 15px; max-width: 15px; height: 15px; min-height: 15px; max-height: 15px; font-size: 12px;"></a></td></tr></tbody></table></td></tr></tbody></table></td></tr></tbody></table></td></tr></tbody></table></div><br>From: Junchao Zhang <<a href="mailto:junchao.zhang@gmail.com" target="_blank">junchao.zhang@gmail.com</a>><br>Sent: Friday, October 17, 2025 5:01 PM<br>To: Klaij, Christiaan<br>Cc: PETSc users list<br>Subject: Re: [petsc-users] interpreting petsc streams result<br><br>Hi, Chris,<br>  I did have an MR <a 
href="https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/7651__;!!G_uCfscf7eWS!eWQ9V6JC_rYZNcX9e_xTrfUjf7r2ZdoUyieeSwP7mc9QEj97847bdPUYphA8CBkYwpsrwb65pvMYnmpUnOFERYIUu7jT$" target="_blank">https://gitlab.com/petsc/petsc/-/merge_requests/7651</a> to improve mpistream.  I should rework it after Barry's !6903.  See my inline comments on your questions below.<br><br>On Fri, Oct 17, 2025 at 3:37 AM Klaij, Christiaan via petsc-users <<a href="mailto:petsc-users@mcs.anl.gov" target="_blank">petsc-users@mcs.anl.gov</a>> wrote:<br>Attached is a PETSc streams result kindly provided by a hardware<br>vendor for a single compute node, dual socket, with two AMD EPYC<br>9355 processors. Each processor has 32 cores, 12 DDR5 memory<br>channels and a memory bandwidth of around 600 GB/s.<br><br>* It is not immediately clear which line corresponds to which<br>y-axis. Could future versions of PETSc please color the axis<br>label with the matching line color?<br>Definitely.<br><br><br>* Why would the achieved bandwidth be roughly 0.9 x 1e6 MB/s =<br>900 GB/s and not closer to 1200 GB/s?<br>I recall it is actually not simple to get the theoretical max bandwidth. One has to use special SIMD instructions, compiler flags, streaming stores, etc.<br><br><br>* The speed-up seems to be 12 out of 64, provided multiples of 8<br>cores are used. As expected given 12 memory channels?<br>Maybe not; otherwise the speedup would be 24, since you have 24 channels.<br><br><br>* Does the zig-zag pattern indicate a pinning problem, or is it<br>unavoidable given the 8-core building block of this type of<br>processor?<br>I checked and found that "make mpistream" uses --map-by core.  I think we should use --map-by socket or --map-by l3cache.<br><br><br>Chris<br></div></div></blockquote></div>
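<div>The arithmetic discussed in this thread can be sketched as follows. This is a minimal illustration, not the mpistream implementation: the function names are made up here, the peak figure is simply 2 sockets times the roughly 600 GB/s quoted above, and the per-process bandwidths in <code>sample</code> are hypothetical placeholders, not values read from the attached plot.</div>

```python
def speedups(bandwidth_mb_s):
    """Return {process count: bandwidth relative to the 1-process run}."""
    base = bandwidth_mb_s[1]  # single-process bandwidth is the reference
    return {n: bw / base for n, bw in sorted(bandwidth_mb_s.items())}


def fraction_of_peak(achieved_gb_s, peak_gb_s):
    """Return the achieved bandwidth as a fraction of the nominal peak."""
    return achieved_gb_s / peak_gb_s


# Figures quoted in the thread.
PEAK_GB_S = 2 * 600.0           # two sockets, about 600 GB/s each
ACHIEVED_GB_S = 0.9e6 / 1000.0  # 0.9 x 1e6 MB/s expressed in GB/s

# Hypothetical measurements (MB/s per process count), for illustration only.
sample = {1: 40_000.0, 8: 300_000.0, 64: 900_000.0}

if __name__ == "__main__":
    print("fraction of peak:", fraction_of_peak(ACHIEVED_GB_S, PEAK_GB_S))
    for nproc, s in speedups(sample).items():
        print(f"{nproc:3d} processes: speedup {s:.1f}")
```

<div>Because the reference is whatever a single process achieves, the largest speedup depends on that measured baseline rather than on the channel count, which is the point made at the top of this message.</div>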