<div dir="ltr"><div dir="ltr"><div>Hi, Chris,</div><div>  I did have an MR <a href="https://gitlab.com/petsc/petsc/-/merge_requests/7651">https://gitlab.com/petsc/petsc/-/merge_requests/7651</a> to improve mpistream.  I should rework it after Barry's !6903.  See my inline comments on your questions below.</div><div><br></div></div><div class="gmail_quote gmail_quote_container"><div dir="ltr" class="gmail_attr">On Fri, Oct 17, 2025 at 3:37 AM Klaij, Christiaan via petsc-users &lt;<a href="mailto:petsc-users@mcs.anl.gov">petsc-users@mcs.anl.gov</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><div>Attached is a petsc streams result kindly provided by a hardware<br>vendor for a single compute node, dual socket, with two AMD epyc<br>9355 processors. Each processor has 32 cores, 12 DDR5 memory<br>channels and mem BW around 600 GB/s.<br><br>* It is not immediately clear which line corresponds to which<br>  y-axis. Could future versions of petsc please color the axis<br>  label with the matching line color?</div></div></blockquote><div>Definitely.</div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><div> <br><br>* Why would the achieved bandwidth be roughly 0.9 x 1e6 MB/s =<br>  900 GB/s and not closer to 1200 GB/s?</div></div></blockquote><div>As I recall, it is actually not simple to reach the theoretical max bandwidth. One has to use special SIMD instructions, compiler flags, streaming stores, etc.</div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><div> <br><br>* The speed-up seems to be 12 out of 64, provided multiples of 8<br>  cores are used. 
As expected given 12 memory channels?<br></div></div></blockquote><div>Maybe not, otherwise the speedup should be 24 as you have 24 channels.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><div><br>* Does the zig-zag pattern indicate a pinning problem, or is it<br>  unavoidable given the 8 core building block of these type of<br>  processors?<br></div></div></blockquote><div>I checked and found "make mpistream" uses --map-by core.  I think we should use --map-by socket or --map-by l3cache.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><div><br>Chris</div></div></blockquote></div></div>
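<div>To make the numbers in this thread concrete, here is a rough Python sketch. It only uses the figures quoted above (2 sockets, about 600 GB/s per socket, about 900 GB/s achieved, 32 cores per socket). The 8-core L3 group size is taken from the question and is an assumption, and the placement function is a deliberately simplified model of what "--map-by core" and "--map-by l3cache" do, not a description of how the MPI launcher implements them. All function and constant names are made up for illustration.</div>

```python
# Figures quoted in the thread; nothing here is measured.
SOCKETS = 2
PEAK_GBS_PER_SOCKET = 600.0  # "mem BW around 600 GB/s" per processor
ACHIEVED_GBS = 900.0         # roughly 0.9 x 1e6 MB/s, as read off the plot
CORES_PER_SOCKET = 32
CORES_PER_L3 = 8             # assumption: the "8 core building block" from the question


def bandwidth_fraction(achieved, peak_per_socket, sockets):
    """Fraction of the nominal node bandwidth that was reached."""
    return achieved / (peak_per_socket * sockets)


def l3_domains_used(nranks, policy):
    """Count L3 domains holding at least one rank (simplified placement model).

    'core':    ranks fill consecutive cores, so one L3 domain fills up
               before the next one receives any rank.
    'l3cache': ranks are dealt round-robin across all L3 domains.
    """
    total_domains = SOCKETS * CORES_PER_SOCKET // CORES_PER_L3
    if policy == "core":
        # Ceiling division: number of domains touched by nranks consecutive cores.
        return min(total_domains, -(-nranks // CORES_PER_L3))
    if policy == "l3cache":
        return min(total_domains, nranks)
    raise ValueError(f"unknown policy: {policy}")


if __name__ == "__main__":
    frac = bandwidth_fraction(ACHIEVED_GBS, PEAK_GBS_PER_SOCKET, SOCKETS)
    print(f"fraction of nominal peak: {frac:.2f}")
    print("ranks  domains(core)  domains(l3cache)")
    for n in (4, 8, 12, 16):
        print(n, l3_domains_used(n, "core"), l3_domains_used(n, "l3cache"))
```

<div>Under this toy model, consecutive-core placement keeps small runs inside one or two L3 domains, while round-robin placement spreads them over all domains right away. Whether that actually explains the zig-zag in the plot would have to be checked by rerunning the benchmark with the different mapping options.</div>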