<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8" /></head><body><div>Hi Junchao,<br /><br />Thanks for your answer. Regarding the speed-up, what would you expect if not 24 out of 64, and why?<br /><br />Chris<br /><br />________________________________________<div dir="ltr">dr. ir. Christiaan Klaij | senior researcher<br />Research &amp; Development | CFD Development<br />T <a href="tel:+31%20317%2049%2033%2044" target="_blank" id="LPlnk689713" style="text-decoration:none;color:#000001;">+31 317 49 33 44</a> | <a href="https://www.marin.nl/" target="_blank" id="LPlnk689713" style="text-decoration:none;color:#000001;">www.marin.nl</a></div><br />From: Junchao Zhang &lt;junchao.zhang@gmail.com&gt;<br />Sent: Friday, October 17, 2025 5:01 PM<br />To: Klaij, Christiaan<br />Cc: PETSc users list<br />Subject: Re: [petsc-users] interpreting petsc streams result<br /><br />Hi, Chris,<br /> I did have an MR <a href="https://gitlab.com/petsc/petsc/-/merge_requests/7651">https://gitlab.com/petsc/petsc/-/merge_requests/7651</a> to improve mpistream. I should rework it after Barry's !6903. See my inline comments on your questions below.<br /><br />On Fri, Oct 17, 2025 at 3:37 AM Klaij, Christiaan via petsc-users &lt;petsc-users@mcs.anl.gov&gt; wrote:<br />Attached is a petsc streams result kindly provided by a hardware<br />vendor for a single compute node, dual socket, with two AMD EPYC<br />9355 processors. 
Each processor has 32 cores, 12 DDR5 memory<br />channels and a memory bandwidth of around 600 GB/s.<br /><br />* It is not immediately clear which line corresponds to which<br />y-axis. Could future versions of petsc please color the axis<br />label with the matching line color?<br />Definitely.<br /><br /><br />* Why would the achieved bandwidth be roughly 0.9 x 1e6 MB/s =<br />900 GB/s and not closer to 1200 GB/s?<br />I recall it is actually not simple to get the theoretical max bandwidth. One has to use special SIMD instructions, compiler flags, streaming stores, etc.<br /><br /><br />* The speed-up seems to be 12 out of 64, provided multiples of 8<br />cores are used. As expected given 12 memory channels?<br />Maybe not; otherwise the speedup should be 24, as you have 24 channels.<br /><br /><br />* Does the zig-zag pattern indicate a pinning problem, or is it<br />unavoidable given the 8-core building block of this type of<br />processor?<br />I checked and found "make mpistream" uses --map-by core. I think we should use --map-by socket or --map-by l3cache.<br /><br /><br />Chris<br /></div></body></html>