<div dir="ltr">I see the max memory bandwidth of the EPYC 7502 listed as 200 GB/s at <a href="https://www.cpu-world.com/CPUs/Zen/AMD-EPYC%207502.html">https://www.cpu-world.com/CPUs/Zen/AMD-EPYC%207502.html</a>.<div><br></div><div>Your bandwidth is around 1/8 of that maximum. Is it because your machine has only one DIMM and therefore uses only one memory channel?</div><div><br clear="all"><div><div dir="ltr" class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr">--Junchao Zhang</div></div></div><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, Apr 16, 2021 at 3:27 PM Jed Brown <<a href="mailto:jed@jedbrown.org">jed@jedbrown.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Blaise A Bourdin <<a href="mailto:bourdin@lsu.edu" target="_blank">bourdin@lsu.edu</a>> writes:<br>
<br>
> Hi,<br>
><br>
> I am test-driving hardware for a new machine for my group and having a hard time making sense of the output of the stream test:<br>
><br>
> I am attaching the results and my reference (xeon 8260 nodes on QueenBee 3 at LONI).<br>
><br>
> If I understand correctly, on the AMD node, the memory bandwidth is saturated with a single core. Is this expected?<br>
> The comparison is not totally fair in that QB3 uses Intel MPI and MPI compilers, whereas the AMD node uses mvapich2, which I compiled with the following options: ./configure --prefix=/home/amduser/Development/mvapich2-2.3.5-gcc9.3 --with-device=ch3:nemesis:tcp --with-rdma=gen2 --enable-cxx --enable-romio --enable-fast=all --enable-g=dbg --enable-shared-libs=gcc --enable-shared<br>
><br>
> Am I doing something wrong on the AMD node?<br>
<br>
It looks like it's oversubscribing some cores rather than spreading the processes over the node. You should get around 200 GB/s on this node without using streaming instructions (closer to 300 GB/s with those, but it isn't representative of real-world code). Slightly less if you don't have NPS4 activated.<br>
<br>
You can check your MPI docs and use make MPI_BINDING='--bind-to core', for example.<br>
</blockquote></div>
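<div><br></div><div>For reference, here is a rough sketch of the arithmetic behind the ~200 GB/s figure and the 1/8 ratio mentioned above. It assumes the published configuration for this CPU (8 memory channels of DDR4-3200, each 64 bits wide); the function name and the single-channel case are only illustrative, and the actual DIMM population of the test node is not known from this thread.</div>

```python
# Back-of-the-envelope peak DRAM bandwidth.
# Assumed (not stated in the thread): 8 channels of DDR4-3200,
# each channel transferring 8 bytes (64 bits) per transfer.

def peak_bandwidth_gbs(channels, mega_transfers_per_s, bus_bytes=8):
    """Theoretical peak bandwidth in GB/s (1 GB = 1e9 bytes)."""
    return channels * mega_transfers_per_s * 1e6 * bus_bytes / 1e9

# All eight channels populated versus a single populated channel.
full = peak_bandwidth_gbs(channels=8, mega_transfers_per_s=3200)
single = peak_bandwidth_gbs(channels=1, mega_transfers_per_s=3200)

print(f"8 channels: {full:.1f} GB/s")
print(f"1 channel:  {single:.1f} GB/s")
print(f"ratio:      {full / single:.0f}x")
```

<div>These are theoretical peaks, so a measured STREAM result would be expected to land somewhat below them even with every channel populated and the processes bound to separate cores.</div>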