[petsc-users] Understanding streams test on AMD EPYC 7502

Jed Brown jed at jedbrown.org
Fri Apr 16 22:49:59 CDT 2021


Blaise A Bourdin <bourdin at lsu.edu> writes:

> Thanks for the reference timing. I can use this to talk to the vendor (or switch vendor…).
>
> I am on a 2 socket system. It looks like the node the vendor built for me has 4 DIMMS, possibly all connected to the same socket?
>
> [amduser at gigi ~]$ numactl -H
> available: 2 nodes (0-1)
> node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95
> node 0 size: 257877 MB
> node 0 free: 225820 MB
> node 1 cpus: 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127
> node 1 size: 0 MB
> node 1 free: 0 MB
> node distances:
> node   0   1
>   0:  10  32
>   1:  32  10

lstopo can tell you how much memory is configured on each NUMA node, but the output above already shows that they're using NPS1 (one NUMA node per socket) and populated only one socket: node 1 reports 0 MB.

https://developer.amd.com/wp-content/resources/56338_1.00_pub.pdf
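
A quick way to spot this kind of misconfiguration is to scan the `numactl -H` output for nodes that report no memory. The following is only a minimal sketch: the function names and the SAMPLE string are illustrative (SAMPLE is abridged from the output quoted above), and it assumes the "node N size: M MB" line format shown there.

```python
import re

def numa_node_sizes(numactl_output):
    """Map NUMA node id -> configured memory in MB, parsed from `numactl -H` text."""
    sizes = {}
    for line in numactl_output.splitlines():
        # Tolerate mail quoting ("> ") in case the text was pasted from a reply.
        line = line.lstrip("> ").strip()
        m = re.match(r"node (\d+) size: (\d+) MB", line)
        if m:
            sizes[int(m.group(1))] = int(m.group(2))
    return sizes

def empty_nodes(sizes):
    """Return the ids of NUMA nodes that have CPUs but no local memory."""
    return sorted(node for node, mb in sizes.items() if mb == 0)

# Abridged from the 2-socket output quoted above (hypothetical variable name).
SAMPLE = """\
> available: 2 nodes (0-1)
> node 0 size: 257877 MB
> node 0 free: 225820 MB
> node 1 size: 0 MB
> node 1 free: 0 MB
"""

if __name__ == "__main__":
    sizes = numa_node_sizes(SAMPLE)
    print("sizes (MB):", sizes)
    print("nodes without memory:", empty_nodes(sizes))
```

On a live system the same functions could be fed the captured output of `numactl -H` instead of SAMPLE.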


I'd recommend asking for NPS4 and ensuring that each socket has 8 channels of DDR4-3200 populated. The 2P system should have a total of 16 channels. Mine has 16x16GB, which is two DIMMs per NUMA node in NPS4.

$ numactl -H
available: 8 nodes (0-7)
node 0 cpus: 0 1 2 3 4 5 6 7
node 0 size: 32071 MB
node 0 free: 19457 MB
node 1 cpus: 8 9 10 11 12 13 14 15
node 1 size: 32253 MB
node 1 free: 25034 MB
node 2 cpus: 16 17 18 19 20 21 22 23
node 2 size: 32253 MB
node 2 free: 26204 MB
node 3 cpus: 24 25 26 27 28 29 30 31
node 3 size: 32241 MB
node 3 free: 23922 MB
node 4 cpus: 32 33 34 35 36 37 38 39
node 4 size: 32253 MB
node 4 free: 26791 MB
node 5 cpus: 40 41 42 43 44 45 46 47
node 5 size: 32253 MB
node 5 free: 26216 MB
node 6 cpus: 48 49 50 51 52 53 54 55
node 6 size: 32231 MB
node 6 free: 20094 MB
node 7 cpus: 56 57 58 59 60 61 62 63
node 7 size: 32252 MB
node 7 free: 24965 MB
node distances:
node   0   1   2   3   4   5   6   7
  0:  10  12  12  12  32  32  32  32
  1:  12  10  12  12  32  32  32  32
  2:  12  12  10  12  32  32  32  32
  3:  12  12  12  10  32  32  32  32
  4:  32  32  32  32  10  12  12  12
  5:  32  32  32  32  12  10  12  12
  6:  32  32  32  32  12  12  10  12
  7:  32  32  32  32  12  12  12  10
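
For talking to the vendor it may help to have the theoretical peak bandwidth at hand. The sketch below only does the arithmetic: a DDR4 channel is 64 bits (8 bytes) wide, so DDR4-3200 moves 25.6 GB/s per channel. These are theoretical peaks, not STREAM results; measured numbers will be lower. The 4-channel case assumes each of the 4 DIMMs sits on its own channel, which I have not verified for that machine.

```python
CHANNEL_BYTES = 8            # a DDR4 channel is 64 bits wide
TRANSFERS_PER_SEC = 3200e6   # DDR4-3200: 3200 mega-transfers per second

def peak_bandwidth_gbs(channels):
    """Theoretical peak memory bandwidth in GB/s for a number of DDR4-3200 channels."""
    return channels * CHANNEL_BYTES * TRANSFERS_PER_SEC / 1e9

if __name__ == "__main__":
    for label, channels in [
        ("one channel", 1),
        ("4 DIMMs, one per channel (assumed)", 4),
        ("one fully populated socket", 8),
        ("fully populated 2P system", 16),
    ]:
        print(f"{label}: {peak_bandwidth_gbs(channels):.1f} GB/s")
```

The ratio between the 4-channel and 16-channel figures gives a rough upper bound on how much a fully populated system could gain.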


You have hyperthreading (SMT) on, while I have it off. I don't know which is better for your workload. I haven't bothered to experiment, but it shouldn't hurt much if you're pinning to cores and not oversubscribing.
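
If you want to check which logical CPUs share a physical core before pinning, Linux exposes the sibling lists in sysfs. This is a rough sketch with hypothetical helper names; the fallback data assumes that CPU n and CPU n+64 are siblings, which the numbering in your numactl output suggests but does not prove.

```python
import glob
import os

def parse_cpu_list(text):
    """Expand a sysfs CPU list such as "0,64" or "0-1" into a sorted list of ints."""
    cpus = []
    for part in text.strip().split(","):
        if "-" in part:
            lo, hi = part.split("-")
            cpus.extend(range(int(lo), int(hi) + 1))
        else:
            cpus.append(int(part))
    return sorted(cpus)

def one_thread_per_core(sibling_lists):
    """Keep the lowest-numbered logical CPU of every physical core."""
    return sorted({parse_cpu_list(s)[0] for s in sibling_lists})

def read_sibling_lists():
    """Read per-CPU sibling lists from Linux sysfs; returns [] if sysfs is absent."""
    pattern = "/sys/devices/system/cpu/cpu[0-9]*/topology/thread_siblings_list"
    lists = []
    for path in glob.glob(pattern):
        with open(path) as f:
            lists.append(f.read())
    return lists

def pin_current_process(cpu):
    """Restrict the calling process to one logical CPU (Linux only)."""
    os.sched_setaffinity(0, {cpu})

if __name__ == "__main__":
    # Fall back to assumed sample data when not running on Linux.
    lists = read_sibling_lists() or ["0,64", "1,65", "0,64", "1,65"]
    print("one logical CPU per core:", one_thread_per_core(lists))
```

For MPI runs the launcher's own binding options are usually the more practical way to pin ranks; the script is only meant for inspecting the topology.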

