[petsc-users] Understanding the memory bandwidth

Justin Chang jychang48 at gmail.com
Mon Aug 17 13:21:51 CDT 2015


Thanks everyone for your valuable input. A few follow-up questions:

1) The specs for my machine say there are 10 cores and 20 threads.
Does that mean each socket has 10 cores, where each core has 2
threads? Or does it mean that each core can use up to 20 threads? Or
something else entirely?
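
(For what it's worth, I suppose I could check this on the node itself,
assuming lscpu is available there, with something like:

$ lscpu | grep -E '^(CPU\(s\)|Thread|Core|Socket)'

If it reports "Thread(s) per core: 2", each core exposes two hardware
threads (hyper-threading); "Thread(s) per core: 1" would mean
hyper-threading is off, so 10 cores per socket on a 2-socket node
gives 20 PUs total.)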

2a) When I run hwloc-info on a single compute node, I get:

$ hwloc-info
depth 0: 1 Machine (type #1)
 depth 1: 2 NUMANode (type #2)
  depth 2: 2 Socket (type #3)
   depth 3: 2 L3Cache (type #4)
    depth 4: 20 L2Cache (type #4)
     depth 5: 20 L1dCache (type #4)
      depth 6: 20 L1iCache (type #4)
       depth 7: 20 Core (type #5)
        depth 8: 20 PU (type #6)
Special depth -3: 5 Bridge (type #9)
Special depth -4: 6 PCI Device (type #10)
Special depth -5: 6 OS Device (type #11)

With this setup, does it mean that if I invoke mpiexec.hydra -np
<number> -bind-to hwthread ..., the MPI processes will effectively be
bound to cores (since there appears to be one PU per core here)?
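
(I suppose I could double-check whatever binding Hydra actually
applies by launching hwloc-bind itself as the "MPI program", assuming
hwloc is installed on the compute nodes:

$ mpiexec.hydra -np 4 -bind-to hwthread hwloc-bind --get

Each launched process should then print the cpuset it was bound to,
so I can see whether the processes really end up pinned to distinct
PUs/cores.)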

2b) Our head node has 40 PUs at depth 8, so if I -bind-to hwthread on
that node (and get yelled at by the system admins), is it possible
that two MPI processes end up running on the same core?
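
(The textual lstopo output should show which PUs belong to the same
physical core, e.g.

$ lstopo --no-io

If a core lists two PU entries under it (say PU P#0 and PU P#20 under
Core P#0, just an illustration of typical Linux numbering), then
binding two ranks to those two PUs would indeed put them on the same
core.)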

3) When I launch an MPI job via mpiexec.hydra -np <number> ... without
any binding options, do we know exactly what happens, i.e., where the
processes end up running?
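
(I guess one way to find out empirically is the same hwloc-bind trick
without any -bind-to option:

$ mpiexec.hydra -np 4 hwloc-bind --get

If every process reports the full machine cpuset, Hydra is not binding
at all and the kernel scheduler is free to place and migrate the
ranks; distinct, smaller cpusets would mean some default binding is in
effect.)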

Thanks,
Justin

On Fri, Aug 14, 2015 at 2:29 AM, Åsmund Ervik <asmund.ervik at ntnu.no> wrote:
>>> So this is a pretty low fraction (55%) of 59.7*2 = 119.4.  I suspect
>>> your memory or motherboard is at most 1600 MHz, so your peak would be
>>> 102.4 GB/s.
>>
>>> You can check this as root using "dmidecode --type 17", which should
>>> give one entry per channel, looking something like this:
>>>
>>> Handle 0x002B, DMI type 17, 34 bytes
>>> Memory Device
>>>         Array Handle: 0x002A
>>>         Error Information Handle: 0x002F
>>>         Total Width: Unknown
>>>         Data Width: Unknown
>>>         Size: 4096 MB
>>>         Form Factor: DIMM
>>>         Set: None
>>>         Locator: DIMM0
>>>         Bank Locator: BANK 0
>>>         Type: <OUT OF SPEC>
>>>         Type Detail: None
>>>         Speed: Unknown
>>>         Manufacturer: Not Specified
>>>         Serial Number: Not Specified
>>>         Asset Tag: Unknown
>>>         Part Number: Not Specified
>>>         Rank: Unknown
>>>         Configured Clock Speed: 1600 MHz
>>
>> I have no root access. Is there another way to confirm the clock speed?
>
> Also note: even in the case where your motherboard, RAM and CPU all say
> 1866 on the label, if there are more memory DIMMs (chips) per node than
> channels, say 16 DIMMs on your 8 channels, you will see a performance
> reduction on the order of 20-30%. This is more likely if you are using
> nodes in a "high-memory queue" or similar where there's >= 128 GB memory
> per node. (This will change in the future when/if people start using
> DDR4 LRDIMMs.) There's a series of in-depth discussions here:
> http://frankdenneman.nl/2015/02/20/memory-deep-dive/ and there's also
> lots of interesting memory-stuff on John McCalpin's blog:
> https://sites.utexas.edu/jdm4372/
>
> Regards,
> Åsmund
>

