[petsc-users] Understanding the memory bandwidth
Barry Smith
bsmith at mcs.anl.gov
Thu Aug 13 15:47:30 CDT 2015
> On Aug 13, 2015, at 3:30 PM, Matthew Knepley <knepley at gmail.com> wrote:
>
> On Thu, Aug 13, 2015 at 3:22 PM, Justin Chang <jychang48 at gmail.com> wrote:
> On Thu, Aug 13, 2015 at 1:04 PM, Jed Brown <jed at jedbrown.org> wrote:
> > It looks like with one core/socket, all your memory sits over one
> > channel. You can play tricks to avoid that or use 4 cores/socket in
> > order to use all memory channels.
>
> How do I play these tricks?
>
> > So this is a pretty low fraction (55%) of 59.7*2 = 119.4. I suspect
> > your memory or motherboard is at most 1600 MHz, so your peak would be
> > 102.4 GB/s.
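The two peak figures quoted above can be reproduced with a little arithmetic. This is only a sketch under stated assumptions: 4 memory channels per socket (suggested by the "4 cores/socket" remark), a 64-bit (8-byte) data path per channel, 2 sockets per node, and a transfer rate of 1866 MT/s for the 59.7 GB/s figure (inferred from the number, not stated in the thread). The function name and parameters are made up for illustration.

```python
def peak_bandwidth_gbs(transfer_rate_mts, channels_per_socket=4,
                       sockets=2, bytes_per_transfer=8):
    """Theoretical peak memory bandwidth in GB/s (1 GB = 10^9 bytes).

    All defaults are assumptions about the machine discussed in the thread:
    4 channels per socket, 2 sockets per node, 8 bytes per transfer.
    """
    bytes_per_second = (transfer_rate_mts * 1e6 * bytes_per_transfer
                        * channels_per_socket * sockets)
    return bytes_per_second / 1e9


for rate in (1866, 1600):
    per_socket = peak_bandwidth_gbs(rate, sockets=1)
    per_node = peak_bandwidth_gbs(rate, sockets=2)
    print(f"{rate} MT/s: {per_socket:.1f} GB/s per socket, "
          f"{per_node:.1f} GB/s per node")
# 1866 MT/s: 59.7 GB/s per socket, 119.4 GB/s per node
# 1600 MT/s: 51.2 GB/s per socket, 102.4 GB/s per node
```

With one process per socket pinned to a single channel, the reachable fraction of these numbers would be much lower, which is the point of the remark about using all memory channels.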
>
> > You can check this as root using "dmidecode --type 17", which should
> > give one entry per channel, looking something like this:
> >
> > Handle 0x002B, DMI type 17, 34 bytes
> > Memory Device
> > Array Handle: 0x002A
> > Error Information Handle: 0x002F
> > Total Width: Unknown
> > Data Width: Unknown
> > Size: 4096 MB
> > Form Factor: DIMM
> > Set: None
> > Locator: DIMM0
> > Bank Locator: BANK 0
> > Type: <OUT OF SPEC>
> > Type Detail: None
> > Speed: Unknown
> > Manufacturer: Not Specified
> > Serial Number: Not Specified
> > Asset Tag: Unknown
> > Part Number: Not Specified
> > Rank: Unknown
> > Configured Clock Speed: 1600 MHz
>
> I have no root access. Is there another way to confirm the clock speed?
>
> ---
>
> So if I have two sockets per node, then the theoretical peak bandwidth
> is actually double what I thought (whether it be 119.4 GB/s or
> 102.4 GB/s). And if 8 cores really is the optimal number to use for a
> single compute node, why are there 20 in total to begin with? Or would
> this depend on the particular application?
>
> Kind Answer: Different applications have different needs.
>
> Cynical Answer: Computer companies sell you what they can produce,
> lots of cores, not what you need, lots of bandwidth. Bandwidth is very
> expensive and there are technical limits.
The cost of producing a system is not simply linearly proportional to the number of cores, the number of floating-point units, or any other particular feature of the system. For example, maybe a 50-core system costs a company $50,000 to make and a 100-core system (everything else being equal) costs $70,000. In a sense, each additional core (within reason) costs less, so it is acceptable to get less performance out of it, since the incremental cost is lower.
Barry
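To make the incremental-cost argument concrete, here is a small sketch using the hypothetical prices from the message above (50 cores for $50,000, 100 cores for $70,000). The function names are invented for illustration and the numbers are the illustrative ones from the text, not real pricing.

```python
def average_cost_per_core(total_cost, cores):
    """Total production cost spread evenly over all cores."""
    return total_cost / cores


def marginal_cost_per_core(cost_small, cores_small, cost_large, cores_large):
    """Cost of each additional core when going from the small to the large system."""
    return (cost_large - cost_small) / (cores_large - cores_small)


# Hypothetical figures taken from the example in the message.
small_cores, small_cost = 50, 50_000
large_cores, large_cost = 100, 70_000

print(average_cost_per_core(small_cost, small_cores))   # 1000.0
print(average_cost_per_core(large_cost, large_cores))   # 700.0
print(marginal_cost_per_core(small_cost, small_cores,
                             large_cost, large_cores))  # 400.0
```

Under these assumed prices, each of the extra 50 cores costs $400 instead of $1,000, so an extra core can deliver well under half the performance of one of the first 50 and still be worth building.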
>
> Also, can someone elaborate on the difference between the words
> "core", "processor", and "thread"?
>
> A core and a processor are hardware terms. I think they are both fuzzy,
> but I understand a core to be something that can carry a thread of execution,
> namely a program counter, instruction and data stream, and compute something.
> A thread is a logical construct for talking about an execution stream.
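One way to see that a thread is a logical construct rather than a piece of hardware is to start more threads than the machine has CPUs. The sketch below uses only the Python standard library. Note that `os.cpu_count()` reports logical CPUs, which on many systems counts hardware threads rather than physical cores, so it does not settle the core/processor distinction. The factor of 4 and the `work` function are arbitrary choices for illustration.

```python
import os
import threading

# Logical CPUs visible to the OS; may be larger than the physical core count.
logical_cpus = os.cpu_count() or 1

# Deliberately oversubscribe: more software threads than logical CPUs.
n_threads = 4 * logical_cpus
results = [None] * n_threads


def work(i):
    # Each thread is its own execution stream with its own call stack.
    results[i] = sum(range(1000))


threads = [threading.Thread(target=work, args=(i,)) for i in range(n_threads)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(f"{logical_cpus} logical CPUs ran {n_threads} threads to completion")
```

All threads finish even though there are more of them than CPUs; the operating system schedules the execution streams onto whatever hardware can carry them.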
>
> Matt
>
> --
> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
> -- Norbert Wiener