How to get memory bandwidth from memory clock/memory speed

Submitted by 我是研究僧i on 2019-11-28 21:41:06

The Titan has a 384-bit bus while a GTX 680 only has a 256-bit one, hence 50% more memory bandwidth (assuming clock and latencies are identical).

Edit: I'll try to explain the whole concept a bit more: the following is a simplified model of the factors that determine the performance of RAM (not only on graphics cards).

Factor A: Frequency

RAM runs at a clock speed. RAM running at 1 GHz "ticks" 1,000,000,000 (a billion) times a second. With every tick, it can receive or send one bit on every lane. So a theoretical RAM module with only one memory lane running at 1 GHz would deliver 1 gigabit per second; since there are 8 bits to a byte, that means 125 megabytes per second.
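As a quick check of that arithmetic, here is a minimal Python sketch. The function name `lane_throughput_bytes_per_s` and its parameters are made up for illustration; it only encodes the "one bit per lane per tick" model described above.

```python
def lane_throughput_bytes_per_s(clock_hz: float, lanes: int = 1, bits_per_tick: int = 1) -> float:
    """Bytes per second for `lanes` lanes, each moving `bits_per_tick` bits per clock tick."""
    bits_per_second = clock_hz * lanes * bits_per_tick
    return bits_per_second / 8  # 8 bits to a byte


# One lane at 1 GHz, one bit per tick: 1 gigabit per second.
one_lane = lane_throughput_bytes_per_s(1e9)
print(f"{one_lane / 1e6:.0f} MB/s")  # prints "125 MB/s"
```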

Factor B: "Pump Rate"

DDR-RAM (Double Data Rate) can deliver two bits per tick, and there even are "quad-pumped" buses that deliver four bits per tick, but I haven't heard of the latter being used on graphics cards.

Factor C: Bus width.

RAM doesn't just have one single lane to send data. Even the Intel 4004 had a 4-bit bus. The graphics cards you linked have 256 bus lanes and 384 bus lanes respectively.

All of the above factors are multiplied to calculate the theoretical maximum at which data can be sent or received:

**Maximum throughput in bytes per second = Frequency * Pumprate * BusWidth / 8**

Now let's do the math for the two graphics cards you linked. They both seem to use the same type of RAM (GDDR5 with a pump rate of 2), both running at 3 GHz.

GTX-680: 3 GHz * 2 * 256 / 8 = 192 GB/s

GTX-Titan: 3 GHz * 2 * 384 / 8 = 288 GB/s
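The same calculation as a small Python sketch, in case you want to plug in other cards. The function name `max_throughput_gb_per_s` is hypothetical; the inputs (3 GHz, pump rate 2, 256 and 384 lanes) are the figures used above. Taking the frequency in GHz makes the result come out in GB/s.

```python
def max_throughput_gb_per_s(frequency_ghz: float, pump_rate: int, bus_width_bits: int) -> float:
    """Theoretical maximum: Frequency * Pumprate * BusWidth / 8 (GHz in, GB/s out)."""
    return frequency_ghz * pump_rate * bus_width_bits / 8


# Bus widths in bits; clock and pump rate are the same for both cards here.
bus_widths = {"GTX 680": 256, "GTX Titan": 384}
results = {name: max_throughput_gb_per_s(3.0, 2, width) for name, width in bus_widths.items()}

for name, gb_per_s in results.items():
    print(f"{name}: {gb_per_s:.0f} GB/s")

# With identical clock and pump rate, the ratio is just the ratio of bus widths.
ratio = results["GTX Titan"] / results["GTX 680"]
print(f"Titan / 680 = {ratio}")  # 1.5, i.e. 50% more bandwidth
```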

Factor D: Latency - or reality kicks in

This factor is a LOT harder to calculate than all of the above combined. Basically, when you tell your RAM "hey, I want this data", it takes a while until it comes up with the answer. This latency depends on a number of things and is really hard to calculate, and it usually results in RAM systems delivering way less than their theoretical maxima. This is where all the timings, prefetching and tons of other stuff come into the picture. Since these aren't simple numbers that could be used for marketing, where higher numbers translate to "better", the marketing focus is mostly on other stuff. And in case you wondered, that is mostly where GDDR5 differs from the DDR3 you've got on your mainboard.

I think the correct calculation is explained here:
https://www.goldfries.com/computing/gddr3-vs-gddr5-graphic-card-comparison-see-the-difference-with-the-amd-radeon-hd-7750/

In short:
"(Memory clock x Bus Width / 8) * GDDR type multiplier = Bandwidth in GB/s

GDDR type multiplier is 2 for GDDR3, 4 for GDDR5."

There are many more details there, quite well explained and detailed.
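The multiplier of 4 for GDDR5 here looks at odds with the "pump rate of 2" used earlier, but the two forms can agree if they start from different clocks. The sketch below ASSUMES the 3 GHz figure quoted earlier is twice the memory clock this formula expects (so 1500 MHz); that assumption is mine, not something stated in either answer, and the function names are made up for illustration.

```python
def bandwidth_pump_rate_form(frequency_mhz: float, pump_rate: int, bus_width_bits: int) -> float:
    """Frequency * Pumprate * BusWidth / 8, with MHz in and MB/s out."""
    return frequency_mhz * pump_rate * bus_width_bits / 8


def bandwidth_multiplier_form(memory_clock_mhz: float, bus_width_bits: int, type_multiplier: int) -> float:
    """(Memory clock * Bus width / 8) * type multiplier, with MHz in and MB/s out."""
    return (memory_clock_mhz * bus_width_bits / 8) * type_multiplier


bus_width_bits = 256             # GTX 680 bus width from the earlier example
quoted_frequency_mhz = 3000      # the "3 GHz" figure used with a pump rate of 2
assumed_memory_clock_mhz = 1500  # ASSUMPTION: half of the quoted frequency

a = bandwidth_pump_rate_form(quoted_frequency_mhz, 2, bus_width_bits)
b = bandwidth_multiplier_form(assumed_memory_clock_mhz, bus_width_bits, 4)
print(a, b)  # both 192000.0 MB/s, i.e. 192 GB/s
```

So which multiplier is "right" depends on which clock your spec sheet quotes.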

From https://www.goldfries.com/computing/gddr3-vs-gddr5-graphic-card-comparison-see-the-difference-with-the-amd-radeon-hd-7750/:

(memory clock in MHz × bus width in bits ÷ 8) × memory clock type multiplier = Bandwidth in MB/s

where memory clock type multiplier is one of the following:

HBM1 / HBM2: 2
GDDR3: 2
GDDR5: 4
GDDR5X: 8

Let's take one of the current top-of-the-line graphics cards at the time of this writing, the GTX 1080 Ti, which uses GDDR5X memory. According to TechPowerUp, this card's specifications are:

Memory clock: 1376MHz
Bus width: 352-bit
Memory type: GDDR5X

If we plug these values into the above formula we get:

(1376 * 352 / 8) * 8 = 484 352 MB/s = ~484 GB/s

Similarly for the GTX 1070 which uses older GDDR5 memory:

Memory clock: 2002MHz
Bus width: 256-bit
Memory type: GDDR5

(2002 * 256 / 8) * 4 = 256 256 MB/s = ~256 GB/s

Finally, for the AMD Fury X which uses HBM1:

Memory clock: 500MHz
Bus width: 4096-bit
Memory type: HBM1

(500 * 4096 / 8) * 2 = 512 000 MB/s = 512 GB/s

and the Vega 64 which uses HBM2:

Memory clock: 945MHz
Bus width: 2048-bit
Memory type: HBM2

(945 * 2048 / 8) * 2 = 483 840 MB/s = ~484 GB/s
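If you want to run these numbers yourself, here is a minimal Python sketch of the formula with the multipliers as a lookup table. The names `TYPE_MULTIPLIER` and `bandwidth_mb_per_s` are made up for illustration; the clocks, bus widths and multipliers are the ones listed above.

```python
# Memory clock type multipliers as listed above.
TYPE_MULTIPLIER = {"HBM1": 2, "HBM2": 2, "GDDR3": 2, "GDDR5": 4, "GDDR5X": 8}


def bandwidth_mb_per_s(memory_clock_mhz: float, bus_width_bits: int, memory_type: str) -> float:
    """(memory clock in MHz * bus width / 8) * type multiplier = bandwidth in MB/s."""
    return (memory_clock_mhz * bus_width_bits / 8) * TYPE_MULTIPLIER[memory_type]


# (memory clock in MHz, bus width in bits, memory type)
cards = {
    "GTX 1080 Ti": (1376, 352, "GDDR5X"),
    "GTX 1070": (2002, 256, "GDDR5"),
    "Fury X": (500, 4096, "HBM1"),
    "Vega 64": (945, 2048, "HBM2"),
}

bandwidths = {name: bandwidth_mb_per_s(*spec) for name, spec in cards.items()}
for name, mb_per_s in bandwidths.items():
    print(f"{name}: {mb_per_s:.0f} MB/s = ~{mb_per_s / 1000:.0f} GB/s")
```

An unknown memory type raises a `KeyError` rather than silently guessing a multiplier.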
