How to compute cache bit widths for tags, indices and offsets in a set-associative cache and TLB

Question


Following is the question:

We have a memory system with 64-bit virtual addresses and 48-bit physical addresses. The L1 TLB is fully associative with 64 entries. The page size in virtual memory is 16KB. The L1 cache is 32KB and 2-way set associative; the L2 cache is 2MB and 4-way set associative. The block size of both the L1 and L2 caches is 64B. The L1 cache uses the virtually indexed, physically tagged (VIPT) scheme.

We are required to compute the tags, indices and offsets. This is the solution that I have formulated so far (a short script re-checking these numbers follows the list):

  • page offset = log2(page size) = log2(16KB) = 14 bits
  • block offset = log2(block size) = log2(64B) = 6 bits
  • virtual page number = virtual address bits - page offset = 64 - 14 = 50 bits
  • L1 cache index = page offset - block offset = 14 - 6 = 8 bits
  • L1 tag = physical address bits - L1 index - block offset = 48 - 8 - 6 = 34 bits
  • TLB index = log2(64/64) = 0 bits (since it is fully associative, the whole TLB can be thought of as one set)
  • TLB tag = virtual page number - TLB index = 50 - 0 = 50 bits
  • L2 cache index = log2(cache size / (block size * ways)) = log2(2MB / (64B * 4)) = 13 bits
  • L2 tag = physical address bits - L2 index - block offset = 48 - 13 - 6 = 29 bits
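
As a sanity check, here is a minimal Python sketch (the constant and variable names are mine, not from the assignment) that re-derives each width from the stated parameters:

```python
from math import log2

VA_BITS, PA_BITS = 64, 48            # virtual / physical address widths
PAGE, BLOCK = 16 * 1024, 64          # 16KB pages, 64B cache lines
L1_SIZE, L1_WAYS = 32 * 1024, 2      # 32KB, 2-way
L2_SIZE, L2_WAYS = 2 * 1024**2, 4    # 2MB, 4-way
TLB_ENTRIES, TLB_WAYS = 64, 64       # fully associative: ways == entries

page_offset  = int(log2(PAGE))                      # 14 bits
block_offset = int(log2(BLOCK))                     # 6 bits
vpn          = VA_BITS - page_offset                # 50 bits

l1_index = int(log2(L1_SIZE // (BLOCK * L1_WAYS)))  # 8 bits (256 sets)
l1_tag   = PA_BITS - l1_index - block_offset        # 34 bits (physical tag, VIPT)

tlb_index = int(log2(TLB_ENTRIES // TLB_WAYS))      # 0 bits (one set)
tlb_tag   = vpn - tlb_index                         # 50 bits

l2_index = int(log2(L2_SIZE // (BLOCK * L2_WAYS)))  # 13 bits (8192 sets)
l2_tag   = PA_BITS - l2_index - block_offset        # 29 bits

assert (page_offset, block_offset, vpn) == (14, 6, 50)
assert (l1_index, l1_tag, tlb_index, tlb_tag) == (8, 34, 0, 50)
assert (l2_index, l2_tag) == (13, 29)
```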

For reference: (the diagram from the assignment is omitted here; it shows the address breakdown, with the L1 index bits falling entirely within the page offset.)

This is the solution that I have calculated. Please tell me if it is wrong. Thanks in advance :)


Answer 1:


Looks right.

You should really calculate L1D index bits the same way you do for L2: log2(32KiB / (64B * 2)) = log2(256) = 8 bits.

Calculating the L1 index bits as page offset - block offset is only possible because your diagram shows you that your cache has the desirable property that all the index bits are page-offset bits. (So for aliasing behaviour, it's like a PIPT cache: homonyms and synonyms are impossible. So you can get VIPT speed without any of the aliasing downsides of virtual caches.)

So I guess calculating it both ways and checking is a good sanity check, i.e. check that it matches the diagram, or that the diagram matches the other parameters.
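
A minimal Python sketch of that double-check, with a hypothetical index_bits helper (not from the question):

```python
from math import log2

def index_bits(cache_size, block_size, ways):
    # Index width from the cache geometry: log2(number of sets).
    return int(log2(cache_size // (block_size * ways)))

page_offset, block_offset = 14, 6        # 16KB pages, 64B lines
l1_index = index_bits(32 * 1024, 64, 2)  # 8 bits

# Alias-free VIPT needs index + block offset <= page offset;
# here they are exactly equal, matching the diagram.
assert l1_index + block_offset <= page_offset
assert l1_index == page_offset - block_offset
```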

It's also not required that L1D index+offset bits "use up" all the page offset bits: e.g. increasing L1D associativity would leave 1 or more page-offset bits as part of the tag. (This is fine, and wouldn't introduce aliasing problems, it just means your L1D isn't as big as it could be for a given associativity and page size.)
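
For instance, a hypothetical 4-way version of the same 32kiB L1D under the question's 16KB pages would work out like this:

```python
from math import log2

pa_bits, page_offset, block_offset = 48, 14, 6
index = int(log2(32 * 1024 // (64 * 4)))         # 7 bits (128 sets)
leftover = page_offset - (index + block_offset)  # 1 page-offset bit moves into the tag
tag = pa_bits - index - block_offset             # 35 bits, vs. 34 for the 2-way cache
assert (leftover, tag) == (1, 35)
```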

It is common to build caches this way, though, especially with smaller page sizes. For example, x86 has 4k pages, and Intel CPUs have used a 32kiB / 8-way L1D for over a decade (32k / 8 = 4k). Making it larger (64kiB) would also require making it 16-way associative, because changing the page size is not an option. That would start to get too expensive for a low-latency, high-throughput cache with parallel tag + data fetch. Earlier CPUs like Pentium III had a 16kiB / 4-way L1D, and they were able to scale that up to 32kiB / 8-way, but I don't think we should expect a larger L1D unless something fundamental changes. But with your hypothetical CPU architecture with 16kiB pages, a small+fast L1D with more associativity is certainly plausible. (Your diagram is pretty clear that the index goes all the way up to the page split, but other designs are possible without giving up the VIPT benefits.)

See also "Why is the size of L1 cache smaller than that of the L2 cache in most of the processors?" for more about the "VIPT hack" and why multi-level caches are necessary to get a combination of low latency and large capacity in practical designs. (And note that current Intel L1D caches are pipelined and multi-ported (with 2 reads and 1 write per clock) for access widths up to 32 bytes, or even all 64 bytes of a line with AVX512: see "How can cache be that fast?". So making L1D larger and more highly associative would cost a lot of power.)



Source: https://stackoverflow.com/questions/47747772/how-to-compute-cache-bit-widths-for-tags-indices-and-offsets-in-a-set-associati
