What is a “cache-friendly” code?

误落风尘 2020-11-22 02:16

What is the difference between "cache-unfriendly" code and "cache-friendly" code?

How can I make sure I write cache-efficient code?

9 answers
  •  谎友^ (original poster)
    2020-11-22 03:05

    Just piling on: the classic example of cache-unfriendly versus cache-friendly code is the "cache blocking" of matrix multiply.

    Naive matrix multiply looks like:

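    /* dest = src1 * src2, all N x N arrays of elemType */
    for (i = 0; i < N; i++) {
        for (j = 0; j < N; j++) {
            dest[i][j] = 0;
            for (k = 0; k < N; k++) {
                dest[i][j] += src1[i][k] * src2[k][j];
            }
        }
    }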

    If N is large, e.g. if N * sizeof(elemType) is greater than the cache size, then every single access to src2[k][j] will be a cache miss.
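To make the miss pattern concrete: in row-major C arrays, src2[k][j] and src2[k+1][j] sit N * sizeof(elemType) bytes apart, so walking down a column lands on a new cache line at every step once a row is longer than a line. Here is a rough sketch that just counts the lines touched by a strided walk. The helper name lines_touched, the float element type, and the assumption that the walk starts on a line boundary are mine, not something from the code above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: count the distinct cache lines touched when reading
 * `count` floats, with `stride_elems` elements between consecutive reads.
 * Assumes the first read starts on a cache-line boundary. Offsets only grow,
 * so a new line is touched exactly when the line index changes. */
static size_t lines_touched(size_t count, size_t stride_elems, size_t line_bytes)
{
    assert(line_bytes > 0);

    size_t lines = 0;
    size_t prev_line = (size_t)-1;  /* sentinel: no line seen yet */

    for (size_t t = 0; t < count; t++) {
        size_t line = (t * stride_elems * sizeof(float)) / line_bytes;
        if (line != prev_line) {
            lines++;
            prev_line = line;
        }
    }
    return lines;
}
```

With 4-byte floats and 64-byte lines, lines_touched(1024, 1, 64) (walking along a row) gives 64, while lines_touched(1024, 1024, 64) (walking down a column of a 1024 x 1024 matrix) gives 1024, i.e. one line per element read.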

    There are many different ways of optimizing this for a cache. Here's a very simple example: instead of reading one item per cache line in the inner loop, use all of the items:

    int itemsPerCacheLine = CacheLineSize / sizeof(elemType);
    
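    /* dest = src1 * src2; assumes N is a multiple of itemsPerCacheLine */
    for (i = 0; i < N; i++) {
        for (j = 0; j < N; j += itemsPerCacheLine) {
            for (jj = 0; jj < itemsPerCacheLine; jj++) {
                dest[i][j + jj] = 0;
            }
            for (k = 0; k < N; k++) {
                for (jj = 0; jj < itemsPerCacheLine; jj++) {
                    dest[i][j + jj] += src1[i][k] * src2[k][j + jj];
                }
            }
        }
    }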

    If the cache line size is 64 bytes and we are operating on 32-bit (4-byte) floats, then there are 16 items per cache line, and this simple transformation alone reduces the number of cache misses approximately 16-fold.
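For anyone who wants something to compile and check, here is a self-contained sketch of the "use the whole cache line" transformation next to a plain triple loop, on flat row-major arrays. Everything named here (matmul_naive, matmul_line_strips, CACHE_LINE_BYTES) is made up for illustration, the 64-byte line is an assumption rather than something queried from the hardware, and n must be a multiple of the items per line.

```c
#include <assert.h>
#include <stddef.h>

enum { CACHE_LINE_BYTES = 64 };  /* assumed line size, not detected */

/* Reference version: c = a * b for row-major n x n matrices. */
static void matmul_naive(size_t n, const float *a, const float *b, float *c)
{
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            float sum = 0.0f;
            for (size_t k = 0; k < n; k++) {
                sum += a[i * n + k] * b[k * n + j];  /* b is read down a column */
            }
            c[i * n + j] = sum;
        }
    }
}

/* Same product, but each step of k consumes one full line-sized strip of a
 * row of b instead of a single element. */
static void matmul_line_strips(size_t n, const float *a, const float *b, float *c)
{
    const size_t per_line = CACHE_LINE_BYTES / sizeof(float);
    assert(n % per_line == 0);  /* sketch only handles whole strips */

    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j += per_line) {
            for (size_t jj = 0; jj < per_line; jj++) {
                c[i * n + j + jj] = 0.0f;
            }
            for (size_t k = 0; k < n; k++) {
                const float aik = a[i * n + k];
                for (size_t jj = 0; jj < per_line; jj++) {
                    c[i * n + j + jj] += aik * b[k * n + j + jj];
                }
            }
        }
    }
}
```

Both functions accumulate over k in the same order, so they should agree on the result; only the order in which memory is touched differs. Whether the speedup gets anywhere near the 16-fold miss reduction depends on the machine and compiler, so measure before relying on it.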

    Fancier transformations operate on 2D tiles, optimize for multiple caches (L1, L2, TLB), and so on.
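A minimal sketch of the 2D tile idea, again on flat row-major arrays. The function name matmul_tiled and the tile parameter are hypothetical; in practice the tile size is a tuning knob chosen so that the working tiles fit in the cache level you care about, and this sketch makes no attempt to pick it for you.

```c
#include <assert.h>
#include <stddef.h>

static size_t min_size(size_t x, size_t y)
{
    return x < y ? x : y;
}

/* dest = src1 * src2 for row-major n x n matrices, processed in
 * tile x tile blocks. Edge tiles are clipped, so n need not be a
 * multiple of tile. */
static void matmul_tiled(size_t n, size_t tile,
                         const float *src1, const float *src2, float *dest)
{
    assert(tile > 0);

    for (size_t i = 0; i < n * n; i++) {
        dest[i] = 0.0f;
    }

    for (size_t ii = 0; ii < n; ii += tile) {
        const size_t i_end = min_size(ii + tile, n);
        for (size_t kk = 0; kk < n; kk += tile) {
            const size_t k_end = min_size(kk + tile, n);
            for (size_t jj = 0; jj < n; jj += tile) {
                const size_t j_end = min_size(jj + tile, n);
                /* Work on one tile of each matrix while it is still cached. */
                for (size_t i = ii; i < i_end; i++) {
                    for (size_t k = kk; k < k_end; k++) {
                        const float s = src1[i * n + k];
                        for (size_t j = jj; j < j_end; j++) {
                            dest[i * n + j] += s * src2[k * n + j];
                        }
                    }
                }
            }
        }
    }
}
```

A call such as matmul_tiled(n, 32, a, b, c) would use 32 x 32 tiles; that number is only a placeholder to start experimenting from.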

    Some results of googling "cache blocking":

    http://stumptown.cc.gt.atl.ga.us/cse6230-hpcta-fa11/slides/11a-matmul-goto.pdf

    http://software.intel.com/en-us/articles/cache-blocking-techniques

    A nice video animation of an optimized cache blocking algorithm:

    http://www.youtube.com/watch?v=IFWgwGMMrh0

    Loop tiling is very closely related:

    http://en.wikipedia.org/wiki/Loop_tiling
