Intel CPU Cache Policy

难免孤独 2021-01-05 18:05

I have a laptop with an Intel(R) Core(TM) i5-2450M CPU @ 2.50GHz processor. I'm on Ubuntu 12.04 (x86_64) and I'm trying to find some info about my processor.

I wa

2 Answers
  • 2021-01-05 18:50

    This is not something you can query from CPUID or such, nor can you configure your CPU to do one or the other, so there is no tool for querying it. What you can query is the cache associativity, the cache line size, and the cache size: /proc/cpuinfo reports the size and line alignment, and on Linux the files under /sys/devices/system/cpu/cpu0/cache/ report the associativity as well.
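As a minimal sketch of querying the geometry mentioned above, the following reads the per-cache attribute files that Linux exposes in sysfs for cpu0. The function name `read_cache_info` and the choice of fields are illustrative assumptions; which files are present depends on the kernel version, so anything missing is reported as None.

```python
from pathlib import Path

# Attribute files the Linux kernel exposes per cache (sysfs cacheinfo).
FIELDS = ("level", "type", "size", "coherency_line_size", "ways_of_associativity")

def read_cache_info(base="/sys/devices/system/cpu/cpu0/cache"):
    """Return one dict per cache visible to cpu0; missing files become None."""
    caches = []
    # Each cache (L1d, L1i, L2, ...) has its own indexN directory.
    for index_dir in sorted(Path(base).glob("index[0-9]*")):
        entry = {}
        for name in FIELDS:
            f = index_dir / name
            entry[name] = f.read_text().strip() if f.is_file() else None
        caches.append(entry)
    return caches

if __name__ == "__main__":
    for cache in read_cache_info():
        print(cache)
```

This sketch only reads size, line size and associativity; it does not try to report a write policy.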

    All Intel-compatible CPUs of the last one or two decades have used a write-back strategy for caches (which presumes fetching a cache line first to allow partial writes). Of course that's the theory; reality is slightly more complex than that.

    Virtually all processors (your model included) have one or several forms of write combining (or fill buffers, as Intel has called them since Merom), and all but the most antique Intel-compatible CPUs support uncached writes from SSE registers (which again use a form of write-combining). And then of course, there are things like on-chip cache coherence protocols, snoop filtering, and other mechanisms to ensure cache coherency both between cores of one processor and between different processors in a multi-processor system.
    Nevertheless -- the general cache policy is still write-back.
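To make the write-back vs. write-through distinction concrete, here is a toy model (not a description of any real CPU) that counts how many writes reach the next level of the hierarchy for a sequence of store addresses. The function `simulate`, the unlimited cache capacity, and the 64-byte line size are all simplifying assumptions; write combining, fill buffers and coherence traffic are ignored.

```python
def simulate(stores, policy, line_size=64):
    """Count writes that reach the next level for a list of store addresses.

    Toy model: the cache never runs out of space, so under write-back each
    dirty line is written out exactly once, when it is finally evicted.
    """
    if policy == "write-through":
        # Every store is forwarded to the next level immediately.
        return len(stores)
    if policy == "write-back":
        # Stores only mark the line dirty; one write per distinct dirty line.
        dirty_lines = {addr // line_size for addr in stores}
        return len(dirty_lines)
    raise ValueError(f"unknown policy: {policy}")

# Rewriting the same four 8-byte words 100 times (all in one 64-byte line).
stores = [0, 8, 16, 24] * 100
print(simulate(stores, "write-through"))  # 400
print(simulate(stores, "write-back"))     # 1
```

The gap between the two counts is why repeatedly rewriting a small buffer is cheap on a write-back cache.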

  • 2021-01-05 18:57

    David Kanter's very nice Intel Sandybridge writeup covers the memory subsystem and cache architecture: L1D is the usual-for-Intel write-back, and the per-core L2 is also write-back. So is L3 (which is a large inclusive cache shared by all cores on the chip).

    See also "Which cache mapping technique is used in intel core i7 processor?" for lots more detail about various generations of Intel CPUs.


    AMD takes a very different approach: Their L1 cache is write-through, but with a tiny 4k write-combining-cache. Constantly rewriting a buffer larger than 4k on AMD will bottleneck on the (slow) L2 instead of L1.

    One of the posters in that thread on Agner's blog claims that BD's L2 is also write-through, but Paul Clayton's comments on this answer disagree. (I'm inclined to believe Paul.)

    AMD Ryzen fortunately uses a normal write-back 32kiB 8-way L1D, with private write-back 512kiB L2. L3 is a shared 8MB victim cache. It's write-back, but victim-cache means data only enters it when evicted from L1/L2, not directly for loads / prefetches. Each core-cluster (CCX module) of 4 cores has its own 8MB L3, and latency/bandwidth between cores in different clusters is bad.

    There's much more to say about a cache hierarchy than just write-back vs. write-through, although most of the differences don't matter for single-threaded programs. (Unless the OS's process scheduler moves them between clusters on Ryzen, in which case it's bad.)


    On my SnB system:

    sudo dmidecode
    

    produces output which includes:

    Handle 0x0005, DMI type 7, 19 bytes
    Cache Information
            Socket Designation: L1-Cache
            Configuration: Enabled, Not Socketed, Level 1
            Operational Mode: Write Back
            Location: Internal
            Installed Size: 32 kB
            Maximum Size: 32 kB
            Supported SRAM Types:
                    Other
            Installed SRAM Type: Other
            Speed: Unknown
            Error Correction Type: None
            System Type: Unified
            Associativity: 8-way Set-associative
    

    So the fact that the cache is Write-Back is at least in the BIOS, if that's trustworthy. I'm curious what it shows on an AMD CPU, or if BIOS writers tend to just "make something up" and sometimes put the wrong value there.

    As this question points out, info for L2 is kinda bogus: it totals the private 256k-per-core L2:

    Handle 0x0006, DMI type 7, 19 bytes
    Cache Information
            Socket Designation: L2-Cache
            Configuration: Enabled, Not Socketed, Level 2
            Operational Mode: Varies With Memory Address
            Location: Internal
            Installed Size: 1024 kB
            Maximum Size: 1024 kB
            Supported SRAM Types:
                    Other
            Installed SRAM Type: Other
            Speed: Unknown
            Error Correction Type: None
            System Type: Unified
            Associativity: 8-way Set-associative
    
    Handle 0x0007, DMI type 7, 19 bytes
    Cache Information
            Socket Designation: L3-Cache
            Configuration: Enabled, Not Socketed, Level 3
            Operational Mode: Unknown
            Location: Internal
            Installed Size: 6144 kB
            Maximum Size: 6144 kB
            Supported SRAM Types:
                    Other
            Installed SRAM Type: Other
            Speed: Unknown
            Error Correction Type: None
            System Type: Unified
            Associativity: Other
    

    This is on an i5-2500k (quad-core SnB with 6MiB of L3).
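If you want just the write policy the BIOS reports, a small parser over the dmidecode text is enough. This is a sketch under assumptions: `parse_cache_modes` is a hypothetical helper, and `SAMPLE` is an abbreviated copy of the output shown above. On a real system you could feed it the text printed by `sudo dmidecode -t cache`.

```python
SAMPLE = """\
Handle 0x0005, DMI type 7, 19 bytes
Cache Information
        Socket Designation: L1-Cache
        Operational Mode: Write Back
Handle 0x0006, DMI type 7, 19 bytes
Cache Information
        Socket Designation: L2-Cache
        Operational Mode: Varies With Memory Address
Handle 0x0007, DMI type 7, 19 bytes
Cache Information
        Socket Designation: L3-Cache
        Operational Mode: Unknown
"""

def parse_cache_modes(text):
    """Map each cache's 'Socket Designation' to its 'Operational Mode'."""
    modes = {}
    in_cache = False
    name = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Handle "):
            # A new DMI record starts; forget the previous one.
            in_cache, name = False, None
        elif line == "Cache Information":
            in_cache = True
        elif in_cache and line.startswith("Socket Designation:"):
            name = line.split(":", 1)[1].strip()
        elif in_cache and line.startswith("Operational Mode:") and name is not None:
            modes[name] = line.split(":", 1)[1].strip()
    return modes

for cache, mode in parse_cache_modes(SAMPLE).items():
    print(cache, "->", mode)
```

The result is only as trustworthy as the BIOS tables themselves, as the L2 and L3 entries above suggest.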
