cpu-cache

How do data caches route the object in this example?

Submitted by 限于喜欢 on 2019-12-31 00:58:10
Question: Consider the diagrammed data cache architecture:

    --------------------------------------
    | CPU core A | CPU core B |          |
    |------------|------------| Devices  |
    | Cache A1   | Cache B1   | with DMA |
    |-------------------------|          |
    | Cache 2                 |          |
    |------------------------------------|
    | RAM                                |
    --------------------------------------

Suppose that an object is shadowed on a dirty line of Cache A1, an older version of the same object is shadowed on a clean line of Cache 2, and the newest

WC vs WB memory? Other types of memory on x86_64?

Submitted by 假如想象 on 2019-12-30 11:32:41
Question: Could you describe the meanings of, and the differences between, WC and WB memory on x86_64? For completeness, please describe the other memory types on x86_64, if any.

Answer 1: I will start with write-back caching (WB), since it is easier to understand.

Write-back caching: As the name implies, this caching strategy tries to delay writes to system memory for as long as possible. The idea is, ideally, to use only the cache. However, since the cache has a finite size smaller than the finite size of

Interconnect between per-core L2 and L3 in Core i7

Submitted by 和自甴很熟 on 2019-12-30 09:58:08
Question: The Intel Core i7 has per-core L1 and L2 caches and a large shared L3 cache. I need to know what kind of interconnect connects the multiple L2s to the single L3. I am a student and need to write a rough behavioral model of the cache subsystem. Is it a crossbar? A single bus? A ring? The references I came across mention structural details of the caches, but none of them mention what kind of on-chip interconnect exists. Thanks, -neha

Answer 1: Modern i7's use a ring. From Tom's Hardware:

Are two consequent CPU stores on x86 flushed to the cache keeping the order?

Submitted by 泄露秘密 on 2019-12-30 06:42:05
Question: Assume there are two threads running on x86 CPU0 and CPU1, respectively. The thread running on CPU0 executes the following stores: A=1 B=1. The cache line containing A is initially owned by CPU1, and the one containing B is owned by CPU0. I have two questions: If I understand correctly, both stores will be put into the CPU's store buffer. However, for the first store A=1 the cache line held by CPU1 must be invalidated, while the second store B=1 can be flushed immediately since CPU0 owns the cache line containing it. I know

What is the best NHibernate cache L2 provider?

Submitted by 邮差的信 on 2019-12-30 04:56:27
Question: I've seen there are plenty of them: NCache, Velocity, and so forth, but I haven't found a table comparing them. What's the best, considering the following criteria: easy to understand; has been maintained lately; is free, or has a good-enough free version; works. Answer 1: I can't speak for what's best or worst, but I'll throw in my experience with NCache in case it helps. Disclaimer: NHibernate and I had some disagreements; we have since gone our separate ways :) The Good: The performance was great

What use is the INVD instruction?

Submitted by 假如想象 on 2019-12-30 03:43:05
Question: The x86 INVD instruction invalidates the cache hierarchy without writing the contents back to memory, apparently. I'm curious: what use is such an instruction? Given how little control one has over what data may be in the various cache levels, and even less control over what may have already been flushed asynchronously, it seems to be little more than a way to make sure you no longer know what data is held in memory. Answer 1: Excellent question! One use case for such a blunt-acting instruction

WBINVD instruction usage

Submitted by 一世执手 on 2019-12-28 16:04:05
Question: I'm trying to use the WBINVD instruction on Linux to clear the processor's L1 cache. The following program compiles, but produces a segmentation fault when I try to run it: int main() { asm("wbinvd"); return 1; } I'm using gcc 4.4.3 and running Linux kernel 2.6.32-33 on my x86 box. Processor info: Intel(R) Core(TM)2 Duo CPU T5270 @ 1.40GHz. I built the program as follows: $ gcc $ ./a.out Segmentation Fault Can somebody tell me what I'm doing wrong? How do I get this to run? P.S.: I'm running a few

C++ cache aware programming

Submitted by 元气小坏坏 on 2019-12-28 07:40:19
Question: Is there a way in C++ to determine the CPU's cache size? I have an algorithm that processes a lot of data, and I'd like to break this data down into chunks such that they fit into the cache. Is this possible? Can you give me any other hints on programming with cache size in mind (especially in regard to multithreaded/multicore data processing)? Thanks! Answer 1: According to "What Every Programmer Should Know About Memory" by Ulrich Drepper, you can do the following on Linux: Once we have a formula

What specifically marks an x86 cache line as dirty - any write, or is an explicit change required?

Submitted by 青春壹個敷衍的年華 on 2019-12-28 03:05:27
Question: This question is specifically aimed at modern x86-64 cache-coherent architectures; I appreciate that the answer can be different on other CPUs. If I write to memory, the MESI protocol requires that the cache line first be read into the cache and then modified in the cache (the value is written to the cache line, which is then marked dirty). In older write-through microarchitectures, this would then trigger the cache line being flushed; under write-back, the flush of the cache line can be delayed for some