cpu-cache | 易学教程

Minimum associativity for a PIPT L1 cache to also be VIPT, accessing a set without translating the index to physical

Question: This question comes up in the context of a section on virtual memory in an undergraduate computer architecture course. Neither the teaching assistants nor the professor were able to answer it sufficiently, and online resources are limited. The problem: Suppose a processor with the following specifications: 8KB pages; 32-bit virtual addresses; 28-bit physical addresses; a two-level page table, with a 1KB page table at the first level and 8KB page tables at the second level; 4-byte page table entries; a 16
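As a rough illustration of the condition the title refers to: a virtually indexed cache behaves like a physically indexed one when all index and line-offset bits fall inside the page offset, which means the size of one way (cache size divided by associativity) must not exceed the page size. The sketch below only encodes that rule. The cache size in the excerpt is cut off, so the sizes used here are hypothetical examples and not the values from the original problem; the function name `min_associativity` is also made up for this illustration.

```python
def min_associativity(cache_bytes: int, page_bytes: int) -> int:
    """Smallest associativity for which one way fits in a page.

    If cache_bytes / ways <= page_bytes, the set index is taken entirely
    from page-offset bits, which are identical in the virtual and the
    physical address, so no translation is needed to select the set.
    """
    if cache_bytes <= 0 or page_bytes <= 0:
        raise ValueError("sizes must be positive")
    # Ceiling division; a cache smaller than a page needs only 1 way.
    return max(1, -(-cache_bytes // page_bytes))


# Hypothetical sizes, chosen only for illustration.
KIB = 1024
ways_32k_8k = min_associativity(32 * KIB, 8 * KIB)   # 8KB pages, as in the question
ways_32k_4k = min_associativity(32 * KIB, 4 * KIB)   # a common 4KB-page configuration
ways_4k_8k = min_associativity(4 * KIB, 8 * KIB)     # cache smaller than one page
```

With 8KB pages, a hypothetical 32 KiB cache would need at least 4 ways; the same cache with 4KB pages would need 8.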

Which part of the computer manages cache replacement?

Question: I haven't found a clear answer: does the control unit itself fetch pre-defined instructions to execute a cache eviction, or does the operating system intervene? If so, how? Answer 1: Which part of the computer manages cache replacement? Typically, a cache manages cache replacement itself (it's not done by a separate part). There are many types of caches, where some are implemented in software (DNS cache, web page cache, file data cache) and some are implemented in hardware (instruction caches, data

Can we use non-temporal mov instructions on heap memory?

Question: In Agner Fog's "Optimizing subroutines in assembly language", section 11.8 "Cache control instructions", he says: "Memory writes are more expensive than reads when cache misses occur in a write-back cache. A whole cache line has to be read from memory, modified, and written back in case of a cache miss. This can be avoided by using the non-temporal write instructions MOVNTI, MOVNTQ, MOVNTDQ, MOVNTPD, MOVNTPS. These instructions should be used when writing to a memory location that is unlikely

Unexpected periodic behaviour of an ultra low latency hard real time multi-threaded x86 code

Question: I am running code in a loop for multiple iterations on a dedicated CPU with RT priority and want to observe its behaviour over a long time. I found a very strange periodic behaviour of the code. Briefly, this is what the code does: Arraythread { while(1) { if(flag) Multiply matrix record time; reset flag; } } mainthread { for(30 mins) { set flag; record time; busy while(500 μs) } } Here are the details of the machine I am using: CPU: Intel(R) Xeon(R) Gold 6230 CPU @ 2.10 GHz L1 cache: 32K

Is there a way to flush the entire CPU cache related to a program?

Question: On x86-64 platforms, the CLFLUSH assembly instruction flushes the cache line corresponding to a given address. Instead of flushing the cache line for a specific address, is there a way to flush the entire cache (either the cache contents related to the program being executed, or the entire cache), for example by filling it with dummy contents (or by any other approach I am not aware of): using only standard C++17? using standard C++17 and compiler intrinsics if necessary? What

MSI: Why do we need to write the line back when other CPU is going to override it?

Question: In the book "Computer Architecture" by Hennessy and Patterson, 5th ed., on page 360 they describe the MSI protocol and write something like: If the line is in state "Exclusive" (Modified), then on receiving a "Write Miss" from the bus the current CPU 1) writes the line back onto the bus, and then 2) goes into the "Invalid" state. Why do we need to write back the line if it will be overwritten anyway by the subsequent write from the other CPU? Is it connected with the fact that every CPU should see the same
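One way to see why the write-back matters is a toy model in which a store touches only one word of a multi-word cache line. The sketch below is not the book's protocol description and not a complete MSI implementation; it assumes a single 4-word line, two caches, and a hypothetical `write_back` switch that exists only to compare the two behaviours. All names (`Cache`, `write_miss`, `run`) are invented for this illustration.

```python
from enum import Enum


class State(Enum):
    M = "Modified"
    S = "Shared"
    I = "Invalid"


LINE_WORDS = 4  # assumed line size in words, for illustration only


class Cache:
    def __init__(self):
        self.state = State.I
        self.data = [0] * LINE_WORDS


def write_miss(requester, owner, memory, word, value, write_back=True):
    """Requester stores one word to a line that `owner` may hold in M."""
    if owner.state is State.M:
        if write_back:
            # The owner supplies its dirty copy before invalidating.
            memory[:] = owner.data
        owner.state = State.I
    # The requester fetches the WHOLE line, then changes a single word.
    requester.data = list(memory)
    requester.data[word] = value
    requester.state = State.M


def run(write_back):
    memory = [0] * LINE_WORDS          # stale copy in memory
    cpu0, cpu1 = Cache(), Cache()
    cpu0.state = State.M
    cpu0.data = [1, 2, 3, 4]           # cpu0's newer, dirty copy
    write_miss(cpu1, cpu0, memory, word=0, value=9, write_back=write_back)
    return cpu1.data


with_wb = run(write_back=True)      # words 1..3 keep cpu0's updates
without_wb = run(write_back=False)  # words 1..3 fall back to stale memory
```

In this model the second CPU overwrites only word 0, so skipping the write-back would silently discard the first CPU's updates to the other words of the line.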
