cpu-cache

Minimum associativity for a PIPT L1 cache to also be VIPT, accessing a set without translating the index to physical

元气小坏坏 submitted on 2021-02-04 07:31:28
Question: This question arose in the context of a section on virtual memory in an undergraduate computer architecture course. Neither the teaching assistants nor the professor were able to answer it sufficiently, and online resources are limited. Question: Suppose a processor with the following specifications: 8KB pages; 32-bit virtual addresses; 28-bit physical addresses; a two-level page table, with a 1KB page table at the first level and 8KB page tables at the second level; 4-byte page table entries; a 16
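The excerpt is cut off before the cache size is given, so the following is only a minimal sketch of the usual reasoning, not the answer to the truncated question. A cache can be indexed with untranslated address bits when the index and line-offset bits all fall inside the page offset, that is, when one way of the cache is no larger than a page. The function name and the 32 KiB cache size below are hypothetical; only the 8KB page size comes from the question.

```python
def min_vipt_associativity(cache_bytes: int, page_bytes: int) -> int:
    """Smallest number of ways for which one way fits inside a single page.

    One way holds cache_bytes / ways bytes. If that is at most page_bytes,
    the index and line-offset bits lie within the page offset, so they are
    identical in the virtual and the physical address.
    """
    if cache_bytes <= 0 or page_bytes <= 0:
        raise ValueError("sizes must be positive")
    # Ceiling division; a cache no larger than a page can be direct-mapped.
    return max(1, -(-cache_bytes // page_bytes))


# Hypothetical example: a 32 KiB cache with the 8 KiB pages from the question.
print(min_vipt_associativity(32 * 1024, 8 * 1024))
```

Real designs normally round the result up to a power of two; the sketch leaves that step out.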

Which part of the computer manages cache replacement?

人走茶凉 submitted on 2021-01-29 08:20:40
Question: I haven't found a clear answer: does the control unit itself fetch pre-defined instructions to execute a cache eviction, or does the operating system intervene? If so, how? Answer 1: Which part of the computer manages cache replacement? Typically, a cache manages cache replacement itself (it's not done by a separate part). There are many types of caches; some are implemented in software (DNS cache, web page cache, file data cache) and some are implemented in hardware (instruction caches, data
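To illustrate the answer's point that the cache itself picks the victim, here is a minimal software sketch of a single cache set with a least-recently-used policy. LRU is an assumption chosen for illustration; the excerpt does not name a policy, and hardware caches often use cheaper approximations. The class and method names are hypothetical.

```python
from collections import OrderedDict


class CacheSet:
    """One set of a set-associative cache with LRU replacement (illustrative)."""

    def __init__(self, ways: int):
        self.ways = ways
        self.lines = OrderedDict()  # tag -> None, least recently used first

    def access(self, tag):
        """Return (hit, evicted_tag); evicted_tag is None if nothing was evicted."""
        if tag in self.lines:
            self.lines.move_to_end(tag)  # mark as most recently used
            return True, None
        evicted = None
        if len(self.lines) >= self.ways:
            # The set is full: the cache itself chooses the LRU line as victim.
            evicted, _ = self.lines.popitem(last=False)
        self.lines[tag] = None
        return False, evicted


s = CacheSet(ways=2)
for tag in ["A", "B", "A", "C"]:
    print(tag, s.access(tag))
```

No instruction stream or operating-system call appears in the eviction path; the decision is made entirely inside `access`.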

Can we use non-temporal mov instructions on heap memory?

天涯浪子 submitted on 2021-01-28 05:08:27
Question: In Agner Fog's "Optimizing subroutines in assembly language", section 11.8 "Cache control instructions", he says: "Memory writes are more expensive than reads when cache misses occur in a write-back cache. A whole cache line has to be read from memory, modified, and written back in case of a cache miss. This can be avoided by using the non-temporal write instructions MOVNTI, MOVNTQ, MOVNTDQ, MOVNTPD, MOVNTPS. These instructions should be used when writing to a memory location that is unlikely

Unexpected periodic behaviour of an ultra low latency hard real time multi-threaded x86 code

安稳与你 submitted on 2021-01-21 11:22:14
Question: I am running code in a loop for multiple iterations on a dedicated CPU with RT priority and want to observe its behaviour over a long time. I found a very strange periodic behaviour of the code. Briefly, this is what the code does: Arraythread { while(1) { if(flag) Multiply matrix record time; reset flag; } } mainthread { for(30 mins) { set flag; record time; busy while(500 μs) } } Here are the details about the machine I am using: CPU: Intel(R) Xeon(R) Gold 6230 CPU @ 2.10 GHz L1 cache: 32K

Unexpected periodic behaviour of an ultra low latency hard real time multi-threaded x86 code

我与影子孤独终老i submitted on 2021-01-21 11:20:38
Question: I am running code in a loop for multiple iterations on a dedicated CPU with RT priority and want to observe its behaviour over a long time. I found a very strange periodic behaviour of the code. Briefly, this is what the code does: Arraythread { while(1) { if(flag) Multiply matrix record time; reset flag; } } mainthread { for(30 mins) { set flag; record time; busy while(500 μs) } } Here are the details about the machine I am using: CPU: Intel(R) Xeon(R) Gold 6230 CPU @ 2.10 GHz L1 cache: 32K

Unexpected periodic behaviour of an ultra low latency hard real time multi-threaded x86 code

a 夏天 submitted on 2021-01-21 11:19:03
Question: I am running code in a loop for multiple iterations on a dedicated CPU with RT priority and want to observe its behaviour over a long time. I found a very strange periodic behaviour of the code. Briefly, this is what the code does: Arraythread { while(1) { if(flag) Multiply matrix record time; reset flag; } } mainthread { for(30 mins) { set flag; record time; busy while(500 μs) } } Here are the details about the machine I am using: CPU: Intel(R) Xeon(R) Gold 6230 CPU @ 2.10 GHz L1 cache: 32K

Is there a way to flush the entire CPU cache related to a program?

荒凉一梦 submitted on 2020-12-29 12:08:36
Question: On x86-64 platforms, the CLFLUSH assembly instruction flushes the cache line corresponding to a given address. Instead of flushing the cache line for a specific address, is there a way to flush the entire cache (either the cache contents related to the program being executed, or the entire cache), for example by filling it with dummy contents (or by any other approach I am not aware of): using only standard C++17? Using standard C++17 and compiler intrinsics if necessary? What

Is there a way to flush the entire CPU cache related to a program?

亡梦爱人 submitted on 2020-12-29 12:00:17
Question: On x86-64 platforms, the CLFLUSH assembly instruction flushes the cache line corresponding to a given address. Instead of flushing the cache line for a specific address, is there a way to flush the entire cache (either the cache contents related to the program being executed, or the entire cache), for example by filling it with dummy contents (or by any other approach I am not aware of): using only standard C++17? Using standard C++17 and compiler intrinsics if necessary? What

MSI: Why do we need to write the line back when other CPU is going to override it?

帅比萌擦擦* submitted on 2020-12-12 11:51:53
Question: In the book "Computer Architecture" by Hennessy/Patterson, 5th ed., on page 360, they describe the MSI protocol and write something like: If the line is in state "Exclusive" (Modified), then on receiving "Write Miss" from the bus the current CPU 1) writes the line back to the bus, and then 2) goes into the "Invalid" state. Why do we need to write back the line if it will be overwritten anyway by the subsequent write from the other CPU? Is it connected with the fact that every CPU should see the same
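One commonly given reason, offered here as an assumption since the excerpt ends before any answer, is that a store usually covers only part of a cache line, so the writing CPU still needs the current contents of the other bytes. The sketch below models a 4-byte line as a Python list; the function and parameter names are hypothetical and the model ignores every other aspect of the protocol.

```python
def remote_write(memory, owner_line, offset, value, write_back):
    """Model CPU1 writing one byte to a line that CPU0 holds in Modified state.

    memory     -- the (stale) copy of the line in main memory, as a list
    owner_line -- CPU0's dirty copy of the same line
    write_back -- whether CPU0 supplies its data before invalidating
    Returns the line as CPU1 holds it after its write.
    """
    if write_back:
        memory[:] = owner_line  # CPU0 writes its dirty line back
    new_line = list(memory)     # CPU1 fetches the line on its write miss
    new_line[offset] = value    # CPU1 changes only one byte of the line
    return new_line


stale = [0, 0, 0, 0]
dirty = [1, 2, 3, 4]
print(remote_write(list(stale), dirty, 0, 9, write_back=True))   # keeps CPU0's other bytes
print(remote_write(list(stale), dirty, 0, 9, write_back=False))  # CPU0's other bytes are lost
```

With the write-back, CPU1 ends up with CPU0's updates to the three bytes it did not touch; without it, those updates disappear.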

MSI: Why do we need to write the line back when other CPU is going to override it?

 ̄綄美尐妖づ submitted on 2020-12-12 11:51:28
Question: In the book "Computer Architecture" by Hennessy/Patterson, 5th ed., on page 360, they describe the MSI protocol and write something like: If the line is in state "Exclusive" (Modified), then on receiving "Write Miss" from the bus the current CPU 1) writes the line back to the bus, and then 2) goes into the "Invalid" state. Why do we need to write back the line if it will be overwritten anyway by the subsequent write from the other CPU? Is it connected with the fact that every CPU should see the same