computer-architecture | 易学教程

Interrupt masking: why?

Question: I was reading up on interrupts. It is possible to suspend non-critical interrupts via a special interrupt mask. This is called interrupt masking. What I don't know is when or why you might want or need to temporarily suspend interrupts. Possibly for semaphores, or when programming in a multi-processor environment?

Answer 1: The OS does that when it prepares to run its own "let's orchestrate the world" code. For example, at some point the OS thread scheduler has control. It prepares the processor registers

Difference between ISR and Function Call?

Question: I want to understand the difference between an ISR (Interrupt Service Routine) and a function call. I feel that a function call and an ISR are the same from the hardware perspective. Please correct me if I am wrong. All I could find about ISRs and function calls is as follows:

ISR:
- Asynchronous event that can occur at any time during the execution of the program
- Saves the PC, flags and registers on the stack, disables all interrupts and loads the address of the ISR
- ISR cannot have arguments that can

Cache eviction from L1 cache on L2 eviction

I have a basic question about the policy followed by the memory system. Consider a core with private L1 and L2 caches. After the L2 cache there is a bus on which the coherence traffic runs. Now, if the cache line for address X is evicted from the L2 cache, is it necessary to evict that address from the L1 cache as well? The reason for evicting it could be that this helps maintain an invariant of the coherence protocol (if a line is invalid in L2, this core does not hold that address).

There are three different designs and all are used. Exclusive: Data in the L1 cache is never in the L2 cache. Data in

Why are conditionally executed instructions not present in later ARM instruction sets?

Naively, conditionally executed instructions seem like a great idea to me. As I read more about ARM (and ARM-like) instruction sets (Thumb2, Unicore, AArch64), I find that they all lack the bits for conditional execution. Why is conditional execution missing from each of these? Was conditional execution a mistake at the time, or have subsequent changes made it an expensive waste of instruction bits?

auselen: The general claim is that modern systems have better branch predictors and compilers are much more advanced, so the cost of conditional execution in instruction encoding space is not justified. This is from ARMv8

How is RAM able to access any place in memory at O(1) speed

We are taught that the abstraction of RAM is a long array of bytes, and that it takes the CPU the same amount of time to access any part of it. What is the device that is able to access any byte out of the 4 gigabytes (on my computer) in the same time? This does not seem like a trivial task to me. I have asked colleagues and my professors, but nobody can pinpoint how this can be achieved with simple logic gates, and if it isn't just a tricky combination of logic gates, then what is it? My personal guess is that you could achieve the access of any memory in

Write a program to get CPU cache sizes and levels

I want to write a program to get my cache sizes (L1, L2, L3). I know the general idea:

- Allocate a big array.
- Access a part of it of a different size each time.

So I wrote a little program. Here's my code:

    #include <cstdio>
    #include <time.h>
    #include <sys/mman.h>

    const int KB = 1024;
    const int MB = 1024 * KB;
    const int data_size = 32 * MB;
    const int repeats = 64 * MB;
    const int steps = 8 * MB;
    const int times = 8;

    // Current wall-clock time in nanoseconds.
    long long clock_time() {
        struct timespec tp;
        clock_gettime(CLOCK_REALTIME, &tp);
        return (long long)(tp.tv_nsec + (long long)tp.tv_sec * 1000000000ll);
    }

    int main() {
        // allocate memory

How cache memory works?

Today in my computer organization class, the teacher talked about something that interested me. When explaining why cache memory works, he said that in

    for (i=0; i<M; i++)
        for (j=0; j<N; j++)
            X[i][j] = X[i][j] + K;  // X is double (8 bytes)

it is not good to swap the first line with the second. What is your opinion on this? And why is it like that?

Locality of reference. Because the data is stored by rows, for each row the j columns are in adjacent memory addresses. The hardware will typically load an entire cache line from memory into the cache and adjacent address references will likely

Is there a code that results in 50% branch prediction miss?

The problem: I'm trying to figure out how to write code (C preferred, ASM only if there is no other solution) that makes branch prediction miss in 50% of cases. So it has to be a piece of code that is "immune" to compiler optimizations related to branching, and the hardware branch predictor should not do better than 50% (tossing a coin). An even greater challenge is being able to run the code on multiple CPU architectures and get the same 50% miss ratio. I managed to write code that reaches a 47% branch miss ratio on an x86 platform. I suspect the missing 3% could come

Which standard C++ features can be used for querying machine/OS architecture?

What are the standard C++ features and utilities for querying the properties of the hardware or the operating system capabilities on which the program is running? For instance, std::thread::hardware_concurrency() gives you the number of threads the machine supports. But how do you detect how much RAM the computer has, or how much RAM the process is using, or how much disk space is available to write to in a certain directory, or how much L2 cache is available? I would prefer answers based on the C++ (C++14) standard, but TR2 or Boost proposals would be good as well.

As others have pointed out,

What are “non-virtualizable” instructions in x86 architecture?

Before the advent of hardware-assisted virtualization, there were instructions that could not be virtualized for various reasons. Can somebody please explain what those instructions are and why they cannot be virtualized?

To virtualize an ISA, certain requirements must be met. Popek and Goldberg used something like the following: A machine has at least two modes: (a) user mode and (b) system mode. Typically, applications run in user mode and the operating system runs in system mode. In system mode, the code/program can see and manipulate the machine without restrictions. In user mode, the