cpu-architecture

How does direct mapped cache work?

人盡茶涼 submitted on 2019-11-28 03:23:13
I am taking a System Architecture course and I have trouble understanding how a direct-mapped cache works. I have looked in several places, and each explains it in a different manner, which gets me even more confused. What I cannot understand is what the tag and index are, and how they are selected. The explanation from my lecture is: "The address is divided into two parts: the index (e.g. 15 bits), used to address (32k) RAMs directly; the rest of the address, the tag, is stored and compared with the incoming tag." Where does that tag come from? It cannot be the full address of the memory location in RAM, since it renders…

How does an assembly instruction turn into voltage changes on the CPU?

若如初见. submitted on 2019-11-28 03:06:27
I've been working in C and CPython for the past 3-5 years. Consider that my base of knowledge here. If I were to issue an assembly instruction such as MOV AL, 61h to a processor that supported it, what exactly is inside the processor that interprets this code and dispatches it as voltage signals? How would such a simple instruction likely be carried out? Assembly even feels like a high-level language when I try to think of the multitude of steps contained in MOV AL, 61h or even XOR EAX, EBX. EDIT: I read a few comments asking why I put this as embedded when the x86 family is not common in…

How do SMP cores, processes, and threads work together exactly?

半腔热情 submitted on 2019-11-28 03:06:11
On a single-core CPU, each process runs in the OS, and the CPU jumps around from one process to another to best utilize itself. A process can have many threads, in which case the CPU runs through these threads when it is running the respective process. Now, on a multi-core CPU: do the cores all run in the same process together, or can the cores run separately in different processes at one particular point in time? For instance, say you have program A running two threads. Can a dual-core CPU run both threads of this program? I think the answer should be yes if we are using something like OpenMP.

Dependent loads reordering in CPU

痞子三分冷 submitted on 2019-11-28 01:14:51
I have been reading Memory Barriers: A Hardware View for Software Hackers, a very popular article by Paul E. McKenney. One of the things the paper highlights is that very weakly ordered processors, like Alpha, can reorder dependent loads, which seems to be a side effect of partitioned caches. Snippet from the paper:

```c
struct el *insert(long key, long data)
{
    struct el *p;
    p = kmalloc(sizeof(*p), GPF_ATOMIC);
    spin_lock(&mutex);
    p->next = head.next;
    p->key = key;
    p->data = data;
    smp_wmb();
    head.next = p;
    spin_unlock(&mutex);
}

struct el *search(long key)
{
```

How does 32-bit address 4GB if 2^32 = 4 billion bits, not bytes?

半世苍凉 submitted on 2019-11-28 00:28:31
Essentially, how does 4Gb turn into 4GB? If the memory is addressing bytes, should not the possibilities be 2^(32/8)? It depends on what unit you address. If you use 32 bits to address each bit, you can address 2^32 bits, or 4Gb = 512MB. If you address bytes, like most current architectures, that gives you 4GB. But if you address much larger blocks, you need fewer bits to address 4GB. For example, if you address each 512-byte block (2^9 bytes), you can address 4GB with 23 bits. FAT16 uses 16 bits to address (maximum) 64KB clusters and can therefore address a maximum 4GB volume. The same…

Associativity gives us parallelizability. But what does commutativity give?

和自甴很熟 submitted on 2019-11-27 22:52:05
Alexander Stepanov notes in one of his brilliant lectures at A9 (highly recommended, by the way) that the associative property gives us parallelizability, an extremely useful and important trait these days that compilers, CPUs and programmers themselves can leverage:

```cpp
// expressions in parentheses can be done in parallel,
// because matrix multiplication is associative
Matrix X = (A * B) * (C * D);
```

But what, if anything, does the commutative property give us? Reordering? Out-of-order execution? Some architectures, x86 being a prime example, have instructions where one of the sources is also…

Does an x86 CPU reorder instructions?

怎甘沉沦 submitted on 2019-11-27 22:34:08
I have read that some CPUs reorder instructions, but that this is not a problem for single-threaded programs (the instructions are still reordered in single-threaded programs, but it appears as if they were executed in order); it is only a problem for multithreaded programs. To solve the problem of instruction reordering, we can insert memory barriers in the appropriate places in the code. But does an x86 CPU reorder instructions? If it does not, then there is no need to use memory barriers, right?

Reordering: Yes, all modern x86 chips from Intel and AMD aggressively reorder…

On what architectures is calculating invalid pointers unsafe?

非 Y 不嫁゛ submitted on 2019-11-27 22:17:37
int* a = new int[5] - 1; This line by itself invokes undefined behavior according to the C++ standard, because a is an invalid pointer, not a one-past-the-end pointer. At the same time, this is a zero-overhead way of making a 1-based array (the first element is a[1]), which I need for a project of mine. I'm wondering whether this is something I need to avoid, or whether the C++ standard is just being conservative to support some bizarre architectures that my code is never going to run on anyway. So the question…

Setup targeting both x86 and x64?

孤人 submitted on 2019-11-27 21:17:27
I have a program that requires both x64 and x86 DLLs (it figures out which ones it needs at run time), but when trying to create a setup, it complains:

File 'AlphaVSS.WinXP.x64.dll' targeting 'AMD64' is not compatible with the project's target platform 'x86'
File 'AlphaVSS.Win2003.x64.dll' targeting 'AMD64' is not compatible with the project's target platform 'x86'
File 'AlphaVSS.Win2008.x64.dll' targeting 'AMD64' is not compatible with the project's target platform 'x86'

How can I make my setup target both platforms like my program does? The MSI created by the setup project (in Visual Studio) can…

How are instructions differentiated from data?

[亡魂溺海] submitted on 2019-11-27 20:42:13
While reading the ARM core document, I had this doubt: how does the CPU differentiate data read from the data bus, deciding whether to execute it as an instruction or to operate on it as data? Refer to this excerpt from the document: "Data enters the processor core through the Data bus. The data may be an instruction to execute or a data item." Thanks in advance for enlightening me! /MS

Each opcode will consist of an instruction of N bytes, which then expects the subsequent M bytes to be data (memory pointers etc.). So the CPU uses each opcode to determine how many of the following bytes are data.