cpu-architecture | 易学教程

Cache bandwidth per tick for modern CPUs

Question: What is the cache access speed of modern CPUs? How many bytes can be read from or written to memory per processor clock tick by Intel P4, Core2, Core i7, and AMD CPUs? Please answer with both theoretical numbers (width of the ld/sd unit and its throughput in uOPs/tick) and practical ones (even memcpy speed tests or the STREAM benchmark), if any. PS: this question is about the maximal rate of load/store instructions in assembler. There can be theoretical rate of loading (all Instructions Per Tick are widest

How does direct mapped cache work?

Question: I am taking a System Architecture course and I have trouble understanding how a direct mapped cache works. I have looked in several places, and each explains it differently, which confuses me even more. What I cannot understand is what the Tag and Index are, and how they are selected. The explanation from my lecture is: "Address is divided into two parts: index (e.g. 15 bits), used to address (32k) RAMs directly; rest of address, tag, is stored and compared with incoming tag." Where
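The split described in the quoted lecture text can be sketched in a few lines. This is only an illustration under stated assumptions: the 15-bit index comes from the quote, while the absence of offset bits (one entry per address), the names `split_address` and `DirectMappedCache`, and the replace-on-miss behaviour are hypothetical choices, not something the question or the lecture specifies.

```python
INDEX_BITS = 15  # from the quoted lecture text: 15 bits address 32K entries


def split_address(addr, index_bits=INDEX_BITS):
    """Split an address into (tag, index).

    Assumption: no block-offset bits, so the low `index_bits` bits are
    the index and everything above them is the tag.
    """
    index = addr & ((1 << index_bits) - 1)  # low bits select the cache entry
    tag = addr >> index_bits                # remaining high bits identify the address
    return tag, index


class DirectMappedCache:
    """Toy direct mapped cache that stores only tags (no data)."""

    def __init__(self, index_bits=INDEX_BITS):
        self.index_bits = index_bits
        self.tags = [None] * (1 << index_bits)  # one stored tag per index

    def access(self, addr):
        """Return True on a hit, False on a miss; a miss replaces the entry."""
        tag, index = split_address(addr, self.index_bits)
        hit = self.tags[index] == tag  # compare stored tag with incoming tag
        self.tags[index] = tag         # on a miss, the new tag evicts the old one
        return hit
```

Two addresses that differ only in their tag bits map to the same index, so in this sketch they evict each other. That is the defining limitation of a direct mapped cache.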

How is CPU usage calculated?

Question: On my desktop, I have a little widget that tells me my current CPU usage. It also shows the usage for each of my two cores. I have always wondered: how does the CPU calculate how much of its processing power is being used? Also, if the CPU is tied up doing some intense calculations, how can it (or whatever handles this activity) examine the usage without getting tied up as well? Answer 1: The CPU doesn't do the usage calculations by itself. It may have hardware features to make that task easier, but
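One common way for software to derive a usage figure is to sample cumulative busy and idle time counters twice and compare the differences. The sketch below assumes such counters are available as `(busy_ticks, idle_ticks)` pairs. That sample format and the function name `cpu_usage_percent` are hypothetical and are not taken from the answer above.

```python
def cpu_usage_percent(prev, curr):
    """Estimate CPU usage between two samples.

    Each sample is a (busy_ticks, idle_ticks) pair of cumulative counters
    (an assumed format for this illustration). Usage is the share of
    elapsed ticks that were spent busy.
    """
    busy = curr[0] - prev[0]
    idle = curr[1] - prev[1]
    total = busy + idle
    if total <= 0:
        return 0.0  # no time elapsed between samples, nothing to report
    return 100.0 * busy / total


# Example: 75 busy ticks and 25 idle ticks elapsed between the samples.
usage = cpu_usage_percent((100, 900), (175, 925))
```

Because the counters are only read at sampling time, the measurement itself costs little even when the processor is fully loaded.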

Difference between physical addressing and virtual addressing concept

Question: This is a re-submission, because I am not getting any response from superuser.com. Sorry for the misunderstanding. I need to know the difference between physical addressing and virtual addressing in embedded systems. Why is virtual addressing implemented in embedded systems? What is the advantage of virtual addressing over physical addressing in embedded systems? How is the mapping from virtual addresses to physical addresses done in embedded

Why is it better to use the ebp than the esp register to locate parameters on the stack?

Question: I am new to MASM and I am confused about these pointer registers. I would really appreciate your help. Thanks. Answer 1: Encoding an addressing mode using [ebp + disp8] is one byte shorter than [esp + disp8], because using ESP as a base register requires a SIB byte. See "rbp not allowed as SIB base?" for details. (That question's title refers to the fact that [ebp] has to be encoded as [ebp+0].) The first time [esp + disp8] is used after a push or pop, or after a call, will require
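The one-byte difference mentioned in the answer can be seen by building the machine code by hand. The sketch below assumes 32-bit mode and the instruction `mov eax, [base + disp8]` (opcode `8B /r`). The helper name `encode_mov_eax_from` is hypothetical and exists only for this illustration.

```python
# 32-bit register numbers as used in the ModRM r/m field.
REG_NUM = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
           "esp": 4, "ebp": 5, "esi": 6, "edi": 7}


def encode_mov_eax_from(base, disp8):
    """Encode `mov eax, [base + disp8]` for 32-bit mode (illustrative helper)."""
    if not -128 <= disp8 <= 127:
        raise ValueError("disp8 must fit in a signed byte")
    rm = REG_NUM[base]
    # ModRM: mod=01 (disp8 follows), reg=000 (EAX), r/m=base register.
    modrm = 0x40 | rm
    code = [0x8B, modrm]
    if base == "esp":
        # r/m=100 does not mean ESP; it means "a SIB byte follows".
        # SIB 0x24: scale=00, index=100 (none), base=100 (ESP).
        code.append(0x24)
    code.append(disp8 & 0xFF)
    return bytes(code)


ebp_form = encode_mov_eax_from("ebp", 8)  # 8B 45 08
esp_form = encode_mov_eax_from("esp", 8)  # 8B 44 24 08
```

The ESP form carries the extra SIB byte, so it is one byte longer than the EBP form for the same displacement.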

Is it possible to “abort” when loading a register from memory rather the triggering a page fault?

Question: I am thinking about 'Minimizing page faults (and TLB faults) while “walking” a large graph'. 'How to know whether a pointer is in physical memory or it will trigger a Page Fault?' is a related question that looks at the problem from the other side, but it does not have a solution. I wish to be able to load some data from memory into a register, but have the load abort rather than trigger a page fault if the memory is currently paged out. I need the code to work in user space on both Windows and

How does MIPS I forward from EX to ID for branches without stalling?

Question: addiu $6,$6,5 bltz $6,$L5 nop ... $L5: Is that safe on MIPS I? If so, how? Original MIPS I is a classic 5-stage RISC design (IF ID EX MEM WB) that hides all of its branch latency with a single branch-delay slot by checking branch conditions early, in the ID stage. (That is why it is limited to equal/not-equal, or sign-bit checks like lt or ge zero, not lt between two registers, which would need carry propagation through an adder.) Doesn't this mean that branches need their input ready a cycle

How do SMP cores, processes, and threads work together exactly?

Question: On a single-core CPU, each process runs in the OS, and the CPU jumps from one process to another to best utilize itself. A process can have many threads, in which case the CPU runs through these threads while it is running that process. Now, on a multi-core CPU: do the cores all run in every process together, or can the cores run separately in different processes at a given point in time? For instance, you have program A running two threads. Can a dual core CPU run

x86 registers: MBR/MDR and instruction registers

Question: From what I have read, the IA-32 architecture has ten 32-bit and six 16-bit registers. The 32-bit registers are as follows. Data registers: EAX, EBX, ECX, EDX. Pointer registers: EIP, ESP, EBP. Index registers: ESI, EDI. Control registers: EFLAGS (EIP is also classified as a control register). The 16-bit registers are as follows. Code Segment: it contains all the instructions to be executed. Data Segment: it contains data, constants and work areas. Stack Segment: it contains data and return

How instructions are differentiated from data?

Question: While reading an ARM core document, I ran into this question. How does the CPU decide whether data read from the data bus should be executed as an instruction or treated as data to operate on? Refer to this excerpt from the document: "Data enters the processor core through the Data bus. The data may be an instruction to execute or a data item." Thanks in advance for enlightening me! /MS Answer 1: Each opcode will consist of an instruction of N bytes, which then expects the subsequent M bytes to be