cpu-architecture | 易学教程

Why does the lw instruction's second argument take in both an offset and regSource?


Question: So the lw instruction is in the following format: lw RegDest, Offset(RegSource). Why does the second argument take in both an offset and a register source? Why not only one (i.e. only the register source)? Answer 1: Because what else are you going to do with the rest of the 32-bit instruction word? (Assuming you're the CPU architect designing the MIPS instruction set.) Leaving out the 16-bit immediate displacement can't make the instruction shorter, because MIPS is a RISC with fixed-length instruction
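To make the "rest of the 32-bit instruction word" point concrete, here is a minimal sketch of how an lw could be packed into a MIPS32 I-type word: a 6-bit opcode, two 5-bit register fields, and a 16-bit signed immediate. The helper names `encode_lw` and `decode_lw` are hypothetical and exist only for this illustration; the field layout and the opcode value 0x23 follow the standard MIPS32 encoding.

```python
LW_OPCODE = 0x23  # MIPS32 opcode for lw


def encode_lw(rt, offset, rs):
    """Pack `lw rt, offset(rs)` into one 32-bit I-type word (hypothetical helper)."""
    if not -32768 <= offset <= 32767:
        raise ValueError("offset must fit in a signed 16-bit immediate")
    if not (0 <= rt < 32 and 0 <= rs < 32):
        raise ValueError("register numbers must be in 0..31")
    # opcode: bits 31..26, rs: bits 25..21, rt: bits 20..16, immediate: bits 15..0
    return (LW_OPCODE << 26) | (rs << 21) | (rt << 16) | (offset & 0xFFFF)


def decode_lw(word):
    """Unpack an lw word into (rt, offset, rs), sign-extending the immediate."""
    if (word >> 26) & 0x3F != LW_OPCODE:
        raise ValueError("not an lw instruction")
    rs = (word >> 21) & 0x1F
    rt = (word >> 16) & 0x1F
    imm = word & 0xFFFF
    offset = imm - 0x10000 if imm & 0x8000 else imm
    return rt, offset, rs


if __name__ == "__main__":
    # lw $t0, 8($sp): $t0 is register 8, $sp is register 29
    word = encode_lw(8, 8, 29)
    print(hex(word))
    print(decode_lw(word))
```

The opcode and the two register numbers use only 16 of the 32 bits, so the remaining 16 bits are available for the displacement at no cost in instruction size.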

Running my generated .jar yields “Can't load this .dll (machine code=0xbd) on a AMD 64-bit platform”


Question: I have a project which is using some native libraries (.dll). I'm using Netbeans, and I've specified java.library.path in the run configuration. Running the project from Netbeans yields no errors. I'm using Maven, and when building the project my jar is built in the target folder. I'm copying all the .dlls and my program's dependencies to target/lib with the Maven resources and dependency plugins. When I try to run my application "outside" of Netbeans, I get the following error: Can't load this

How is the bootstrap processor (BSP) selected on Intel ring and mesh architectures


Question: Section 2.13.2 mentions that the arbitration ID is used to determine which processor issues the no-op cycle first, and I have seen this in multiple sources and in the Intel manual. The Intel manual that references the MP initialisation sequence only addresses the Pentium 4, when there was a 'system bus'; before that there was originally an 'APIC bus'. I am under the impression that the arbitration ID was only needed in those architectures where multiple CPUs shared the same bus. But now, with the

What kind of address instruction does the x86 cpu have?


Question: I learned about one-address, two-address, and three-address instructions, but now I'd like to know: what kind of address instruction does x86 use? Answer 1: x86 is a register machine, where at most 1 operand for any instruction can be an explicit memory address instead of a register, using an addressing mode like [rdi + rax*4]. (There are instructions which can have 2 memory operands with one or both being implicit, though: What x86 instructions take two (or more) memory operands?) Typical x86
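As a rough illustration of what an addressing mode like [rdi + rax*4] computes, the sketch below evaluates the general x86-64 form base + index*scale + displacement, wrapped to 64 bits. The function name `effective_address` and its parameters are assumptions made for this example, not part of any real tool or library.

```python
MASK64 = (1 << 64) - 1  # effective addresses wrap at 64 bits in 64-bit mode


def effective_address(base, index=0, scale=1, disp=0):
    """Compute base + index*scale + disp for an x86-style memory operand.

    Hypothetical helper: `base` and `index` stand for the current register
    values (e.g. rdi and rax), `scale` is the hardware scale factor.
    """
    if scale not in (1, 2, 4, 8):
        raise ValueError("x86 scale factor must be 1, 2, 4, or 8")
    return (base + index * scale + disp) & MASK64


if __name__ == "__main__":
    # [rdi + rax*4] with rdi = 0x1000 and rax = 3: the fourth 4-byte element
    print(hex(effective_address(0x1000, index=3, scale=4)))
```

A single explicit memory operand of this form can name an array element directly, which is why one such operand per instruction already covers many common access patterns.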

How is Carry Flag set when subtrahend is larger?


Question: I know the Carry flag during SUB is set whenever the minuend is smaller than the subtrahend and a borrow is required, but I haven't been able to find anything explaining this in more detail. Since subtraction is actually just adding with two's complement, how does the CPU know that the subtrahend is larger and a borrow has occurred? The only thing I can think of is that maybe the Carry flag is set automatically during SUB, whenever converting the subtrahend to its 2's complement. Then unless
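One way to see the relationship the question asks about is to model an 8-bit subtraction as a + (~b) + 1 and look at the adder's carry-out. The sketch below assumes the x86 convention, where the Carry flag after SUB reports a borrow and is therefore the inverse of that carry-out; `sub8` is a hypothetical helper written for this illustration, not a description of any particular CPU's circuitry.

```python
def sub8(a, b):
    """Model 8-bit a - b as a + ~b + 1; return (result, carry_flag).

    Assumption: x86-style flag, where CF = 1 means a borrow occurred.
    """
    a &= 0xFF
    b &= 0xFF
    total = a + (~b & 0xFF) + 1      # add the two's complement of b
    result = total & 0xFF            # keep the low 8 bits
    carry_out = (total >> 8) & 1     # ninth bit produced by the adder
    carry_flag = carry_out ^ 1       # no carry-out from the adder means a borrow
    return result, carry_flag


if __name__ == "__main__":
    print(sub8(5, 3))  # minuend larger: no borrow
    print(sub8(3, 5))  # subtrahend larger: borrow, result wraps around
```

When the minuend is at least as large as the subtrahend, the addition overflows past bit 7 and produces a carry-out; when it is smaller, the sum stays below 256 and no carry-out appears, which is exactly the borrow case.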

Detect -xarch option in the preprocessor?


Question: I'm using Sun Studio 12.4 and 12.5 on Solaris 11. We have a source file that provides a straight C/C++ implementation of CRC32, or an optimized version of CRC32 using Intel intrinsics. At runtime, a function pointer is populated with the proper implementation. Testing on an x86 server with dual Xeons is producing the following because we are making code paths available based on compiler versions. SunCC 12.1 added support for SSE4 (if I parsed the matrix properly), so we attempt to enable it

Do you expect that future CPU generations are not cache coherent?


Question: I'm designing a program and I found that assuming implicit cache coherency makes the design much, much easier. For example, my single-writer (always the same thread), multiple-reader (always other threads) scenarios do not use any mutexes. It's not a problem for current Intel CPUs. But I want this program to generate income for at least the next ten years (a short time for software), so I wonder if you think this could be a problem for future CPU architectures. Answer 1: I suspect that future CPU

How does branch prediction interact with the instruction pointer


Question: It's my understanding that at the beginning of a processor's pipeline, the instruction pointer (which points to the address of the next instruction to execute) is updated by the branch predictor after fetching, so that this new address can then be fetched on the next cycle. However, if the instruction pointer is modified early on in the pipeline, wouldn't this affect instructions currently in the execute phase that might rely on the old instruction pointer value? For instance, when doing a

Unexpected lower access time in multiple process scenario as compared to single process scenario


Question: I am accessing a shared library (a shared array data structure) from program1 and measuring the access time to read all elements of that array. I got around 17000 ticks when only program1 was running. Then I ran program2 (which has an empty while loop to keep it from terminating) in another tab first, ran program1, and measured the access time to read all elements of the array again. To my surprise, I now get 8000 ticks, compared to the previous scenario where only program1 was executing. It

Is there any architecture that uses the same register space for scalar integer and floating point operations?


Question: Most architectures I've seen that support native scalar hardware FP shove it off into a completely separate register space, separate from the main set of registers. x86's legacy x87 FPU uses a partially separate floating-point "stack machine" (read: basically a fixed-size 8-item ring buffer) with registers st(0