cpu-architecture | 易学教程

On what architectures is calculating invalid pointers unsafe?

int* a = new int[5] - 1; This line by itself invokes undefined behavior according to the C++ standard, because a is an invalid pointer and not one-past-the-end. At the same time, this is a zero-overhead way of making a 1-based array (the first element is a[1]), which I need for a project of mine. I'm wondering whether this is something I need to avoid, or whether the C++ standard is just being conservative to support some bizarre architectures that my code is never going to run on anyway. So the question is: on what architectures will this be a problem? Are any of those widespread? Edit: To see that the
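One way to get 1-based indexing without ever forming the out-of-range pointer is to apply the offset to the integer index instead of to the pointer. The sketch below is only an illustration: OneBased and sum_one_to_five are hypothetical names that do not appear in the original question, and the claim that an optimizing compiler can usually fold the "- 1" into the address computation is an expectation, not a guarantee.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the original question): a 1-based view over
// a std::vector that never computes a pointer before the start of storage.
template <typename T>
class OneBased {
public:
    explicit OneBased(std::size_t n) : data_(n) {}

    // Valid indices are 1..size(). The subtraction is done on the integer
    // index, so only in-range element addresses are ever formed.
    T& operator[](std::size_t i) {
        assert(i >= 1 && i <= data_.size());
        return data_[i - 1];
    }
    const T& operator[](std::size_t i) const {
        assert(i >= 1 && i <= data_.size());
        return data_[i - 1];
    }

    std::size_t size() const { return data_.size(); }

private:
    std::vector<T> data_;
};

// Example use: fill a[1]..a[5] with 1..5 and return their sum.
inline int sum_one_to_five() {
    OneBased<int> a(5);
    for (std::size_t i = 1; i <= a.size(); ++i) a[i] = static_cast<int>(i);
    int total = 0;
    for (std::size_t i = 1; i <= a.size(); ++i) total += a[i];
    return total;
}
```

Another option in the same spirit is to allocate one extra element and leave index 0 unused, which trades a little memory for plain pointer indexing.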

Difference between armeabi and armeabi-v7a

Question: As far as I can tell from the docs, the only difference between the two supported flavors of ARM architecture in the Android NDK is the set of supported CPU instructions. Is that really so? Is there no difference in calling conventions, system call sequence, or anything else? I'm wondering what will happen if I compile a module to an ARM object file (with a compiler other than the NDK's, specifically Free Pascal), specifying ARMv6 as the architecture, and then link it to both armeabi and

Why were bitwise operations slightly faster than addition/subtraction operations on older microprocessors?

Question: I came across this excerpt today: "On most older microprocessors, bitwise operations are slightly faster than addition and subtraction operations and usually significantly faster than multiplication and division operations. On modern architectures, this is not the case: bitwise operations are generally the same speed as addition (though still faster than multiplication)." I'm curious about why bitwise operations were slightly faster than addition/subtraction operations on older microprocessors.

What branch misprediction does the Branch Target Buffer detect?

I am currently looking at the various parts of the CPU pipeline that can detect branch mispredictions. I have found these are:
1. Branch Target Buffer (BPU CLEAR)
2. Branch Address Calculator (BA CLEAR)
3. Jump Execution Unit (not sure of the signal name here)
I know what 2 and 3 detect, but I do not understand what misprediction is detected within the BTB. The BAC detects where the BTB has erroneously predicted a branch for a non-branch instruction, where the BTB has failed to detect a branch, or where the BTB has mispredicted the target address for an x86 RET instruction. The execution unit evaluates the

Cache bandwidth per tick for modern CPUs

How fast is cache access on modern CPUs? How many bytes can be read from or written to memory per processor clock tick by the Intel P4, Core 2, Core i7, and AMD processors? Please answer with both theoretical numbers (width of the load/store unit and its throughput in uops/tick) and practical numbers (even memcpy speed tests or the STREAM benchmark), if any. PS: This question is about the maximal rate of load/store instructions in assembler. There can be a theoretical loading rate (every instruction issued per tick is the widest load), but the processor may sustain only part of it, which is the practical limit of loading. osgx For nehalem: rolfed

What's the purpose of the rotate instructions (ROL, RCL on x86)?

I always wondered what the purpose is of the rotate instructions some CPUs have (ROL and RCL on x86, for example). What kind of software makes use of these instructions? I first thought they might be used for encryption or computing hash codes, but those libraries are usually written in C, which doesn't have operators that map to these instructions. Has anybody found a use for them? Why were they added to the instruction set? Rotates are required for bit shifts across multiple words. When you SHL the lower word, the high-order bit spills out into the carry. To complete the operation, you need to
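The multi-word shift described above can be sketched in portable C++ by modelling the carry flag as an ordinary variable. This is only an illustration under stated assumptions: Wide, shl1 and rotl32 are hypothetical names, and the mapping of the second step to a rotate-through-carry instruction such as RCL is the usual idiom rather than something the truncated excerpt spells out. The rotl32 helper shows the common shift-and-or rotate pattern, which many compilers are able to turn into a single rotate instruction.

```cpp
#include <cstdint>

// A 64-bit value held as two 32-bit words, as it would be on a 32-bit CPU.
struct Wide {
    std::uint32_t lo;
    std::uint32_t hi;
};

// Shift the two-word value left by one bit.
// "carry" plays the role of the CPU carry flag.
inline Wide shl1(Wide v) {
    std::uint32_t carry = v.lo >> 31;  // bit that SHL pushes out of the low word
    v.lo <<= 1;                        // SHL on the low word
    v.hi = (v.hi << 1) | carry;        // rotate-through-carry brings the bit into the high word
    return v;
}

// Rotate a 32-bit word left by n bits (n is reduced modulo 32).
// Masking the right-shift count avoids an out-of-range shift when n == 0.
inline std::uint32_t rotl32(std::uint32_t x, unsigned n) {
    n &= 31u;
    return (x << n) | (x >> ((32u - n) & 31u));
}
```

For example, shl1 applied to {lo = 0x80000001, hi = 0x1} gives {lo = 0x00000002, hi = 0x3}: the top bit of the low word reappears as the bottom bit of the high word.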

What is the difference between x64 and IA-64?

I was on Microsoft's website and noticed two different installers, one for x64 and one for IA-64. Reference: Installing the .NET Framework 4.5, 4.5.1. My understanding is that IA-64 is a subclass of x64, so I'm curious why it would have a separate installer. x64 is used as shorthand for the 64-bit extensions of the "classical" x86 architecture; almost every "normal" PC produced in recent years has a processor based on this architecture. AMD invented the AMD64 extensions; Intel was more or less forced to implement them, and called them first IA-32e, then EM64T and finally Intel 64 (actually,

Difference between physical addressing and virtual addressing concept

This is a re-submission, because I am not getting any response from superuser.com. Sorry for the misunderstanding. I need to know the difference between physical addressing and virtual addressing in embedded systems. Why is virtual addressing implemented in embedded systems? What is the advantage of virtual addressing over physical addressing in embedded systems? How is the mapping from virtual addresses to physical addresses done in embedded systems? Please explain these concepts with some simple examples on a simple architecture. Physical
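As a simple illustration of the mapping asked about above, a single-level page table can be modelled in a few lines. Everything here is an assumption made for the sketch (4 KiB pages, a 16-entry table, and the names PageTableEntry, PageTable and translate); it does not describe any particular embedded MMU.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Assumed toy parameters, chosen only for illustration.
constexpr std::uint32_t kPageBits = 12;                // 4 KiB pages
constexpr std::uint32_t kPageSize = 1u << kPageBits;
constexpr std::size_t kNumPages = 16;                  // tiny virtual address space

// One entry per virtual page: is it mapped, and to which physical frame?
struct PageTableEntry {
    bool present;
    std::uint32_t frame;
};

using PageTable = std::array<PageTableEntry, kNumPages>;

// Translate a virtual address into a physical address.
// Returns false (a "page fault" in this model) if the page is not mapped.
inline bool translate(const PageTable& table, std::uint32_t vaddr,
                      std::uint32_t& paddr) {
    const std::uint32_t vpn = vaddr >> kPageBits;           // virtual page number
    const std::uint32_t offset = vaddr & (kPageSize - 1u);  // offset within the page
    if (vpn >= table.size() || !table[vpn].present) {
        return false;
    }
    paddr = (table[vpn].frame << kPageBits) | offset;       // frame base plus offset
    return true;
}
```

With virtual page 2 mapped to physical frame 7, the virtual address 0x2ABC translates to the physical address 0x7ABC, while an access to an unmapped page such as 0x3000 is reported as a fault. With purely physical addressing there is no such table: the address a program issues is the address that reaches memory.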

Determine target ISA extensions of binary file in Linux (library or executable)

Question: We have an issue related to a Java application running under a (rather old) FC3 on an Advantech POS board with a VIA C3 processor. The Java application has several compiled shared libraries that are accessed via JNI. The VIA C3 processor is supposed to be i686 compatible. Some time ago, after installing Ubuntu 6.10 on a Mini-ITX board with the same processor, I found out that the previous statement is not 100% true. The Ubuntu kernel hung on startup due to the lack of some specific and optional

Parallel programming using Haswell architecture [closed]

I want to learn about parallel programming using Intel's Haswell CPU microarchitecture, in particular about using SIMD (SSE4.2, AVX2) in asm, C, C++, or any other language. Can you recommend books, tutorials, internet resources, or courses? Thanks! Z boson It sounds to me like you need to learn about parallel programming in general on the CPU. I started looking into this about 10 months ago, before I had ever used SSE, OpenMP, or intrinsics, so let me give a brief summary of some important concepts I have learned and some useful resources. There are several parallel computing technologies that can be employed: MIMD, SIMD,