cpu-architecture | 易学教程

Determining the CPU architecture of a static library (LIB) on Windows


Question: I just built libpng on a 64-bit Windows machine using VS2008. It produces a libpng.lib file inside the \projects\visualc71\Win32_Lib_Release directory (the configuration used was "LIB Release"). I used dumpbin to inspect this LIB file:

C:\Temp\libpng-1.4.3>dumpbin projects\visualc71\Win32_LIB_Release\libpng.lib
Microsoft (R) COFF/PE Dumper Version 9.00.30729.01
Copyright (C) Microsoft Corporation. All rights reserved.

Dump of file projects\visualc71\Win32_LIB_Release\libpng.lib

File Type:
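
The usual way to answer this question is `dumpbin /headers`, which prints a `machine (x86)` or `machine (x64)` line for each object in the library. As a rough illustration of what that line is derived from, here is a minimal Python sketch that walks the archive members of a .lib and reads the 2-byte COFF machine field at the start of each object. The function name `lib_machines` is hypothetical, and the sketch assumes ordinary COFF objects: import-library stubs and objects built with whole-program optimization start with different headers and are reported as unknown.

```python
import struct

# A few machine values from the PE/COFF specification.
MACHINE_NAMES = {0x014C: "x86", 0x8664: "x64", 0xAA64: "ARM64"}

def lib_machines(data: bytes) -> set:
    """Return the set of machine types found in a COFF archive (.lib)."""
    if not data.startswith(b"!<arch>\n"):
        raise ValueError("not a COFF archive")
    found = set()
    pos = 8  # skip the 8-byte archive signature
    while pos + 60 <= len(data):
        header = data[pos:pos + 60]          # fixed 60-byte member header
        name = header[:16].rstrip()          # space-padded member name
        size = int(header[48:58].decode("ascii").strip())
        body = pos + 60
        # "/" marks the linker (symbol table) members, "//" the long-name table.
        if name not in (b"/", b"//") and size >= 2:
            (machine,) = struct.unpack_from("<H", data, body)
            found.add(MACHINE_NAMES.get(machine, "unknown (0x%04X)" % machine))
        pos = body + size + (size & 1)       # members are 2-byte aligned
    return found
```

Usage would look like `lib_machines(open("libpng.lib", "rb").read())`. A result containing only `"x86"` would indicate a 32-bit library even though it was built on a 64-bit machine.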

Why denormalized floats are so much slower than other floats, from hardware architecture viewpoint?


Denormals are known to underperform severely, 100x or so, compared to normals. This frequently causes unexpected software problems. I'm curious, from a CPU architecture viewpoint, why do denormals have to be that much slower? Is the lack of performance intrinsic to their unfortunate representation? Or maybe CPU architects neglect them to reduce hardware cost under the (mistaken) assumption that denormals don't matter? In the former case, if denormals are intrinsically hardware-unfriendly, are there known non-IEEE-754 floating point representations that are also gapless near zero, but more
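
For readers unsure what range "denormal" covers, the following Python sketch shows where the subnormal range of an IEEE-754 double begins and ends, and why it makes the format "gapless near zero". The names `SMALLEST_NORMAL`, `SMALLEST_SUBNORMAL`, and `is_subnormal` are illustrative choices, and the sketch says nothing about speed, only about the representation.

```python
import sys

# Smallest positive normal double: 2**-1022.
SMALLEST_NORMAL = sys.float_info.min
# Smallest positive subnormal double: 2**-1022 * 2**-52 = 2**-1074.
SMALLEST_SUBNORMAL = sys.float_info.min * sys.float_info.epsilon

def is_subnormal(x: float) -> bool:
    """True if x is nonzero but smaller in magnitude than any normal double."""
    return x != 0.0 and abs(x) < SMALLEST_NORMAL

# Gradual underflow: the difference of two distinct tiny normals is a
# nonzero subnormal instead of being flushed to zero.
a = SMALLEST_NORMAL
b = 1.5 * SMALLEST_NORMAL
print(is_subnormal(b - a))
```

With subnormals enabled, `b - a` is nonzero whenever `a != b`, which is the property the question calls gapless.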

Is a memory barrier an instruction that the CPU executes, or is it just a marker?


I am trying to understand what a memory barrier is exactly. Based on what I know so far, a memory barrier (for example, mfence) is used to prevent the re-ordering of instructions from before to after and from after to before the memory barrier. This is an example of a memory barrier in use:

instruction 1
instruction 2
instruction 3
mfence
instruction 4
instruction 5
instruction 6

Now my question is: Is the mfence instruction just a marker telling the CPU in what order to execute the instructions? Or is it an instruction that the CPU actually executes like it executes other instructions (for

Advantages of a 64 bit system


From a developer's perspective, I am trying to understand what the selling point of a 64-bit system is. I understand that more registers are at your disposal and more memory can be allocated to a process, but I cannot understand what makes a developer's life easier. Any examples? From a performance perspective, are there any gains seen when a program is run on a 64-bit system rather than a 32-bit one? Cheers! EDIT: Thank you for all your replies. I see some conversations shooting towards end-user experience, important as it may be. I am looking more at any architectural benefits that you can squeeze out. From what

Is there a compiler flag to indicate lack of armv7s architecture


With the iPhone 5 and other armv7s devices now appearing, there are compatibility problems with existing (closed-source) 3rd-party frameworks such as Flurry which are built without this newer architecture. One option is to wait until they release a new build, but I was hoping there might be a compiler flag or something I can use in my Xcode project that would let the linker know not to expect armv7s architecture from this framework, and use the armv7 instead. Does anything like this exist? It's not possible to load a framework which doesn't include the targeted architecture. What you could do

Why isn't there a data bus which is as wide as the cache line size?


When a cache miss occurs, the CPU fetches a whole cache line (typically 64 bytes on x86_64) from main memory into the cache hierarchy. This is done via a data bus, which is only 8 bytes wide on modern 64-bit systems (since the word size is 8 bytes). EDIT: "Data bus" means the bus between the CPU die and the DRAM modules in this context. This data bus width does not necessarily correlate with the word size. Depending on the strategy, the actually requested address gets fetched first, and then the rest of the cache line gets fetched sequentially. It would seem much faster if there was a bus with
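
The arithmetic behind the question can be written out as a small Python sketch. The function name `transfers_per_line` is hypothetical, and the default sizes are simply the 64-byte line and 8-byte bus quoted in the excerpt.

```python
def transfers_per_line(line_bytes: int = 64, bus_bytes: int = 8) -> int:
    """Number of bus transfers needed to move one full cache line."""
    if line_bytes % bus_bytes != 0:
        raise ValueError("cache line must be a multiple of the bus width")
    return line_bytes // bus_bytes

# With the excerpt's numbers, one line fill takes 64 / 8 = 8 transfers.
print(transfers_per_line())
```

Eight transfers per line matches the burst length of 8 used by DDR3 and DDR4 on a 64-bit channel, so a single burst already delivers one whole cache line.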

What is general difference between Superscalar and OoO execution?


I've been reading some material on superscalar and OoO execution and I am confused. I think their architecture diagrams look very much the same. Superscalar microprocessors can execute two or more instructions at the same time. E.g. typically they have at least 2 ALUs (although a superscalar processor might have 1 ALU and some other execution unit, like a shifter or jump unit). (More precisely, superscalar processors can start executing two or more instructions in the same cycle. Pipelined processors can execute more than one instruction at a time, but a non-superscalar pipelined processor will only start

How to target multiple architectures using NDK?


Background: I've recently started to develop some code using the NDK, and I've thought of a possible portability problem that could occur while developing using NDK.

The problem: Since NDK uses native code, it needs to be compiled per CPU architecture. This is a problem since the user needs to run the app no matter what CPU the device has.

Possible solutions I've found so far: I've noticed I can modify the file "jni/Application.mk" and use:

APP_ABI := armeabi armeabi-v7a x86

However, I don't know what I should do from this step on. Will the app contain all of the compiled code for each of the CPU

Does each core have its own private set of registers?


Question: Looking at this Intel Core i7 Nehalem microarchitecture, it seems that each core has its own private register file. So I have a couple of short questions, because I thought that there was only one set of registers, independent of the number of cores. Does each core have its own private set of registers (rax, rbx, rsp and so on)? Does each core have its own MMU and TLB, rather than one shared across all cores? I know the questions are highly microarchitecture dependent but I think the majority of modern x64

How to determine binary image architecture at runtime?


Crash log contains "Binary Images" section with information about architecture (armv6/armv7) and identifier of all loaded modules. How to determine this information at runtime? (at least, just for application executable) NSBundle has method executableArchitectures, but how to determine which architecture is running? Alright time for the long answer. The mach headers of the dyld images in the application contain the information you are looking for. I have added an example that I only tested to work and nothing else so I would not recommend pasting it directly into production code. What it does
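
The answer's own example is not reproduced in this excerpt. As a separate, language-neutral illustration of the fields it refers to, here is a minimal Python sketch that decodes the start of a little-endian Mach-O header and maps the ARM CPU subtype to the names used in crash logs. The function name `macho_arch` is hypothetical. On a device the header bytes would come from dyld (for example through `_dyld_get_image_header`) rather than from a Python byte string, so this only shows the decoding step.

```python
import struct

MH_MAGIC = 0xFEEDFACE       # 32-bit Mach-O
MH_MAGIC_64 = 0xFEEDFACF    # 64-bit Mach-O
CPU_TYPE_ARM = 12
# ARM CPU subtype values from <mach/machine.h>.
ARM_SUBTYPES = {6: "armv6", 9: "armv7", 11: "armv7s"}

def macho_arch(header: bytes) -> str:
    """Decode magic, cputype and cpusubtype from a little-endian Mach-O header."""
    magic, cputype, cpusubtype = struct.unpack_from("<Iii", header)
    if magic not in (MH_MAGIC, MH_MAGIC_64):
        raise ValueError("not a little-endian thin Mach-O header")
    subtype = cpusubtype & 0x00FFFFFF   # drop the capability bits in the top byte
    if cputype == CPU_TYPE_ARM and subtype in ARM_SUBTYPES:
        return ARM_SUBTYPES[subtype]
    return "cputype %d, subtype %d" % (cputype, subtype)
```

For example, a header whose first twelve bytes encode `MH_MAGIC`, CPU type 12, and subtype 9 decodes to `"armv7"`.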