cpu-architecture | 易学教程

Detect CPU Architecture (32-bit / 64-bit) at runtime in Objective-C (Mac OS X)

I'm currently writing a Cocoa application that needs to execute some (console) applications which are optimized for 32-bit and 64-bit. Because of this, I would like to detect which CPU architecture the application is running on so I can start the correct console application. In short: how do I detect whether the application is running on a 64-bit OS? Edit: I know about Mach-O fat binaries; that was not my question. I need to know this so I can start another non-bundled (console) application: one that is optimized for x86 and one for x64. There is a super-easy way. Compile two versions of the ...
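
One way to approach the runtime check asked about above, as a minimal sketch rather than a definitive answer, is to ask the kernel for its machine name with POSIX `uname()` and classify the string. The helper names `machine_name_is_64bit` and `running_on_64bit_kernel` and the list of 64-bit names are assumptions made for illustration; the list is not exhaustive.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/utsname.h>

/* Returns 1 if a uname() machine string names a 64-bit architecture.
 * The list of names is an illustrative assumption, not exhaustive. */
int machine_name_is_64bit(const char *machine)
{
    static const char *const names64[] = { "x86_64", "amd64", "arm64", "aarch64" };

    assert(machine != NULL);
    for (size_t i = 0; i < sizeof names64 / sizeof names64[0]; i++) {
        if (strcmp(machine, names64[i]) == 0)
            return 1;
    }
    return 0;
}

/* Returns 1 if the running kernel reports a 64-bit machine name,
 * 0 if it reports anything else, and -1 if uname() fails. */
int running_on_64bit_kernel(void)
{
    struct utsname info;

    if (uname(&info) != 0)
        return -1;
    return machine_name_is_64bit(info.machine);
}
```

Note that `uname()` describes the running kernel, not the CPU: a 32-bit kernel on a 64-bit processor reports a 32-bit machine name. On Mac OS X, the `hw.cpu64bit_capable` sysctl is an alternative that reports what the CPU itself can do.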

Difference between “machine hardware” and “hardware platform”

Question: My Linux machine reports the following output for "uname -a":

    [root@tom i386]# uname -a
    Linux tom 2.6.9-89.ELsmp #1 SMP Mon Apr 20 10:34:33 EDT 2009 i686 i686 i386 GNU/Linux
    [root@tom i386]#

As per the man page of uname, the entries "i686 i686 i386" denote: machine hardware name (i686), processor type (i686), hardware platform (i386).

Additional information:

    [root@tom i386]# cat /proc/cpuinfo
    <snip>
    vendor_id : GenuineIntel
    CPU family : 6
    model : 15
    model name : Intel(R) Xeon(R) CPU 5148 @ 2.33 GHz

Assembly PC Relative Addressing Mode

I am working on datapaths and have been trying to understand branch instructions. This is what I understand so far. In MIPS, every instruction is 32 bits, which is 4 bytes, so the next instruction is four bytes away. As an example, say the PC address is 128. My first issue is understanding what this 128 means. My current belief is that it is an index into memory, so 128 refers to 128 bytes into memory. Therefore, the datapath always says to add 4 to the PC. Adding 4 bits to the 128 bits makes 132, but this is actually 132 bytes across now (next instruction). This is the way ...
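
The arithmetic in question can be written out as a small C sketch. It assumes classic MIPS32 conventions: the PC is a byte address, so the "+ 4" is 4 bytes (not bits), and the 16-bit branch immediate counts instructions relative to PC + 4. The function names `next_pc` and `branch_target` are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Byte address of the instruction that follows the one at `pc`.
 * Every MIPS32 instruction is 4 bytes, so the PC advances by 4 bytes. */
uint32_t next_pc(uint32_t pc)
{
    assert(pc % 4 == 0);  /* instructions are word-aligned */
    return pc + 4;
}

/* Target of a taken branch: the signed 16-bit immediate counts
 * instructions (words), measured from PC + 4. */
uint32_t branch_target(uint32_t pc, int16_t imm)
{
    int32_t byte_offset = (int32_t)imm * 4;  /* sign-extend, then words -> bytes */
    return next_pc(pc) + (uint32_t)byte_offset;
}
```

With the PC at 128, `next_pc(128)` is 132, a taken branch with immediate 3 lands at 132 + 12 = 144, and an immediate of -1 branches back to 128 itself.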

Cache coherence literature generally only refers to store buffers but not read buffers. Yet one somehow needs both?

When reading about consistency models (namely the x86 TSO), authors in general resort to models where there are a bunch of CPUs, their associated store buffers and their private caches. If my understanding is correct, store buffers can be described as queues where CPUs may put any store instruction they want to commit to memory. So as the name states, they are store buffers. But when I read those papers, they tend to talk about the interaction of loads and stores, with statements such as "a later load can pass an earlier store" which is slightly confusing, as they almost seem to be talking as
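
The store-buffer model described above can be mimicked with a toy simulation. This is a simplified sketch, not a model of any real processor: the `Cpu` type, the function names, and the fixed interleaving are all assumptions for illustration. It shows why "a later load can pass an earlier store" needs no separate read buffer in this model: the store is simply still waiting in the buffer when the load reads memory.

```c
#include <assert.h>
#include <stddef.h>

#define MEM_WORDS 4
#define SB_SLOTS  8

typedef struct { size_t addr; int value; } Store;

typedef struct {
    Store  pending[SB_SLOTS];  /* FIFO of stores not yet visible to other CPUs */
    size_t count;
} Cpu;

int memory[MEM_WORDS];  /* shared memory */

/* A store goes into the CPU's private buffer, not straight to memory. */
void cpu_store(Cpu *cpu, size_t addr, int value)
{
    assert(addr < MEM_WORDS && cpu->count < SB_SLOTS);
    cpu->pending[cpu->count].addr = addr;
    cpu->pending[cpu->count].value = value;
    cpu->count++;
}

/* A load checks the CPU's own buffer first, newest entry first
 * (store-to-load forwarding), and only then reads shared memory. */
int cpu_load(const Cpu *cpu, size_t addr)
{
    assert(addr < MEM_WORDS);
    for (size_t i = cpu->count; i > 0; i--) {
        if (cpu->pending[i - 1].addr == addr)
            return cpu->pending[i - 1].value;
    }
    return memory[addr];
}

/* Drain the buffer to memory in program order. */
void cpu_flush(Cpu *cpu)
{
    for (size_t i = 0; i < cpu->count; i++)
        memory[cpu->pending[i].addr] = cpu->pending[i].value;
    cpu->count = 0;
}

/* Store-buffering litmus test: CPU0 runs x = 1; r0 = y and
 * CPU1 runs y = 1; r1 = x. Returns 1 if both loads observed 0. */
int store_buffering_litmus(void)
{
    enum { X = 0, Y = 1 };
    Cpu cpu0 = { .count = 0 };
    Cpu cpu1 = { .count = 0 };

    memory[X] = 0;
    memory[Y] = 0;
    cpu_store(&cpu0, X, 1);
    cpu_store(&cpu1, Y, 1);
    int r0 = cpu_load(&cpu0, Y);  /* cpu1's store to y is still buffered */
    int r1 = cpu_load(&cpu1, X);  /* cpu0's store to x is still buffered */
    cpu_flush(&cpu0);
    cpu_flush(&cpu1);
    return r0 == 0 && r1 == 0;
}
```

In this interleaving both loads return 0 even though each CPU issued its store first, which is the outcome TSO permits and sequential consistency forbids.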

cpu cacheline and prefetch policy

I read this article: http://igoro.com/archive/gallery-of-processor-cache-effects/ . The article says that, because of cache line delay, the following two loops will have almost the same execution time:

    int[] arr = new int[64 * 1024 * 1024];

    // Loop 1
    for (int i = 0; i < arr.Length; i++) arr[i] *= 3;

    // Loop 2
    for (int i = 0; i < arr.Length; i += 16) arr[i] *= 3;

I wrote some sample C code to test it. I ran the code on a Xeon(R) E3-1230 V2 with 64-bit Ubuntu, on an ARMv6-compatible processor rev 7 with Debian, and on a Core 2 T6600. None of the results match what the article said. My code is as follows: long int jobTime ...
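
For comparison, the two loops quoted from the article could be ported to C roughly as follows. This is a hedged sketch, not the asker's code (which is cut off above): the function names `scale_with_stride` and `time_pass` are invented, and `clock()` measures CPU time at a fairly coarse resolution.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Multiplies every `step`-th element by 3 and returns how many
 * elements were updated. step == 1 is Loop 1, step == 16 is Loop 2. */
size_t scale_with_stride(int *arr, size_t len, size_t step)
{
    size_t updated = 0;

    assert(step > 0);
    for (size_t i = 0; i < len; i += step) {
        arr[i] *= 3;
        updated++;
    }
    return updated;
}

/* CPU time in seconds spent in one pass over the array. */
double time_pass(int *arr, size_t len, size_t step)
{
    clock_t start = clock();

    scale_with_stride(arr, len, step);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

When benchmarking, it likely helps to write to the whole array once before timing so that first-touch page faults are not charged to the first loop, and to use an array far larger than the last-level cache, as the article's 64 * 1024 * 1024 elements are.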

Detecting Aligned Memory requirement on target CPU

I'm currently trying to build code that is supposed to work on a wide range of machines, from handheld devices and sensors to big servers in data centers. One of the (many) differences between these architectures is the requirement for aligned memory access. Aligned memory access is not required on "standard" x86 CPUs, but many other CPUs need it and raise an exception if the rule is not respected. Up to now, I've been dealing with it by forcing the compiler to be cautious on specific data accesses which are known to be risky, using the packed attribute (or pragma). And it works fine. The ...
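
Independently of the packed attribute mentioned above, a common portable technique is to go through `memcpy` for accesses that may be misaligned, and to test a pointer's alignment at runtime when a fast path is wanted. The sketch below illustrates that alternative; it is not the asker's approach, and the helper names `is_aligned`, `load_u32`, and `store_u32` are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if `p` is a multiple of `alignment`, which must be a power of two. */
int is_aligned(const void *p, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return ((uintptr_t)p & (alignment - 1)) == 0;
}

/* Reads a 32-bit value from any address, aligned or not.
 * memcpy never performs a misaligned word access on the caller's behalf. */
uint32_t load_u32(const void *p)
{
    uint32_t value;

    memcpy(&value, p, sizeof value);
    return value;
}

/* Writes a 32-bit value to any address, aligned or not. */
void store_u32(void *p, uint32_t value)
{
    memcpy(p, &value, sizeof value);
}
```

On targets that tolerate unaligned access, compilers typically turn these fixed-size `memcpy` calls into a single load or store, so the portable form usually costs nothing there.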

Why use SIMD if we have GPGPU? [closed]

Question: Now that we have GPGPUs with languages like CUDA and OpenCL, do the multimedia SIMD extensions (SSE/AVX/NEON) still serve a purpose? I read an article recently about how SSE instructions could be used to accelerate sorting networks. I thought this was pretty neat but when I ...

How to find the L1 cache line size with IO timing measurements?

Question: As a school assignment, I need to find a way to get the L1 data cache line size without reading config files or using API calls. I am supposed to use memory access read/write timings to analyze and get this info. So how might I do that? In an incomplete attempt at another part of the assignment, to find the levels and sizes of the caches, I have:

    for (i = 0; i < steps; i++) {
        arr[(i * 4) & lengthMod]++;
    }

I was thinking maybe I just need to vary line 2, the (i * 4) part? So once I exceed the cache line size, I ...
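
The idea of varying the `(i * 4)` part can be sketched by turning the multiplier into a parameter and timing the loop for each value. This is only an outline under stated assumptions: the name `time_stride` is invented, the array length is assumed to be a power of two so that `length_mod` works as a mask, and `clock()` is a coarse timer.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Runs the loop from the question with a configurable stride (in ints)
 * and returns the elapsed CPU time in seconds.
 * length_mod must be (power-of-two array length) - 1. */
double time_stride(int *arr, size_t length_mod, size_t stride, size_t steps)
{
    assert(((length_mod + 1) & length_mod) == 0);

    clock_t start = clock();
    for (size_t i = 0; i < steps; i++)
        arr[(i * stride) & length_mod]++;
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

A possible experiment is to call this with strides 1, 2, 4, 8, ... on an array much larger than the L1 cache, keeping `steps` constant. The time per access would be expected to grow while several accesses still share a line and to level off once `stride * sizeof(int)` reaches the line size, although hardware prefetching can blur the transition.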

Program Counter and Instruction Register

Question: The program counter holds the address of the instruction that should be executed next, while the instruction register holds the actual instruction to be executed. Wouldn't one of them be enough? And what is the length of each of these registers? Thanks.

Answer 1: You will always need both. The program counter (PC) holds the address of the next instruction to be executed, while the instruction register (IR) holds the encoded instruction. Upon fetching the instruction, the program counter is incremented ...
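
The division of labour between the two registers can be illustrated with a toy fetch-decode-execute loop. Everything here is hypothetical: the 16-bit instruction format, the three opcodes, and the `Machine` type are invented for the sketch and do not correspond to a real ISA.

```c
#include <assert.h>
#include <stdint.h>

enum { OP_HALT = 0, OP_LOADI = 1, OP_ADDI = 2 };  /* hypothetical opcodes */

/* Hypothetical instruction word: high byte = opcode, low byte = operand. */
#define INSTR(op, operand) ((uint16_t)(((op) << 8) | ((operand) & 0xFF)))

typedef struct {
    uint16_t pc;   /* address (index) of the next instruction to fetch */
    uint16_t ir;   /* encoded instruction currently being executed */
    int      acc;  /* accumulator */
} Machine;

/* Runs `program` until OP_HALT (or an unknown opcode) and returns the accumulator. */
int run(const uint16_t *program, uint16_t length)
{
    Machine m = { 0, 0, 0 };

    for (;;) {
        assert(m.pc < length);
        m.ir = program[m.pc];  /* fetch: IR <- memory[PC] */
        m.pc++;                /* PC now points past the instruction held in IR */
        switch (m.ir >> 8) {   /* decode from IR; PC no longer identifies it */
        case OP_LOADI: m.acc = m.ir & 0xFF;  break;
        case OP_ADDI:  m.acc += m.ir & 0xFF; break;
        default:       return m.acc;
        }
    }
}
```

Once the fetch step has advanced the PC, the only remaining copy of the instruction being executed is the one in the IR, which is why the loop needs both fields.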

Why is x86 ugly? Why is it considered inferior when compared to others? [closed]

Question: Recently I've been reading some SO archives and encountered statements against the x86 architecture. Why do we need different CPU ...