cpu-architecture | 易学教程

Is it possible for the RESOURCE_STALLS.RS event to occur even when the RS is not completely full?

The description of the RESOURCE_STALLS.RS hardware performance event for Intel Broadwell is the following: "This event counts stall cycles caused by absence of eligible entries in the reservation station (RS). This may result from RS overflow, or from RS deallocation because of the RS array Write Port allocation scheme (each RS entry has two write ports instead of four. As a result, empty entries could not be used, although RS is not really full). This counts cycles that the pipeline backend blocked uop delivery from the front end." This basically says that there are two situations where the RS

Relation between endianness and stack-growth direction

Is there a relation between the endianness of a processor and the direction of stack growth? For example, the x86 architecture is little endian and the stack grows downwards (i.e. it starts at the highest address and grows towards lower addresses with each push operation). Similarly, in the SPARC architecture, which is big endian, the stack starts at the lowest address and grows upwards towards higher addresses. This pattern is seen in almost all architectures. I believe there must be a reason for this unspoken convention. Can this be explained from a computer architecture or OS point of view? Is this

According to Intel my cache should be 24-way associative though its 12-way, how is that?

According to the “Intel 64 and IA-32 Architectures Optimization Reference Manual” (April 2012, page 2-23): “The physical addresses of data kept in the LLC data arrays are distributed among the cache slices by a hash function, such that addresses are uniformly distributed. The data array in a cache block may have 4/8/12/16 ways corresponding to 0.5M/1M/1.5M/2M block size. However, due to the address distribution among the cache blocks from the software point of view, this does not appear as a normal N-way cache.” My computer is a 2-core Sandy Bridge with a 3 MB, 12-way set associative LLC cache. That
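
As a rough illustration of the arithmetic behind the quoted passage, the sketch below computes the number of sets in a set-associative cache from its size, associativity, and line size. The 64-byte line size and the split of the 3 MB LLC into two 1.5 MB slices are assumptions made for this example, not figures taken from the excerpt, and `num_sets` is a hypothetical helper name.

```python
def num_sets(size_bytes: int, ways: int, line_bytes: int = 64) -> int:
    """Return the number of sets in a set-associative cache.

    line_bytes defaults to 64, which is an assumption for this sketch.
    """
    bytes_per_set = ways * line_bytes
    if size_bytes % bytes_per_set != 0:
        raise ValueError("cache size is not a whole number of sets")
    return size_bytes // bytes_per_set


MIB = 1024 * 1024

# The whole LLC treated as a single 3 MB, 12-way cache.
whole_llc_sets = num_sets(3 * MIB, ways=12)

# One slice, assuming the LLC is split into two 1.5 MB, 12-way slices.
slice_sets = num_sets(3 * MIB // 2, ways=12)

print("sets (whole LLC):", whole_llc_sets)
print("sets (one slice):", slice_sets)
```

Under these assumptions, each slice configuration in the quoted list (0.5M/4 ways, 1M/8 ways, 1.5M/12 ways, 2M/16 ways) works out to the same set count; only the number of ways changes.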

What is the purpose of the reserved/undefined bit in the flag register?

In the flag register of the Z80, 8080, 8085, and 8086 processors, what is the purpose of bits 1, 3, and 5, which are documented as "reserved" or "undefined"? Konamiman These bits are unused; that is, no instruction explicitly sets them to any value. The designers decided that 5 or 6 flags were enough, and they just left the remaining bits of the flags register unused. They are documented as being "undefined" because it is not possible to know in advance which value they will have after any of the instructions are executed; the processor design is simpler that way, as opposed to setting them explicitly to 0
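
A minimal sketch of how software might treat such bits: decode only the documented flags and mask out bits 1, 3, and 5 before comparing two flag bytes. The layout used here is the 8080 flag byte as commonly documented (S, Z, AC, P, CY at bits 7, 6, 4, 2, 0); the other processors named in the question differ in detail, and the names `decode_flags` and `defined_bits` are hypothetical.

```python
# Bit positions of the documented 8080 flags (bit 0 is the least significant).
FLAG_BITS = {"S": 7, "Z": 6, "AC": 4, "P": 2, "CY": 0}

# Bits 1, 3 and 5: the ones the question describes as reserved/undefined.
RESERVED_MASK = (1 << 1) | (1 << 3) | (1 << 5)


def decode_flags(flags_byte: int) -> dict:
    """Return only the documented flags as booleans."""
    return {name: bool((flags_byte >> bit) & 1) for name, bit in FLAG_BITS.items()}


def defined_bits(flags_byte: int) -> int:
    """Clear the reserved bits so two flag bytes can be compared safely."""
    return flags_byte & 0xFF & ~RESERVED_MASK


a = 0b1000_0001  # S and CY set, reserved bits clear
b = 0b1010_1011  # same documented flags, reserved bits set

print(decode_flags(a))
print(defined_bits(a) == defined_bits(b))
```

Comparing `defined_bits(a)` with `defined_bits(b)` rather than the raw bytes avoids depending on whatever values the unused bits happen to hold.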

What's the advantage of compiler instruction scheduling compared to dynamic scheduling? [closed]

Nowadays, superscalar RISC CPUs usually support out-of-order execution, with branch prediction and speculative execution. They schedule work dynamically. What's the advantage of compiler instruction scheduling, compared to an out-of-order CPU's dynamic scheduling? Does compile-time static scheduling matter at all for an out-of-order CPU, or only for simple in-order CPUs? It seems that most current software instruction scheduling work focuses on VLIW or simple CPUs. The GCC wiki's scheduling page also shows little interest in updating gcc's scheduling algorithms. Advantage of static (compiler)

GCC highest set of instructions compatible with multiple architectures

I am running jobs on a cluster composed of machines with different architectures: gcc -march=native -Q --help=target | grep -- '-march=' | cut -f3 gives me one of these: broadwell, haswell, ivybridge, sandybridge or skylake. The executable needs to be the same on all of them, so I cannot use -march=native, but at the same time the architectures have things in common (I think they all support AVX?). I am aware that gcc (unlike Intel icc) does not allow for multiple architectures in a single executable. What I would like to know is if there is a way to ask gcc for the highest set of instructions
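
One way to think about the question is to target the oldest generation present in the cluster. The sketch below does only that selection step. It assumes the five values listed above form a chain, oldest to newest, in which each newer `-march` value enables a superset of the older one's instruction sets; that assumption should be checked against the GCC documentation for the version in use. The names `GENERATIONS`, `common_march`, and `gcc_flags` are hypothetical.

```python
# Assumed ordering, oldest to newest, of the -march values from the question.
GENERATIONS = ["sandybridge", "ivybridge", "haswell", "broadwell", "skylake"]


def common_march(cluster_archs):
    """Pick the oldest generation present as the lowest common target."""
    unknown = set(cluster_archs) - set(GENERATIONS)
    if unknown:
        raise ValueError(f"unranked architectures: {sorted(unknown)}")
    return min(cluster_archs, key=GENERATIONS.index)


def gcc_flags(cluster_archs):
    """Build the -march flag for the lowest common target."""
    return ["-march=" + common_march(cluster_archs)]


cluster = ["broadwell", "haswell", "ivybridge", "sandybridge", "skylake"]
print(" ".join(gcc_flags(cluster)))
```

The list of architectures would come from running the `gcc -march=native -Q --help=target` pipeline quoted above on each node.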

Difference between Memory Mapped I/O and Programmed I/O

While studying computer architecture, I learnt three different methods of controlling I/O devices: Programmed I/O, Interrupt I/O, and DMA. But I came across another term, Memory Mapped I/O. Is there any relation between Programmed I/O and Memory Mapped I/O? I am confused by these two. Are they similar? Those terms are mostly independent and not mutually exclusive. Below I'll use pseudo-assembly code to make the examples clearer; it is demonstrative code, not real code. How do I access a device? If the device is accessible in a dedicated address space,

Using System.getProperty("os.arch") to check if it is armeabi cpu

I'm having the following issue with RenderScript on some old 4.2.2- devices (galaxy s3 mini, galaxy ace 3, galaxy fresh, etc.): Android - Renderscript Support Library - Error loading RS jni library. I want to implement the suggested solution, but what exactly will be the value returned by System.getProperty("os.arch"); for armeabi devices (not armeabi-v7 devices)? Thanks. The method System.getProperty is a generic Java method, described in the Java documentation. On Linux it returns the same value obtained from the command uname -m. The possible values are for example armv5t, armv5te,
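
A minimal sketch of the kind of check the question is after, classifying a `uname -m` style machine string. The prefix-to-ABI mapping (armv5/armv6 treated as armeabi, armv7 as armeabi-v7a) is an assumption made for this example and is not taken from the excerpt, and `guess_abi` is a hypothetical helper. In Python, `platform.machine()` reports the machine string of the host.

```python
import platform


def guess_abi(machine: str) -> str:
    """Map a machine string such as 'armv5te' to a guessed Android ABI name.

    The mapping is an assumption for illustration only.
    """
    m = machine.lower()
    if m.startswith("armv7"):
        return "armeabi-v7a"
    if m.startswith(("armv5", "armv6")):
        return "armeabi"
    return "unknown"


host = platform.machine()
print(host, "->", guess_abi(host))
```

The same prefix test could be applied to the string returned by System.getProperty("os.arch") on the device.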

Adding my own library to Contiki OS

I want to add some third-party libraries to Contiki, but at the moment I can't. So I wanted to test with a simple library first. I wrote two files, hello.c and hello.h. In hello.c I have: printf(" Hello everbody, library call\n"); In hello.h I have: extern void print_hello(); I created hello.o using the command: msp430-gcc -mmcu=msp430f1611 hello.c -o hello.o I created an archive file: ar -cvq libhello.a hello.o I then moved to Contiki and wrote a simple program that includes hello.h to call the function. I tried to include hello.a using the PROJECT LIBRARIES variable in the makefile; when I compile I get this:

Detecting architecture at compile time from MASM/MASM64

How can I detect at compile time from an ASM source file if the target architecture is I386 or AMD64? I am using masm(ml.exe)/masm64(ml64.exe) to assemble file32.asm and file64.asm. It would be nice to create a single file, file.asm, which should include either file32.asm, or file64.asm, depending on the architecture. Ideally, I would like to be able to write something like: IFDEF amd64 include file64.asm ELSE include file32.asm ENDIF Also, if needed, I can run ml.exe and ml64.exe with different command line options. Thanks! If I understand you correctly, you're looking for some sort of built
