cpu-architecture

Small RISC emulator

穿精又带淫゛_ posted on 2019-12-05 04:47:17
Question: I'm looking to build a VM into a game and was wondering if anyone knew of any really simple VMs (I was thinking RISC/PIC was close to what I wanted) that are usually used for embedded projects such as controlling robots, motors, sensors, etc. My main concern is having to write a compiler/assembler if I roll my own. It'd be nice to use the tools that are already out there or, in its simplest form, just a C compiler that can compile for it :-p. I really don't want to re-invent the wheel here but

How do I find my CPU topology?

流过昼夜 posted on 2019-12-05 03:45:23
I am using an Intel(R) Core(TM) i5 CPU M 560 @ 2.67GHz, as I found out from cat /proc/cpuinfo. But I want to know the exact hierarchy: how many sockets there are, how many cores there are per socket, and how many threads too, if supported. Any idea?
You can use the lscpu command; it reports processor-related information.
dmidecode -t processor
Sebastian Kuzminsky
lstopo from the hwloc package reports the info you want:
    Socket L#0 + L3 L#0 (6144KB)
      L2 L#0 (256KB) + L1 L#0 (32KB) + Core L#0
        PU L#0 (P#0)
        PU L#1 (P#1)
      L2 L#1 (256KB) + L1 L#1 (32KB) + Core L#1
        PU L#2 (P#2)
        PU L#3 (P#3)
      L2 L#2 (256KB)
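As a rough illustration of the hierarchy the question asks about, the sketch below counts sockets, cores and hardware threads from /proc/cpuinfo-style text. It assumes the x86 Linux field names "processor", "physical id" and "core id"; the function name summarize_topology and the SAMPLE text are made up for this example and are not output from the asker's machine.

```python
from collections import defaultdict


def summarize_topology(cpuinfo_text):
    """Count sockets, cores and hardware threads in /proc/cpuinfo-style text."""
    cores_per_socket = defaultdict(set)  # physical id -> set of core ids
    threads = 0
    # Each logical CPU is described by one blank-line-separated block.
    for block in cpuinfo_text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            if ":" in line:
                key, _, value = line.partition(":")
                fields[key.strip()] = value.strip()
        if "processor" not in fields:
            continue
        threads += 1
        # Assumption: fall back to a single socket / one core per logical CPU
        # when the kernel does not report these fields.
        socket = fields.get("physical id", "0")
        core = fields.get("core id", fields["processor"])
        cores_per_socket[socket].add(core)
    return {
        "sockets": len(cores_per_socket),
        "cores": sum(len(c) for c in cores_per_socket.values()),
        "threads": threads,
    }


# Hypothetical sample: one socket, two cores, two hardware threads per core.
SAMPLE = """\
processor\t: 0
physical id\t: 0
core id\t: 0

processor\t: 1
physical id\t: 0
core id\t: 0

processor\t: 2
physical id\t: 0
core id\t: 2

processor\t: 3
physical id\t: 0
core id\t: 2
"""

print(summarize_topology(SAMPLE))
```

On a real Linux x86 machine the same function could be fed the contents of /proc/cpuinfo, for example summarize_topology(open("/proc/cpuinfo").read()). Tools such as lscpu and lstopo remain the more reliable source, since they also report caches and NUMA nodes.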

Does a branch misprediction flush the entire pipeline, even for very short if-statement body?

回眸只為那壹抹淺笑 posted on 2019-12-05 02:43:11
Everything I've read seems to indicate that a branch misprediction always results in the entire pipeline being flushed, which means a lot of wasted cycles. I never hear anyone mention any exceptions for short if-conditions. This seems like it would be really wasteful in some cases. For example, suppose you have a lone if-statement with a very simple body that is compiled down to 1 CPU instruction. The if-clause would be compiled into a conditional jump forward by one instruction. If the CPU predicts the branch to not be taken, then it will begin executing the if-body instruction, and can

Why do we need to compile for different platforms (e.g. Windows/Linux)?

我的未来我决定 posted on 2019-12-05 01:13:16
Question: I've learned the basics about CPUs/ASM/C and don't understand why we need to compile C code differently for different OS targets. What the compiler does is create assembly code that then gets assembled to binary machine code. The ASM code of course is different per CPU architecture (e.g. ARM) as the instruction set architecture is different. But as Linux and Windows run on the same CPU, the machine operations like MOVE/ADD/... should be identical. While I do know that there are OS-specific

Can branch prediction cause illegal instruction?

不问归期 posted on 2019-12-05 01:08:12
In the following pseudo-code:
    if (rdtscp supported by hardware) {
        Invoke "rdtscp" instruction
    } else {
        Invoke "rdtsc" instruction
    }
Let's say the CPU does not support the rdtscp instruction, so we fall back to the else branch. If the CPU mispredicts the branch, is it possible for the instruction pipeline to try to execute rdtscp and throw an Illegal Instruction error?
It is explicitly documented for the #UD trap (Invalid Opcode Exception) in the Intel Processor Manuals, Volume 3A, chapter 6.15: In Intel 64 and IA-32 processors that implement out-of-order execution microarchitectures, this

How is an LRU cache implemented in a CPU?

不羁的心 posted on 2019-12-04 22:59:29
Question: I'm studying up for an interview and want to refresh my memory on caching. If a CPU has a cache with an LRU replacement policy, how is that actually implemented on the chip? Would each cache line store a timestamp tick? Also, what happens in a dual-core system where both CPUs write to the same address simultaneously?
Answer 1: For a traditional cache with only two ways, a single bit per set can be used to track LRU. On any access to a set that hits, the bit can be set to point at the way that did not hit, marking that way as the least recently used.
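The one-bit scheme from the answer can be sketched as a toy software model. This is only an illustration, not how any particular chip is built: the class name TwoWayLRUCache, the 64-byte line size and the address split into index and tag are assumptions made for the example.

```python
class TwoWayLRUCache:
    """Toy 2-way set-associative cache that tracks LRU with one bit per set."""

    def __init__(self, num_sets, line_size=64):
        self.num_sets = num_sets
        self.line_size = line_size  # assumed line size in bytes
        self.tags = [[None, None] for _ in range(num_sets)]
        # The single LRU bit per set: index of the way to evict next.
        self.lru_way = [0] * num_sets

    def access(self, address):
        """Return True on a hit, False on a miss (the line is then filled)."""
        line = address // self.line_size
        index = line % self.num_sets
        tag = line // self.num_sets
        ways = self.tags[index]
        if tag in ways:
            way = ways.index(tag)
            hit = True
        else:
            way = self.lru_way[index]  # evict the least recently used way
            ways[way] = tag
            hit = False
        # The way just touched is most recent, so the other way becomes LRU.
        self.lru_way[index] = 1 - way
        return hit


# One set, so every line competes for the same two ways.
cache = TwoWayLRUCache(num_sets=1)
results = [cache.access(a) for a in (0, 64, 0, 128, 0, 64)]
print(results)  # [False, False, True, False, True, False]
```

In the trace, the access to address 128 evicts the line at 64 rather than the line at 0, because the preceding hit on 0 flipped the bit to point at the other way. No timestamp is stored; with more than two ways a single bit is no longer enough, and this model does not attempt to cover that case.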

.csproj's platform specific ItemGroup works for assembly references but not content includes?

北城以北 posted on 2019-12-04 22:41:34
Since we have three assemblies that come in explicit x86 and x64 versions, I've edited the corresponding .csproj file(s) to use, for example, a block like this:
    <ItemGroup Condition=" '$(Platform)' == 'x86' ">
      <Reference Include="CaliberRMSDK">
        <HintPath>..\Libraries\CaliberRMSDK_IKVM\32bit\CaliberRMSDK.dll</HintPath>
      </Reference>
      <Content Include="..\Libraries\CaliberRMSDK_IKVM\32bit\ikvm-native.dll">
        <Link>ikvm-native.dll</Link>
        <CopyToOutputDirectory>Always</CopyToOutputDirectory>
      </Content>
      <Content Include="..\Libraries\CaliberRMSDK_IKVM\32bit\JVM.dll">
        <Link>JVM.dll</Link>

Is prefetching triggered by the stream of exact addresses or by the stream of cache lines?

那年仲夏 posted on 2019-12-04 19:45:22
Question: On modern x86 CPUs, hardware prefetching is an important technique to bring cache lines into various levels of the cache hierarchy before they are explicitly requested by the user code. The basic idea is that when the processor detects a series of accesses to sequential or strided-sequential locations, it will go ahead and fetch further memory locations in the sequence, even before executing the instructions that (may) actually access those locations. My question is if the detection of a
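To make the distinction in the question concrete, the sketch below shows how one access pattern can look irregular as a stream of exact byte addresses yet perfectly sequential as a stream of cache lines. It says nothing about what real prefetchers observe; the 64-byte line size, the helper names and the sample addresses are assumptions chosen for illustration.

```python
LINE_SIZE = 64  # assumed cache line size in bytes


def cache_line_stream(addresses, line_size=LINE_SIZE):
    """Map byte addresses to line numbers, dropping repeats of the same line."""
    lines = []
    for addr in addresses:
        line = addr // line_size
        if not lines or lines[-1] != line:
            lines.append(line)
    return lines


def strides(stream):
    """Differences between consecutive elements of a stream."""
    return [b - a for a, b in zip(stream, stream[1:])]


# Hypothetical access pattern with no constant stride in bytes.
addresses = [0, 40, 72, 100, 130, 200]

print(strides(addresses))                     # [40, 32, 28, 30, 70]
print(cache_line_stream(addresses))           # [0, 1, 2, 3]
print(strides(cache_line_stream(addresses)))  # [1, 1, 1]
```

A detector keyed on exact addresses would see no fixed stride here, while one keyed on cache lines would see a plain ascending sequence, which is the difference the question is probing.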

Why INC and ADD 1 have different performances? [duplicate]

假装没事ソ posted on 2019-12-04 18:00:11
Question: This question already has answers here: INC instruction vs ADD 1: Does it matter?
I've read many times over the years that you should do XOR ax, ax because it is faster, or that when programming in C you should use counter++ or counter+=1 because they would compile to INC or ADD, or that on the NetBurst Pentium 4 INC was slower than ADD 1, so the compiler had to be told that your target was NetBurst so that it would translate every var++ to ADD 1. My question is: Why INC and ADD

Convert object file to another architecture

巧了我就是萌 posted on 2019-12-04 16:12:14
I am trying to use a Wi-Fi dongle with a Raspberry Pi. The vendor of the dongle provides a Linux driver that I can compile successfully on the ARM architecture. However, one object file that comes with the driver was precompiled for an x86 architecture, which causes the linker to fail. I know it would be much easier to compile that (quite big) file again, but I don't have access to the source code. Is it possible to convert that object file from an x86 architecture to an ARM architecture? Thank you!
Um, no, it looks to me like a waste of time. A Wi-Fi driver is complex, and you say this one