cpu-architecture

missing required architecture x86_64

Deadly submitted on 2019-12-04 06:26:33
I have an old project that I recompiled for an update, and it is now showing this error message: …. missing required architecture x86_64 in file myLibrary.a …. I have tried various tricks that I could find on the net after searching for "missing required architecture x86_64 in file", but with no success. Does anyone know how to properly handle this issue? I am using Xcode Version 7.0.1. Running lipo -info myLibrary.a shows: Architectures in the fat file: myLibrary.a are: armv7 arm64. I have been able to add armv7s, but not x86_64. You are trying to build a universal library and it does not have all
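A sketch of the usual fix, assuming you control the library's sources (the target and file names below are hypothetical): build a simulator slice and merge it with the device slices using lipo, the same tool quoted above.

    # build an x86_64 (simulator) slice of the library
    xcodebuild -target myLibrary -sdk iphonesimulator -arch x86_64
    # merge the device and simulator builds into one fat archive
    lipo -create libDevice.a libSimulator.a -output myLibrary.a
    # verify: x86_64 should now be listed
    lipo -info myLibrary.a

If you don't control the sources, you need an updated binary from the vendor; lipo cannot invent a slice that was never compiled.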

How exactly to count the hit rate of a direct mapped cache?

≡放荡痞女 submitted on 2019-12-04 05:51:38
Question: We are given a direct-mapped cache with 8 frames. The following access sequence of main-memory blocks has been observed: 2 5 0 13 2 5 10 8 0 4 5 2. Compute the hit rate of this cache organization. Solution: I understand how and why the numbers are placed in the table like that, but I don't understand why 2 and 5 have been bold-printed and why we get a hit rate of 17%. This has been solved by our professor, but I don't understand it completely. Answer 1: As was mentioned by @Margaret Bloom in the
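Here is a small C sketch that replays the access sequence with the direct-mapped placement rule frame = block mod 8; it reproduces the professor's result of 2 hits out of 12, i.e. about 17%:

    #include <stdio.h>

    int main(void) {
        int frames[8];                                /* direct-mapped: one block per frame */
        for (int i = 0; i < 8; i++) frames[i] = -1;   /* -1 = frame is empty */

        int seq[] = {2, 5, 0, 13, 2, 5, 10, 8, 0, 4, 5, 2};
        int n = sizeof seq / sizeof seq[0], hits = 0;

        for (int i = 0; i < n; i++) {
            int f = seq[i] % 8;              /* block number mod number of frames */
            if (frames[f] == seq[i])
                hits++;                      /* block is still resident: a hit */
            else
                frames[f] = seq[i];          /* miss: the new block evicts the old one */
        }
        printf("hit rate: %d/%d = %.0f%%\n", hits, n, 100.0 * hits / n);   /* 2/12 = 17% */
        return 0;
    }

The two hits (the bold-printed accesses) are the fifth access (block 2, still resident) and the eleventh (block 5, reloaded after 13 evicted it); every other access either touches an empty frame or a frame whose block was evicted by a conflicting one (13 and 5 share frame 5, 10 and 2 share frame 2, 8 and 0 share frame 0).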

How can x86 bsr/bsf have fixed latency, not data dependent? Doesn't it loop over bits like the pseudocode shows?

試著忘記壹切 submitted on 2019-12-04 04:38:47
I am on the hook to analyze some "timing channels" of some x86 binary code. I am posting one question to comprehend the bsf/bsr opcodes. At a high level, these two opcodes can be modeled as a "loop" that counts the leading and trailing zeros of a given operand. The x86 manual has a good formalization of these opcodes, something like the following (this is BSR, which scans down from the top bit; BSF scans up from bit 0):

    IF SRC = 0
      THEN ZF ← 1; DEST is undefined;
      ELSE
        ZF ← 0;
        temp ← OperandSize – 1;
        WHILE Bit(SRC, temp) = 0
          DO temp ← temp – 1;
        OD;
        DEST ← temp;
    FI;

But to my surprise, bsf/bsr instructions seem to have a fixed latency in CPU cycles. According to some documents I
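The short answer is that the pseudocode specifies only the result, not the circuit: real hardware finds the first set bit with a priority encoder in constant time, which is why documented latencies are fixed (e.g. 3 cycles on many Intel cores) rather than data-dependent. A C sketch contrasting the manual's loop model with what a compiler actually emits (the builtin is GCC/Clang-specific and compiles to a single instruction):

    #include <stdint.h>

    /* The manual's loop model of BSF: iteration count depends on the operand. */
    int bsf_loop_model(uint64_t src) {
        int temp = 0;                      /* BSF scans up from bit 0 */
        while (((src >> temp) & 1) == 0)   /* undefined for src == 0, as in the manual */
            temp++;
        return temp;
    }

    /* What compilers emit: one instruction with fixed latency regardless of src,
       because the hardware uses a priority encoder, not a bit-at-a-time loop. */
    int bsf_hardware(uint64_t src) {
        return __builtin_ctzll(src);       /* compiles to bsf/tzcnt on x86-64 */
    }

So for a timing-channel analysis the loop model is misleading: the instruction's latency reveals nothing about where the first set bit is.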

Cache specifications for intel core i7

非 Y 不嫁゛ submitted on 2019-12-04 03:51:05
I am building a cache simulator for an Intel Core i7 but have a hard time finding the detailed specifications for the L1, L2 and L3 (shared) caches. I need the cache block size, cache size, associativity and so on... Can anyone point me in the right direction? Intel's Optimization guide describes most of the required specifications per architectural generation (you didn't specify which i7 you have; there are now several generations, from Nehalem up to Haswell). Haswell, for example, would have the parameters sketched below. Note that if you're building a simulator, you'll want to model as many of these features as possible
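As an illustration of the kind of parameters a simulator needs, here are rough Haswell numbers expressed as a C structure; treat them as assumptions to verify against Intel's optimization manual for your exact model:

    /* Approximate Haswell cache parameters -- verify against Intel's manuals. */
    struct cache_level {
        const char *name;
        unsigned size_bytes;
        unsigned line_bytes;     /* cache block size */
        unsigned associativity;  /* ways per set */
    };

    static const struct cache_level haswell[] = {
        { "L1D", 32 * 1024,       64,  8 },  /* per core */
        { "L2",  256 * 1024,      64,  8 },  /* per core, unified */
        { "L3",  8 * 1024 * 1024, 64, 16 },  /* shared; total size scales with core count */
    };

    /* sets per level = size / (line_bytes * associativity);
       e.g. L1D: 32768 / (64 * 8) = 64 sets */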

How does Spectre attack read the cache it tricked CPU to load?

戏子无情 submitted on 2019-12-04 03:23:46
I understand the part of the paper where they trick the CPU into speculatively loading part of the victim's memory into the CPU cache. The part I do not understand is how they retrieve it from the cache. They don't retrieve it directly (out-of-bounds read bytes are not "retired" by the CPU and cannot be seen by the attacker in the attack). One attack vector is to do the "retrieval" a bit at a time. After the CPU cache has been prepared (flushing the cache where it has to be), and the CPU has been "taught" that an if branch is taken while its condition relies on non-cached data, the CPU speculatively executes
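The standard "retrieval" gadget is Flush+Reload over a probe array: the speculative code is made to execute something like tmp = probe[secret_byte * 4096], which caches exactly one line of the attacker's own array; the attacker then times a load of each candidate line, and the fast one reveals the byte. A sketch of that receive side, assuming GCC/Clang x86 intrinsics (the speculative trigger itself is omitted):

    #include <stdint.h>
    #include <x86intrin.h>             /* _mm_clflush, __rdtscp (GCC/Clang, x86) */

    #define STRIDE 4096                /* one page per value, to defeat the prefetcher */
    static uint8_t probe[256 * STRIDE];

    void flush_probe(void) {           /* step 1: evict every candidate line */
        for (int i = 0; i < 256; i++)
            _mm_clflush(&probe[i * STRIDE]);
    }

    /* step 3, after the mistrained branch speculatively ran
       something like: tmp = probe[secret_byte * STRIDE];
       time each line -- the one the speculation cached loads fast. */
    int recover_byte(void) {
        int best = -1;
        uint64_t best_time = UINT64_MAX;
        for (int guess = 0; guess < 256; guess++) {
            unsigned aux;
            uint64_t t0 = __rdtscp(&aux);
            (void)*(volatile uint8_t *)&probe[guess * STRIDE];
            uint64_t dt = __rdtscp(&aux) - t0;
            if (dt < best_time) { best_time = dt; best = guess; }
        }
        return best;                   /* the cached index is the leaked byte */
    }

Real attacks repeat this many times and randomize the guess order to average out noise, but the principle is just cache-timing as a side channel.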

Why doesn't my processor have built-in BigInt support?

一个人想着一个人 submitted on 2019-12-04 03:09:37
As far as I understand it, BigInts are usually implemented in most programming languages as arrays containing digits, where, e.g., when adding two of them, each digit is added one after another like we know it from school, e.g.:

     246
     816
     * *
    ----
    1062

where * marks that there was an overflow (a carry into the next column). I learned it this way at school, and all BigInt adding functions I've implemented work similarly to the example above. So we all know that our processors can only natively manage ints from 0 to 2^32 − 1 / 2^64 − 1. That means that most scripting languages, in order to be high-level and offer arithmetic with big
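For concreteness, a C sketch of exactly that schoolbook scheme over little-endian decimal digit arrays (real bignum libraries use full 32/64-bit "limbs" rather than base-10 digits, but the carry logic is identical):

    #include <stdio.h>

    /* out = a + b over little-endian base-10 digit arrays; returns digit count. */
    int bigint_add(const int *a, int na, const int *b, int nb, int *out) {
        int carry = 0, n = 0;
        for (int i = 0; i < na || i < nb || carry; i++) {
            int s = carry + (i < na ? a[i] : 0) + (i < nb ? b[i] : 0);
            out[n++] = s % 10;           /* the digit written in this column */
            carry    = s / 10;           /* the '*' in the example above */
        }
        return n;
    }

    int main(void) {
        int a[] = {6, 4, 2}, b[] = {6, 1, 8}, out[8];   /* 246 and 816, least digit first */
        int n = bigint_add(a, 3, b, 3, out);
        for (int i = n - 1; i >= 0; i--) printf("%d", out[i]);   /* prints 1062 */
        printf("\n");
        return 0;
    }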

What does “store-buffer forwarding” mean in the Intel developer's manual?

痞子三分冷 submitted on 2019-12-04 02:13:22
The Intel 64 and IA-32 Architectures Software Developer's Manual says the following about re-ordering of actions by a single processor (Section 8.2.2, "Memory Ordering in P6 and More Recent Processor Families"): "Reads may be reordered with older writes to different locations but not with older writes to the same location." Then, below, when discussing points where this model is relaxed compared to earlier processors, it says: "Store-buffer forwarding, when a read passes a write to the same memory location." As far as I can tell, "store-buffer forwarding" isn't precisely defined anywhere (and neither is
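The classic illustration is the manual's intra-processor forwarding litmus test, transcribed here into C11 atomics as a sketch (x and y start at 0, and each function runs on its own processor):

    #include <stdatomic.h>

    atomic_int x, y;
    int r1, r2, r3, r4;

    void cpu0(void) {
        atomic_store_explicit(&x, 1, memory_order_relaxed);
        r1 = atomic_load_explicit(&x, memory_order_relaxed);  /* may be served from cpu0's store buffer */
        r2 = atomic_load_explicit(&y, memory_order_relaxed);
    }

    void cpu1(void) {
        atomic_store_explicit(&y, 1, memory_order_relaxed);
        r3 = atomic_load_explicit(&y, memory_order_relaxed);  /* may be served from cpu1's store buffer */
        r4 = atomic_load_explicit(&x, memory_order_relaxed);
    }

The outcome r1 == 1, r3 == 1, r2 == 0, r4 == 0 is allowed: each processor's load of its own store is satisfied from its store buffer ("store-buffer forwarding") before that store becomes globally visible, so the read effectively passes the write to the same location in the global order.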

Small RISC emulator

为君一笑 submitted on 2019-12-03 20:26:29
I'm looking to build a VM into a game and was wondering if anyone knew of any really simple VMs (I was thinking RISC/PIC was close to what I wanted) that are usually used for embedded projects such as controlling robots, motors, sensors, etc. My main concern is having to write a compiler/assembler if I roll my own. It'd be nice to use the tools that are already out there, or in its simplest form just a C compiler that can compile for it :-p. I really don't want to re-invent the wheel here, but I also need thousands of these running around a virtual world, so they have to be as simple and as fast
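If you do end up rolling your own, the interpreter core is small; here is a minimal register-machine sketch in C (the opcodes and the 4-byte [op, dst, src1, src2] encoding are invented for illustration):

    #include <stdint.h>
    #include <stdio.h>

    enum { OP_HALT, OP_LOADI, OP_ADD, OP_JNZ };

    void run(const uint8_t *code, int32_t *reg) {
        for (int pc = 0; ; pc += 4) {
            uint8_t op = code[pc], a = code[pc+1], b = code[pc+2], c = code[pc+3];
            switch (op) {
            case OP_HALT:  return;
            case OP_LOADI: reg[a] = (int8_t)b;          break;  /* small signed immediate */
            case OP_ADD:   reg[a] = reg[b] + reg[c];    break;
            case OP_JNZ:   if (reg[a]) pc = b*4 - 4;    break;  /* jump to instruction b */
            }
        }
    }

    int main(void) {
        int32_t reg[8] = {0};
        const uint8_t prog[] = {
            OP_LOADI, 0, 10, 0,    /* r0 = 10 (counter)       */
            OP_LOADI, 1, 255, 0,   /* r1 = -1 (255 as int8_t) */
            OP_LOADI, 2, 0, 0,     /* r2 = 0  (accumulator)   */
            OP_ADD,   2, 2, 0,     /* r2 += r0                */
            OP_ADD,   0, 0, 1,     /* r0 -= 1                 */
            OP_JNZ,   0, 3, 0,     /* while (r0) goto instr 3 */
            OP_HALT,  0, 0, 0,
        };
        run(prog, reg);
        printf("%d\n", reg[2]);    /* 10+9+...+1 = 55 */
        return 0;
    }

The switch-dispatch loop is slow per instruction but has essentially no footprint, which matters when thousands of instances share one host process; a real design would also bounds-check code and register indices before trusting guest programs.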

Porting 32 bit C++ code to 64 bit - is it worth it? Why?

两盒软妹~` submitted on 2019-12-03 18:32:15
Question: I am aware of some of the obvious gains of the x64 architecture (a larger addressable RAM space, etc.)... but: What if my program has no real need to run in native 64-bit mode? Should I port it anyway? Are there any foreseeable deadlines for ending 32-bit support? Would my application run faster / better / more securely as native x64 code? Answer 1: x86-64 is a bit of a special case - for many architectures (e.g. SPARC), compiling an application for 64-bit mode doesn't give it any benefit unless it can
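One concrete, checkable piece of the answer: the reason an x86-64 port often does run faster is the extra architectural registers and the register-based calling convention, while the main porting cost is code that assumes type sizes. A small C sketch of the assumptions that change under LP64 (Linux/macOS; 64-bit Windows keeps long at 4 bytes):

    #include <stdio.h>
    #include <stdint.h>

    int main(void) {
        printf("sizeof(int)   = %zu\n", sizeof(int));     /* 4 on both ILP32 and LP64 */
        printf("sizeof(long)  = %zu\n", sizeof(long));    /* 4 -> 8 under LP64 */
        printf("sizeof(void*) = %zu\n", sizeof(void *));  /* 4 -> 8 */

        void *p = &p;
        /* int i = (int)p;         truncates the pointer on 64-bit: a classic porting bug */
        intptr_t ip = (intptr_t)p; /* the portable way to carry a pointer in an integer */
        printf("%p\n", (void *)ip);
        return 0;
    }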

Critical sections with multicore processors

这一生的挚爱 submitted on 2019-12-03 18:30:28
Question: With a single-core processor, where all your threads run on the one single CPU, the idea of implementing a critical section using an atomic test-and-set operation on some mutex (or semaphore, etc.) in memory seems straightforward enough: because your processor is executing a test-and-set from one spot in your program, it necessarily can't be doing one from another spot in your program disguised as some other thread. But what happens when you actually have more than one physical
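For the single-core intuition to survive on multiple cores, the test-and-set itself must be a single atomic read-modify-write that the hardware arbitrates between cores. A C11 sketch of such a test-and-set spinlock:

    #include <stdatomic.h>

    static atomic_flag lock = ATOMIC_FLAG_INIT;

    /* atomic_flag_test_and_set is an indivisible read-modify-write: on x86 it
       compiles to a lock-prefixed instruction (e.g. xchg), so even with several
       physical cores only one can observe the flag as clear and set it. */
    void enter_critical(void) {
        while (atomic_flag_test_and_set_explicit(&lock, memory_order_acquire))
            ;   /* spin: another core holds the lock */
    }

    void leave_critical(void) {
        atomic_flag_clear_explicit(&lock, memory_order_release);
    }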