instructions

Why can't a load bypass a value written by another thread on the same core from a write buffer?

不羁岁月 submitted on 2019-12-24 10:06:43
Question: If a CPU core uses a write buffer, then a load can bypass the most recent store to the referenced location from the write buffer, without waiting for that store to appear in the cache. But, as written in A Primer on Memory Consistency and Coherence, if the CPU honors the TSO memory model, then ... multithreading introduces a subtle write buffer issue for TSO. TSO write buffers are logically private to each thread context (virtual core). Thus, on a multithreaded core, one thread context should

How does the instruction decoder differentiate between EVEX prefix and BOUND opcode in 32-bit mode?

Deadly submitted on 2019-12-23 07:38:18
Question: In 32-bit mode, Intel solves the VEX prefix vs. LDS/LES conflict by inverting the high bits of the register extension, because the mod field of the ModRM byte can't be 11b: The VEX prefix's initial-byte values, C4h and C5h, are the same as the opcodes of the LDS and LES instructions. These instructions are not supported in 64-bit mode. To resolve the ambiguity while in 32-bit mode, VEX's specification exploits the fact that a legal LDS or LES's ModRM byte can not be of the form 11xxxxxx (which would
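The disambiguation rule described in this excerpt can be sketched in C. This is a minimal illustration, not a real decoder: the names encoding_t and classify_32bit are hypothetical, the function assumes it is handed the opcode position (any legacy prefixes already consumed), and it assumes 32-bit mode. It relies on the fact that LES (C4h), LDS (C5h) and BOUND (62h) all require a memory operand, so the byte after them can never have both high bits set, while VEX and EVEX prefixes in 32-bit mode always do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical classification used only in this sketch. */
typedef enum { ENC_LEGACY, ENC_VEX, ENC_EVEX } encoding_t;

/* True when the two high bits of the byte are both set (ModRM.mod == 11b). */
static bool top_two_bits_set(uint8_t b)
{
    return (b & 0xC0u) == 0xC0u;
}

/*
 * Classify the bytes at an opcode position in 32-bit mode.
 * Assumption: legacy prefixes, if any, were already skipped by the caller.
 */
encoding_t classify_32bit(const uint8_t *code, size_t len)
{
    assert(code != NULL && len >= 2);

    uint8_t first = code[0];
    uint8_t second = code[1];

    /* C4h/C5h: LES/LDS need a memory operand, so mod == 11b means VEX. */
    if ((first == 0xC4 || first == 0xC5) && top_two_bits_set(second))
        return ENC_VEX;

    /* 62h: BOUND needs a memory operand, so mod == 11b means EVEX. */
    if (first == 0x62 && top_two_bits_set(second))
        return ENC_EVEX;

    /* Otherwise the byte is an ordinary opcode (LES, LDS, BOUND, or other). */
    return ENC_LEGACY;
}
```

For example, the byte pair C4 E2 classifies as ENC_VEX, while C4 00 (a valid LES with a memory operand) classifies as ENC_LEGACY.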

difference between conditional instructions (cmov) and jump instructions [duplicate]

耗尽温柔 submitted on 2019-12-21 03:25:12
Question: This question already has answers here: Why is a conditional move not vulnerable for Branch Prediction Failure? (5 answers). Closed 3 years ago. I'm confused about when to use cmov instructions and when to use jump instructions in assembly. From a performance point of view: what is the difference between them? Which one is better? If possible, please explain their difference with an example. Answer 1: cmovcc is a so-called predicated instruction. That's fancy-speak for "this instruction executes
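Since the question asks for an example, here is a hedged C sketch of the two code shapes. The function names are hypothetical, and which machine instructions a compiler actually emits (a conditional jump or cmovcc) depends on the compiler, target, and optimization flags; an optimizer may well compile the first two functions identically. The third function is branch-free at the source level, which makes the trade-off visible: a branch is predicted and costs a pipeline flush when mispredicted, whereas a conditional move is an ordinary data dependency that always waits for both inputs.

```c
#include <stdint.h>

/* Branch form: a compiler may emit a compare plus a conditional jump. */
int32_t max_branch(int32_t a, int32_t b)
{
    if (a < b)
        return b;
    return a;
}

/* Select form: a compiler may emit a compare plus cmovcc. */
int32_t max_select(int32_t a, int32_t b)
{
    return (a < b) ? b : a;
}

/*
 * Explicitly branch-free form: both inputs are always read and the
 * result is chosen with a bit mask, mirroring what cmovcc does in
 * hardware. The final cast back to int32_t assumes the usual
 * two's-complement conversion found on mainstream compilers.
 */
int32_t max_mask(int32_t a, int32_t b)
{
    uint32_t mask = (uint32_t)0 - (uint32_t)(a < b); /* all ones if a < b, else 0 */
    uint32_t ua = (uint32_t)a;
    uint32_t ub = (uint32_t)b;
    return (int32_t)((ua & ~mask) | (ub & mask));
}
```

As a rule of thumb, the branch form tends to win when the condition is predictable, and the branch-free form when it is close to random.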

JVM instruction ALOAD_0 in the 'main' method points to 'args' instead of 'this'?

微笑、不失礼 submitted on 2019-12-20 17:25:46
Question: I am trying to implement a subset of Java for an academic study. Well, I'm in the last stages (code generation) and I wrote a rather simple program to see how method arguments are handled: class Main { public static void main(String[] args) { System.out.println(args.length); } } Then I built it, and ran 'Main.class' through an online disassembler I found at: http://www.cs.cornell.edu/People/egs/kimera/disassembler.html I get the following implementation for the 'main' method: (the

Does ret instruction cause esp register added by 4?

随声附和 submitted on 2019-12-20 10:26:07
Question: Does the "ret" instruction cause the "esp" register to be increased by 4? Answer 1: Yes, it performs pop eip. You can use mov eax, [esp] followed by jmp eax to avoid it. EDIT: It's exactly what ret does. For example, jmp rel_offset is nothing more than a hidden add eip, offset, and jmp absolute_offset is mov eip, absolute_offset. Sure, there are differences in the way the processor treats them, but from the programmer's point of view that's all that happens. Also, there is a special form of ret: ret imm16, which also adds this imm16 value to
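The effect on esp described in this answer can be modelled with a small C sketch. Everything here is hypothetical and for illustration only: toy_cpu is a made-up 32-bit machine state with a tiny flat memory, not an emulator of real hardware, and it covers only the near forms of ret in 32-bit mode.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MEM_SIZE 64u

/* Hypothetical toy machine state: instruction pointer, stack pointer, memory. */
typedef struct {
    uint32_t eip;
    uint32_t esp;
    uint8_t  mem[MEM_SIZE];
} toy_cpu;

/* Read a 32-bit value from toy memory (host byte order, for simplicity). */
static uint32_t load32(const toy_cpu *cpu, uint32_t addr)
{
    uint32_t v;
    assert(addr <= MEM_SIZE - sizeof v);
    memcpy(&v, &cpu->mem[addr], sizeof v);
    return v;
}

/* Write a 32-bit value to toy memory (host byte order, for simplicity). */
void toy_store32(toy_cpu *cpu, uint32_t addr, uint32_t v)
{
    assert(addr <= MEM_SIZE - sizeof v);
    memcpy(&cpu->mem[addr], &v, sizeof v);
}

/* ret: pop the return address into eip, so esp grows by 4. */
void toy_ret(toy_cpu *cpu)
{
    cpu->eip = load32(cpu, cpu->esp);
    cpu->esp += 4;
}

/* ret imm16: same as ret, then release imm16 more bytes of stack. */
void toy_ret_imm16(toy_cpu *cpu, uint16_t imm)
{
    toy_ret(cpu);
    cpu->esp += imm;
}

/* mov eax, [esp] followed by jmp eax: same target, but esp is unchanged. */
void toy_jmp_via_eax(toy_cpu *cpu)
{
    uint32_t eax = load32(cpu, cpu->esp);
    cpu->eip = eax;
}
```

With esp set to 16 and a return address stored there, toy_ret leaves esp at 20, toy_ret_imm16 with an immediate of 8 leaves it at 28, and toy_jmp_via_eax leaves it at 16.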

Android instructions when open the application at first time? [closed]

▼魔方 西西 submitted on 2019-12-18 10:24:42
Question: Closed. This question needs to be more focused. It is not currently accepting answers. Closed 2 years ago. Do you know this? Well, I want to create something like this screen. When I open the application for the first time, I want to open this screen and display a context. How is that possible? I don't know what to search for this type of thing. Answer 1: @Override public void onCreate(Bundle

C code loop performance [continued]

非 Y 不嫁゛ submitted on 2019-12-18 10:01:04
Question: This question continues on my question here (on the advice of Mystical): C code loop performance. Continuing on my question, when I use packed instructions instead of scalar instructions, the code using intrinsics would look very similar: for(int i=0; i<size; i+=16) { y1 = _mm_load_ps(output[i]); … y4 = _mm_load_ps(output[i+12]); for(k=0; k<ksize; k++){ for(l=0; l<ksize; l++){ w = _mm_set_ps1(weight[i+k+l]); x1 = _mm_load_ps(input[i+k+l]); y1 = _mm_add_ps(y1,_mm_mul_ps(w,x1)); … x4 = _mm_load

How is it possible that BITWISE AND operation to take more CPU clocks than ARITHMETIC ADDITION operation in a C program?

与世无争的帅哥 submitted on 2019-12-18 09:47:08
Question: I wanted to test whether bitwise operations really are faster to execute than arithmetic operations. I thought they were. I wrote a small C program to test this hypothesis, and to my surprise the addition takes less time on average than the bitwise AND operation. This is surprising to me and I cannot understand why it is happening. From what I know, for addition the carry from the less significant bits has to be propagated to the next bits, because the result depends on the carry too. It does not make sense to

Assembly programming - WinAsm vs Visual Studio 2017

点点圈 submitted on 2019-12-18 02:58:16
Question: I'm here to ask you some things about VS2017. In the past I used WinAsm for MASM and never had problems with it. However, whenever I try to do something with MASM in VS2017, I always run into problems. I've searched the whole internet for "how to set up VS for MASM", but nothing has helped me, as I keep running into trouble. Is there any way to use Visual Studio 2017 for MASM32/64bit without any kind of headache? Can someone give me the ultimate guide to set up VS2017 for