x86-64

LLVM insertvalue bad optimized?

纵然是瞬间 posted on 2019-12-01 18:19:05
Should I avoid using the 'insertvalue' instruction combined with load and store when I emit LLVM code? I always get poorly optimized native code when I use it. Look at the following example:

    ; ModuleID = 'mod'
    target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64"
    target triple = "x86_64-pc-linux-gnu"

    %A = type { i64, i64, i64, i64, i64, i64, i64, i64 }
    @aa = external global %A*

    define void @func() {
    entry:
      %a1 = load %A** @aa
      %a2 = load %A* %a1
      %a3 = insertvalue %A %a2, i64 3, 3
      store %A

FLD instruction x64 bit

倖福魔咒の posted on 2019-12-01 17:59:08
I have a small problem with the FLD instruction in x64. I want to load a Double value onto the FPU stack (register st0), but it seems to be impossible. In 32-bit Delphi, I can use this code:

    function DoSomething(X:Double):Double;
    asm
      FLD X
      // Do Something ..
      FST Result
    end;

Unfortunately, in x64, the same code does not work.

In x64 mode, floating-point parameters are passed in xmm registers. So when Delphi tries to compile FLD X, it becomes FLD xmm0, but there is no such instruction. You first need to move the value to memory. The same goes for the result: it must be passed back in xmm0. Try

x86 instruction encoding how to choose opcode

筅森魡賤 posted on 2019-12-01 17:57:21
When encoding the instruction cmpw $-5, %ax for x86-64, the Intel instruction set reference manual gives me two opcodes to choose from:

    3D iw     CMP AX, imm16     I   Valid  Valid  Compare imm16 with AX.
    83 /7 ib  CMP r/m16, imm8   MI  Valid  Valid  Compare imm8 with r/m16.

So there are two possible encodings:

    66 3d fb ff   ; this for opcode 3d
    66 83 f8 fb   ; this for opcode 83

Which one is better? I tried the online disassemblers below:

https://defuse.ca/online-x86-assembler.htm#disassembly2
https://onlinedisassembler.com/odaweb/

Both disassemble back to the original instruction. But why does 6683fb00 also work while 663dfb doesn't
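The two candidate encodings can be reproduced with a minimal C sketch. The function names `encode_cmp_ax_imm16` and `encode_cmp_ax_imm8` are hypothetical helpers invented for this illustration; only the opcode bytes come from the manual entries quoted in the question.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode `cmp $imm, %ax` with opcode 3D iw.
   Layout: 66 (operand-size prefix), 3D, then a 16-bit little-endian immediate.
   Returns the number of bytes written. */
size_t encode_cmp_ax_imm16(int16_t imm, uint8_t out[4]) {
    uint16_t u = (uint16_t)imm;
    out[0] = 0x66;                 /* select 16-bit operand size */
    out[1] = 0x3D;                 /* CMP AX, imm16 */
    out[2] = (uint8_t)(u & 0xFF);  /* immediate, low byte first */
    out[3] = (uint8_t)(u >> 8);
    return 4;
}

/* Hypothetical helper: encode `cmp $imm, %ax` with opcode 83 /7 ib.
   Layout: 66, 83, ModRM, then an 8-bit immediate that the CPU sign-extends.
   Returns 0 when the immediate does not fit in a signed byte. */
size_t encode_cmp_ax_imm8(int16_t imm, uint8_t out[4]) {
    if (imm < -128 || imm > 127) {
        return 0;
    }
    out[0] = 0x66;                 /* select 16-bit operand size */
    out[1] = 0x83;                 /* group-1 opcode with imm8 */
    out[2] = 0xF8;                 /* ModRM: mod=11, reg=/7 (CMP), rm=000 (AX) */
    out[3] = (uint8_t)(int8_t)imm; /* sign-extended byte immediate */
    return 4;
}
```

For AX both forms are four bytes long, so neither wins on size here. The sketch also suggests why the byte strings at the end of the question behave differently: under the 66 prefix, opcode 3D expects two immediate bytes, so 66 3d fb is one byte short, whereas in 66 83 fb 00 the byte fb is a ModRM byte selecting BX and 00 is a complete 8-bit immediate.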

Sign or Zero Extension of address in 64bit mode for MOV moffs32?

别等时光非礼了梦想. posted on 2019-12-01 17:31:36
Question: Take the instruction MOV EAX,[0xFFFFFFFF] encoded in 64-bit mode as 67A1FFFFFFFF (the effective address size is toggled by the 67 prefix from the default 64 bits to 32 bits). Intel's instruction reference manual (doc Order Number: 325383-057US from December 2015) on page Vol. 2A 2-11 says: 2.2.1.3 Displacement Addressing in 64-bit mode uses existing 32-bit ModR/M and SIB encodings. The ModR/M and SIB sizes do not change. They remain 8 bits or 32 bits and are sign-extended to 64 bits. This suggests that
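To make the two readings concrete, here is a small C sketch that computes what the 32-bit value 0xFFFFFFFF becomes under each rule. The function names are hypothetical, and the sketch does not settle which rule the processor applies to a moffs32 address; it only shows how far apart the two candidate addresses are. It assumes a two's-complement target where converting an out-of-range `uint32_t` to `int32_t` wraps, which holds for the usual x86-64 compilers.

```c
#include <stdint.h>

/* Zero extension: the upper 32 bits of the result are cleared. */
uint64_t zero_extend32(uint32_t value) {
    return (uint64_t)value;
}

/* Sign extension: bit 31 of the input is copied into bits 32..63.
   Assumption: the uint32_t -> int32_t conversion wraps (two's complement). */
uint64_t sign_extend32(uint32_t value) {
    return (uint64_t)(int64_t)(int32_t)value;
}
```

With the address from the question, `zero_extend32(0xFFFFFFFF)` gives 0x00000000FFFFFFFF (just under 4 GiB), while `sign_extend32(0xFFFFFFFF)` gives 0xFFFFFFFFFFFFFFFF (the very top of the 64-bit address space).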

FLD instruction x64 bit

南笙酒味 posted on 2019-12-01 17:13:27
Question: I have a small problem with the FLD instruction in x64. I want to load a Double value onto the FPU stack (register st0), but it seems to be impossible. In 32-bit Delphi, I can use this code:

    function DoSomething(X:Double):Double;
    asm
      FLD X
      // Do Something ..
      FST Result
    end;

Unfortunately, in x64, the same code does not work.

Answer 1: In x64 mode, floating-point parameters are passed in xmm registers. So when Delphi tries to compile FLD X, it becomes FLD xmm0, but there is no such instruction.

assembly cltq and movslq difference

ⅰ亾dé卋堺 posted on 2019-12-01 16:45:50
Chapter 3 of Computer Systems: A Programmer's Perspective (2nd Edition) mentions that cltq is equivalent to movslq %eax, %rax. Why did they create a new instruction (cltq) instead of just using movslq %eax, %rax? Isn't that redundant?

TL;DR: use cltq when possible, because it's one byte shorter than the exactly equivalent movslq %eax, %rax. That's a very minor advantage (so don't sacrifice anything else to make this happen), but choose eax if you're going to want to sign-extend it a lot. This is mostly relevant for compiler-writers (compiling signed-integer loop counters indexing arrays);
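The one-byte difference can be checked against the machine-code bytes of each instruction, and the operation itself is what a C compiler has to emit for a plain `int32_t` to `int64_t` conversion. The sketch below is illustrative only: the array and function names are hypothetical, and which of the two instructions a compiler picks for `widen` depends on the register the value happens to live in.

```c
#include <stdint.h>

/* Machine-code bytes of the two equivalent instructions. */
const uint8_t CLTQ_BYTES[] = {0x48, 0x98};               /* REX.W + 98: cltq */
const uint8_t MOVSLQ_EAX_RAX_BYTES[] = {0x48, 0x63, 0xC0}; /* REX.W + 63 /r: movslq %eax, %rax */

/* Sign-extend a 32-bit signed integer to 64 bits.
   On x86-64 this conversion is the operation that cltq or movslq performs. */
int64_t widen(int32_t x) {
    return (int64_t)x;
}
```

`sizeof(CLTQ_BYTES)` is 2 and `sizeof(MOVSLQ_EAX_RAX_BYTES)` is 3, which is the one-byte saving the answer refers to. cltq has no ModRM byte because its operands are fixed to eax and rax.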

Performance of “conditional call” on amd64

回眸只為那壹抹淺笑 posted on 2019-12-01 16:17:09
When considering a conditional function call in a critical section of code I found that both gcc and clang will branch around the call. For example, for the following (admittedly trivial) code:

    int32_t __attribute__((noinline)) negate(int32_t num) {
        return -num;
    }

    int32_t f(int32_t num) {
        int32_t x = num < 0 ? negate(num) : num;
        return 2*x + 1;
    }

Both GCC and clang compile to essentially the following:

    .global _f
    _f:
        cmp edi, 0
        jg after_call
        call _negate
    after_call:
        lea rax, [rax*2+1]
        ret

This got me thinking: what if x86 had a conditional call instruction like ARM? Imagine if there was such

What's the difference between the x86-64 AT&T instructions movq and movabsq?

点点圈 posted on 2019-12-01 16:11:24
After reading this Stack Overflow answer and this document, I still don't understand the difference between movq and movabsq. My current understanding is that in movabsq, the first operand is a 64-bit immediate operand, whereas movq sign-extends a 32-bit immediate operand. From the 2nd document referenced above: Moving immediate data to a 64-bit register can be done either with the movq instruction, which will sign extend a 32-bit immediate value, or with the movabsq instruction, when a full 64-bit immediate is required. In the first reference, Peter states: Interesting experiment: movq
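The rule in the quoted passage amounts to a range check on the immediate. The following C sketch expresses it; the function names are hypothetical, and the byte counts assume the destination is a 64-bit general-purpose register, using the REX.W + C7 /0 id form (7 bytes) for the sign-extended 32-bit immediate and the REX.W + B8+rd io form (10 bytes) for the full 64-bit immediate.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True when `value` survives a round trip through a sign-extended
   32-bit immediate, i.e. when the movq imm32 form can represent it. */
bool fits_sign_extended_imm32(int64_t value) {
    return value >= INT32_MIN && value <= INT32_MAX;
}

/* Encoded length, in bytes, of loading `value` into a 64-bit register.
   7  = REX.W + C7 /0 + imm32 (movq with a sign-extended immediate)
   10 = REX.W + B8+rd + imm64 (movabsq with a full immediate) */
size_t mov_imm_to_reg64_length(int64_t value) {
    return fits_sign_extended_imm32(value) ? 7 : 10;
}
```

For example, -1 fits (all 64 bits are copies of the sign bit), but 0xFFFFFFFF does not: as a 32-bit immediate it would be sign-extended to 0xFFFFFFFFFFFFFFFF, so the full 64-bit form is needed to load it unchanged.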

How much does function alignment actually matter on modern processors?

£可爱£侵袭症+ posted on 2019-12-01 16:07:27
When I compile C code with a recent compiler on an amd64 or x86 system, functions are aligned to a multiple of 16 bytes. How much does this alignment actually matter on modern processors? Is there a huge performance penalty associated with calling an unaligned function?

Benchmark

I ran the following microbenchmark (call.S):

    // benchmarking performance penalty of function alignment.
    #include <sys/syscall.h>

    #ifndef SKIP
    # error "SKIP undefined"
    #endif

    #define COUNT 1073741824

        .globl _start
        .type _start,@function
    _start:
        mov $COUNT,%rcx
    0:  call test
        dec %rcx
        jnz 0b
        mov $SYS_exit,%rax
        xor %edi,

What's the difference between the x86-64 AT&T instructions movq and movabsq?

孤者浪人 posted on 2019-12-01 16:05:26
Question: After reading this Stack Overflow answer and this document, I still don't understand the difference between movq and movabsq. My current understanding is that in movabsq, the first operand is a 64-bit immediate operand, whereas movq sign-extends a 32-bit immediate operand. From the 2nd document referenced above: Moving immediate data to a 64-bit register can be done either with the movq instruction, which will sign extend a 32-bit immediate value, or with the movabsq instruction, when a