I keep seeing people claim that the MOV instruction can be free in x86, because of register renaming.
For the life of me, I can't verify this in a single test case.
Here are two small tests that I believe conclusively show evidence for mov-elimination:
__loop1:
add edx, 1
add edx, 1
add ecx, 1
jnc __loop1
versus
__loop2:
mov eax, edx
add eax, 1
mov edx, eax
add edx, 1
add ecx, 1
jnc __loop2
If mov added a cycle to a dependency chain, the second version would be expected to take about 4 cycles per iteration. On my Haswell, both take about 2 cycles per iteration, which cannot happen without mov-elimination.
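
The 2-versus-4 arithmetic can be checked with a toy latency model. This is only a sketch, not a CPU simulator: it assumes every add has a 1-cycle latency, ignores front-end and port throughput limits, and treats the branch as perfectly predicted. The names loop_carried_latency, LOOP1 and LOOP2 are made up for this illustration.

```python
# Toy model: track the cycle at which each register's value becomes ready,
# then measure how far the slowest dependency chain advances per iteration.
# Assumptions (not measurements): add latency = 1 cycle, no throughput limits,
# branch perfectly predicted.

def loop_carried_latency(body, mov_latency, add_latency=1, iterations=100):
    """Return steady-state cycles per iteration implied by data dependencies alone."""
    ready = {}          # register name -> cycle at which its value is available
    finish_times = []   # latest ready time after each iteration
    for _ in range(iterations):
        for op, dst, src in body:
            if op == "mov":
                # mov dst, src: result is ready mov_latency cycles after src
                ready[dst] = ready.get(src, 0) + mov_latency
            elif op == "add":
                # add dst, imm: reads and writes dst
                ready[dst] = ready.get(dst, 0) + add_latency
        finish_times.append(max(ready.values()))
    return (finish_times[-1] - finish_times[0]) / (iterations - 1)

# The two loop bodies from the post (jnc omitted: assumed predicted).
LOOP1 = [
    ("add", "edx", None),
    ("add", "edx", None),
    ("add", "ecx", None),
]
LOOP2 = [
    ("mov", "eax", "edx"),
    ("add", "eax", None),
    ("mov", "edx", "eax"),
    ("add", "edx", None),
    ("add", "ecx", None),
]

if __name__ == "__main__":
    print("loop1:", loop_carried_latency(LOOP1, mov_latency=1))
    print("loop2, mov costs 1 cycle:", loop_carried_latency(LOOP2, mov_latency=1))
    print("loop2, mov eliminated:", loop_carried_latency(LOOP2, mov_latency=0))
```

In this model the second loop only matches the first one's 2 cycles per iteration when mov_latency is 0, which is the same reasoning as the measurement above.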