I just started learning assembly and am writing a custom loop for swapping two variables using C++'s asm{} block with the Digital Mars compiler in C-Free 5.0
Enabled th
The code generated by that compiler is pretty horrible. After disassembling the object file with objconv, here's what I got for the first for loop.
?_001: cmp dword [ebp-4H], 200000000 ; 0053 _ 81. 7D, FC, 0BEBC200
jge ?_002 ; 005A _ 7D, 17
inc dword [ebp-4H] ; 005C _ FF. 45, FC
mov eax, dword [ebp-18H] ; 005F _ 8B. 45, E8
mov dword [ebp-10H], eax ; 0062 _ 89. 45, F0
mov eax, dword [ebp-14H] ; 0065 _ 8B. 45, EC
mov dword [ebp-18H], eax ; 0068 _ 89. 45, E8
mov eax, dword [ebp-10H] ; 006B _ 8B. 45, F0
mov dword [ebp-14H], eax ; 006E _ 89. 45, EC
jmp ?_001 ; 0071 _ EB, E0
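For orientation, the listing above is roughly what you'd expect from a plain three-assignment swap inside a counted loop. The following is only my reconstruction, not the asker's actual source: the names swap_loop, a, b, tmp and i are made up, and the mapping to stack slots is inferred from the disassembly. In the original the variables are locals (here they are references so the function can be called on its own), and the bound is 200000000.

```cpp
#include <cassert>

// Hypothetical reconstruction of the loop behind the disassembly.
// Inferred mapping: i -> [ebp-4H], a -> [ebp-18H], b -> [ebp-14H], tmp -> [ebp-10H]
void swap_loop(int& a, int& b, int iterations) {
    assert(iterations >= 0);
    for (int i = 0; i < iterations; ++i) {
        int tmp = a;  // mov eax, [ebp-18H]  then  mov [ebp-10H], eax
        a = b;        // mov eax, [ebp-14H]  then  mov [ebp-18H], eax
        b = tmp;      // mov eax, [ebp-10H]  then  mov [ebp-14H], eax
    }
}
```

Each C++ statement turns into a load into eax followed by a store back to the stack, which is where the six mov instructions come from.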
The issues should be clear to anybody who has ever looked at some assembly.
The loop is very tightly dependent on the value that is put in eax. This makes any out-of-order execution practically impossible due to the dependencies that every subsequent instruction creates on that register.
There are six general-purpose registers available (since ebp and esp aren't really general-purpose in most setups), but your compiler uses only eax, and only as scratch space, falling back to the local stack for everything else. This is absolutely unacceptable when speed is the optimization goal. We can even see that the current loop index is stored at [ebp-4H], while it could easily have been kept in a register.
The cmp instruction uses a memory and an immediate operand. This is the slowest possible mix of operands and should never be used when performance is at stake.
And don't get me started on the code size. Half of those instructions are just unnecessary.
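To make the register point concrete, here is a minimal sketch of a swap loop whose body works on registers only. It rests on assumptions that are not in the question: it uses GCC/Clang extended asm on x86 rather than the Digital Mars asm{} syntax, the name swap_loop_regs is made up, and on other compilers or architectures it simply falls back to std::swap. It illustrates the idea and is not a tuned implementation.

```cpp
#include <cassert>
#include <utility>

// Hypothetical helper: swaps a and b `iterations` times.
void swap_loop_regs(int& a, int& b, int iterations) {
    assert(iterations >= 0);
    int x = a, y = b;  // work on local copies
    for (int i = 0; i < iterations; ++i) {
#if (defined(__GNUC__) || defined(__clang__)) && (defined(__i386__) || defined(__x86_64__))
        // "+r" makes the compiler hand both values to the asm in registers,
        // so the instruction itself has no memory operand and needs no temporary.
        __asm__ volatile("xchgl %0, %1" : "+r"(x), "+r"(y));
#else
        std::swap(x, y);  // portable fallback when the asm path is unavailable
#endif
    }
    a = x;  // write the results back once, after the loop
    b = y;
}
```

Whether x, y and i stay in registers between iterations still depends on the compiler and its optimization level; the constraints only guarantee register operands for the xchg itself.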
All in all, the first thing I'd do is ditch that compiler at the earliest possible chance. But then again, seeing that it offers "memory models" as one of its options, I wouldn't hold out much hope for it.