Finding the absolute value of a number in 8085 microprocessor assembly language

Submitted by 霸气de小男生 on 2020-01-24 12:11:25

Question


I have a task of finding the absolute value of any given number in 8085 assembly language.

The algorithm is the following (found on the internet):

mask = n >> 7 (number itself is 8 bits)

(mask + n) XOR mask

My question is how I would implement this in assembly language. It seems that I should be using the "RRC" instruction, but that performs a circular shift on the number, and the algorithm doesn't seem to work with it.

Any ideas would be appreciated. Cheers.


Answer 1:


The n>>7 in that abs algorithm is an arithmetic right shift that shifts in copies of the sign bit, so you get -1 for negative n, 0 for non-negative. (In 2's complement, the bit pattern for -1 has all bits set).

Then you use this to do either nothing (n+0) ^ 0 or to do 2's complement negation "manually" as -n = (n + (-1)) ^ -1 = ~(n-1).

See "How to prove that the C statement -x, ~x+1, and ~(x-1) yield the same results?" for 2's complement identities. XOR with all-ones is bitwise NOT. Adding mask = -1 is of course n-1.
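As a quick sanity check of the identities above, here is a minimal C sketch that tries every 8-bit pattern. The function names are made up for illustration, and the mask is built from the sign bit rather than with >>, because right-shifting a negative value is implementation-defined in C.

```c
#include <assert.h>
#include <stdint.h>

/* Branchless abs following the question's formula: (n + mask) ^ mask.
   abs8_mask is an illustrative name, not something from the original post. */
uint8_t abs8_mask(uint8_t n)
{
    uint8_t mask = (n & 0x80) ? 0xFF : 0x00;   /* models an arithmetic n >> 7 */
    return (uint8_t)((uint8_t)(n + mask) ^ mask);
}

/* Check -x == ~x + 1 == ~(x - 1) (mod 256) and the mask formula
   for all 256 bit patterns. */
void check_identities(void)
{
    for (unsigned i = 0; i < 256; i++) {
        uint8_t x   = (uint8_t)i;
        uint8_t neg = (uint8_t)(0u - x);            /* -x, truncated to 8 bits */
        assert((uint8_t)(~x + 1u) == neg);          /* ~x + 1   */
        assert((uint8_t)~(x - 1u) == neg);          /* ~(x - 1) */
        assert(abs8_mask(x) == ((x & 0x80) ? neg : x));
    }
}
```

Calling check_identities() should pass silently; abs8_mask(0xFF) gives 1, and abs8_mask(0x80) gives 0x80.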


Branches are cheap, and the register copying involved in creating and using a 0 or -1 (according to the sign of a number) adds up. (Although I did come up with a way to implement this in only 6 bytes of code, same code size as the branchy version.)

On 8085, just implement it the simple way: if(n<0) n=-n;

(Treat the result as unsigned; note that -0x80 = 0x80 in 8-bit. If you assume it's signed-positive after abs, you'll be wrong for the most-negative input.)
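The point about the most-negative input can be sketched in C. This is only an illustration with a hypothetical helper name; it widens to int before negating so that the C code itself never overflows, and returns the result as unsigned.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: abs of a signed 8-bit value, returned as unsigned. */
uint8_t abs8_branchy(int8_t n)
{
    int wide = n;               /* widen first so -(-128) cannot overflow */
    if (wide < 0)
        wide = -wide;           /* the simple way: if (n < 0) n = -n; */
    return (uint8_t)wide;       /* 128 fits in uint8_t but not in int8_t */
}

/* A few spot checks, including the most-negative input. */
void check_abs8_branchy(void)
{
    assert(abs8_branchy(0) == 0);
    assert(abs8_branchy(127) == 127);
    assert(abs8_branchy(-1) == 1);
    assert(abs8_branchy(-128) == 128);   /* bit pattern 0x80 */
}
```

The result for -128 is 128, which is the same bit pattern 0x80 as the input. Read back as a signed 8-bit value, that pattern is negative again, which is why the result has to be treated as unsigned.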

This should be trivial with a conditional branch over a negation; 8085 does have branches that depend on the sign bit. (Not signed-compare in general though, unless you use the undocumented k flag = signed overflow). Set flags according to A, then JP over a negation. (The "Plus" condition tests that Sign flag = 0, so it's actually testing for non-negative instead of strictly positive.)

I don't see a neg instruction in https://www.daenotes.com/electronics/digital-electronics/instruction-set-intel-8085, so you could mov the value to another register, zero A, and sub. Or you could negate the accumulator in place with the 2's complement identity -A = ~A + 1: cma (NOT A), then inr a (accumulator += 1).

8085 has cheap branching, not like a modern pipelined CPU where branching can be expensive on branch mis-predictions. The mask = n >> 31 or equivalent for branchless abs is useful there and the whole thing is typically only 3 or 4 instructions. (8085 only has shift-by-1 instructions; later ISAs including modern x86 have fast immediate shifts that can do n >> 31 in a single instruction, usually with good latency like 1 cycle.)

Untested A = abs(A)

; total 6 bytes.  (jumps are opcode + 16-bit absolute target address)
    ana  A              ; set flags from A&A
    jp  non_negative    ; jump if MSB was clear
    cma
    inr  A              ; A = ~A+1 = -A
non_negative:
    ; unsigned A = abs(signed A) at this point

http://pastraiser.com/cpu/i8085/i8085_opcodes.html has an opcode map with cycle timings. 1-byte ALU register instructions take 4 cycles, 2-byte ALU reg instructions (with an immediate) take 7. Conditional branches take 7 cycles not-taken, 10 cycles taken.

  • For non-negative inputs (JP taken): 4 (ANA) + 10 (JP) = 14 cycles.
  • For negative inputs (JP not taken): 4 (ANA) + 7 (JP) + 4 (CMA) + 4 (INR) = 19 cycles.

(Timing calculations seem to be trivial; each instruction just has a single fixed cost, unlike modern pipelined superscalar out-of-order CPUs where throughput and latency are separate things and not every instruction can run on every execution port...)
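The arithmetic above can be written down as a tiny C sketch. The constants are just the cycle counts quoted from the opcode map, and the names are invented for this example.

```c
#include <assert.h>

/* Cycle costs as quoted in the text; the identifier names are illustrative. */
enum {
    CYC_ALU_REG       = 4,    /* 1-byte ALU register instruction: ana, cma, inr */
    CYC_JCC_NOT_TAKEN = 7,    /* conditional jump that falls through */
    CYC_JCC_TAKEN     = 10    /* conditional jump that is taken */
};

static_assert(CYC_ALU_REG + CYC_JCC_TAKEN == 14, "non-negative path");
static_assert(3 * CYC_ALU_REG + CYC_JCC_NOT_TAKEN == 19, "negative path");

/* Cycles for ana a / jp / cma / inr a, depending on the sign of the input. */
int branchy_abs_cycles(int input_is_negative)
{
    if (input_is_negative)   /* jp falls through, cma and inr both execute */
        return CYC_ALU_REG + CYC_JCC_NOT_TAKEN + 2 * CYC_ALU_REG;
    return CYC_ALU_REG + CYC_JCC_TAKEN;   /* jp skips the negation */
}
```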


Implementing the branchless bithack on 8085

SBB A sets A = 0 or -1 according to CF

This is a somewhat well-known assembly trick for turning a compare condition into a 0 / -1 mask. You just need to get the MSB of your value into the carry flag, e.g. with A+A or a rotate. That gives you the n >> 7 value (0 or -1) you need for xor/add.
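To see why this works, here is a rough C model of just the accumulator and the carry flag. It is a sketch under simplifying assumptions: the struct and function names are invented, and the other flags are ignored.

```c
#include <assert.h>
#include <stdint.h>

/* Minimal model of the accumulator and carry flag (illustrative names). */
struct cpu { uint8_t a; uint8_t cf; };

/* add a : A = A + A, carry-out is the old bit 7. */
void add_a(struct cpu *s)
{
    unsigned sum = (unsigned)s->a + s->a;
    s->cf = (uint8_t)((sum >> 8) & 1u);
    s->a  = (uint8_t)sum;
}

/* sbb a : A = A - A - CF, so A becomes 0x00 or 0xFF; CF is the borrow. */
void sbb_a(struct cpu *s)
{
    unsigned diff = (unsigned)s->a - s->a - s->cf;
    s->a  = (uint8_t)diff;
    s->cf = (uint8_t)((diff >> 8) & 1u);
}

/* Sign mask of n: 0x00 for non-negative, 0xFF (-1) for negative. */
uint8_t sign_mask(uint8_t n)
{
    struct cpu s = { n, 0 };
    add_a(&s);      /* CF = sign bit of n */
    sbb_a(&s);      /* A = -CF */
    return s.a;
}

/* Compare against the expected mask for every 8-bit input. */
void check_sign_mask(void)
{
    for (unsigned i = 0; i < 256; i++)
        assert(sign_mask((uint8_t)i) == ((i & 0x80) ? 0xFF : 0x00));
}
```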

Just for fun, I tried implementing abs() branchlessly with this trick. This is the best I've come up with. Only use this if you need immunity to timing attacks, so clock cycle cost doesn't depend on input data. (Or for position-independent code; jumps use an absolute target address, not a +- relative offset.)

It has the advantage of keeping the original around in another register.

;;;   UNTESTED slower branchless abs
;; a = abs(b).  destroys c (or pick any other tmp reg)
;; these are all 1-byte instructions (4 cycles each)
   mov  a, b
   add  a         ; CF = sign bit
   sbb  a         ; A = n-n-CF = -CF.  0 or -1
   mov  c, a
   xra  b         ;  n         or    ~n
   sub  c         ; n-0 = n    or    ~n-(-1) = ~n+1 = -n

; uint8_t A = abs(int8_t B)

This is still only 6 bytes, same as branchy, but it costs 6*4 = 24 cycles.
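Since the assembly is untested, a straight-line C model of the six instructions is one way to check the logic, though not the 8085 encoding or timing. Each C statement mirrors one instruction, with the carry flag tracked by hand; the function names are assumptions for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* C model of the six-instruction sequence; returns the final accumulator.
   b plays the role of register B and is left unchanged. */
uint8_t abs8_branchless(uint8_t b)
{
    uint8_t a, c, cf;

    a  = b;                        /* mov a, b */
    cf = a >> 7;                   /* add a : carry-out is the old sign bit */
    a  = (uint8_t)(a + a);
    a  = (uint8_t)(a - a - cf);    /* sbb a : 0x00 or 0xFF */
    c  = a;                        /* mov c, a : save the mask */
    a  = (uint8_t)(a ^ b);         /* xra b : n or ~n */
    a  = (uint8_t)(a - c);         /* sub c : n - 0, or ~n + 1 = -n */
    return a;
}

/* Compare against a plain reference abs for every signed 8-bit input. */
void check_abs8_branchless(void)
{
    for (int n = -128; n <= 127; n++) {
        uint8_t expected = (uint8_t)(n < 0 ? -n : n);
        assert(abs8_branchless((uint8_t)n) == expected);
    }
}
```

check_abs8_branchless() covers all 256 inputs, including -128, where the expected unsigned result is 128 (0x80).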

If XRA didn't affect flags we could sbi 0 for the -1 step. But it does always clear CF. I don't see a way around saving a copy of the 0 / -1 result. And we can't compute into B to do it in-place; 8085 is an accumulator machine. Where's 8086's 1-byte exchange-with-accumulator when you need it? xchg a,b would have been useful.

If your value starts in A, you need to copy it somewhere else, so you need to destroy two other registers.


A worse alternative for broadcasting the sign bit of A to all positions:

   RLC     ; low bit of accumulator = previous sign bit
   CMA     ; Bitwise NOT: 0 for negative, 1 for non-negative
   ANI  1  ; isolate it, clearing higher bits
   DCR  A  ; 0 or 1  -> -1 or 0

This is even worse than rlc / sbb a; I include it only as an exercise in bit-manipulation to see why it works. (And because I'd already typed it up before remembering that the SBB trick I know from other ISAs will work here, too.)
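For the bit-manipulation exercise, the four steps can be followed in a small C model. Again this is a sketch with an invented function name; it only tracks the accumulator, not the flags.

```c
#include <assert.h>
#include <stdint.h>

/* C model of rlc / cma / ani 1 / dcr a: broadcasts the sign bit of a. */
uint8_t sign_mask_rlc(uint8_t a)
{
    a = (uint8_t)((a << 1) | (a >> 7));  /* rlc   : bit 0 = old sign bit */
    a = (uint8_t)~a;                     /* cma   : bit 0 = 0 if negative */
    a = (uint8_t)(a & 1);                /* ani 1 : keep only bit 0 */
    a = (uint8_t)(a - 1);                /* dcr a : 0 -> 0xFF, 1 -> 0x00 */
    return a;
}

/* The result should be 0xFF for negative inputs and 0x00 otherwise. */
void check_sign_mask_rlc(void)
{
    for (unsigned i = 0; i < 256; i++)
        assert(sign_mask_rlc((uint8_t)i) == ((i & 0x80) ? 0xFF : 0x00));
}
```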



Source: https://stackoverflow.com/questions/59188194/finding-the-absolute-value-of-a-number-in-8085-microprocessor-assembly-language
