x86 assembly: Irvine32 - Get last element of an array

Question

I'm new to assembly and need help with an assignment in assembly language using Irvine32. I want to know where I'm going wrong. I believe my code is 80% right, but there's something I'm not seeing or recognizing. Here are the program details.

"Write an assembly language program that has an array of words. The program loads the last element of the array into an appropriately sized register and prints it. (Do not hardcode the index of the last element.)"

INCLUDE Irvine32.inc    
.data
  val1 word 1,2,3,4,5,6
  val2 = ($-val1)/2
.code
main PROC        
  mov ax, 0
  mov ax, val1[val2]

  Call WriteDec
  Call DumpRegs
 exit
main ENDP
END main

Answer 1:

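First of all, your code has a bug: val1[val2] uses the element count (6) as the index, but MASM treats a bracketed constant as a byte offset and does not scale it by the element size. So it reads the word at val1+6, which is the fourth element (4), not the last one. Even if the index were scaled, element 6 would be one past the end of the array, since the first element is at val1[0]. Also, WriteDec prints all of EAX, so loading only AX leaves whatever was already in the upper 16 bits.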
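To make the byte-offset point concrete, here is a minimal sketch of the asker's program with the index expressed in bytes. It assumes 32-bit MASM with the Irvine32 library; the name last_off is made up for this illustration, and movzx is used because WriteDec prints the whole of EAX.

```asm
INCLUDE Irvine32.inc
.data
  val1 word 1,2,3,4,5,6
  last_off = ($ - val1) - TYPE val1   ; hypothetical name: byte offset of the last word (12 - 2 = 10)
.code
main PROC
  movzx eax, val1[last_off]   ; load the last word, zero-extended into EAX
  call WriteDec               ; prints EAX as an unsigned decimal (6 for this array)
  call Crlf
  exit
main ENDP
END main
```

The offset is still computed by the assembler from the array itself, so adding or removing elements needs no other change.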

To find the end, you either need to know the length (explicit length, like a buffer passed to memcpy(3)), or search it for a sentinel element (implicit length, like a C string passed to strcpy(3)).

Having a function that accepts an explicit length as a parameter seems fine to me. Knowing the length is much more efficient, since there's no need for a loop scanning the whole array for a sentinel element, and the array shown doesn't include one anyway. (See Jose's answer for a suggestion to use '$' (i.e. 36) as a sentinel value. -1 or 0 might be more sensible sentinels/terminators.)

I'd only call it hard-coding if you wrote val2 = 6, or worse val2 dw 6, rather than having it calculated at assemble time from the array. If you want to write a function that works with arrays whose length isn't an assemble-time constant, you can have it accept the length as a value in memory, instead of an immediate that gets embedded into its load instruction.

e.g.

Length as a parameter in memory

INCLUDE Irvine32.inc
.data
  array word 1,2,3,4,5,6
  array_len word ($-array)/2    ; element count; MASM's LENGTHOF array gives the same value

.code
main PROC       ; inputs: array and array_len in static storage
                ; output: eax = last element of array, zero-extended
                ; clobbers: esi

  ; mov ax, 0   ; This is useless, the next mov overwrites it.

  movzx esi, [array_len]  ; do we need to save/restore esi with push/pop in this ABI?

  add esi, esi            ; multiply by 2: length in words -> length in bytes
  movzx eax, word ptr [array + esi - 2]   ; the -2 folds into array's address at assemble time, so it's just a disp32 + one register

  Call WriteDec           ; prints all of EAX, which is why the load zero-extends
  Call DumpRegs
  exit
main ENDP
END main

You could also write a function to take pointer and length args on the stack or in registers, and have main pass those args.

You could save the add (or shl) by accepting a length in bytes, or a start and one-past-the-end pointer (like C++ STL range functions that take .begin() and .end() iterators). If you have the end pointer, you don't need the start pointer at all, except to return an error if they're equal (size = 0).
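A rough sketch of the one-past-the-end idea, assuming 32-bit MASM with Irvine32. The labels array_end and no_elements are hypothetical names introduced here, and the empty-array check only illustrates the "error if they're equal" case.

```asm
INCLUDE Irvine32.inc
.data
  array     word 1,2,3,4,5,6
  array_end LABEL word                    ; hypothetical label: address just past the last element
.code
main PROC
  mov   esi, OFFSET array_end             ; esi = one-past-the-end pointer
  cmp   esi, OFFSET array                 ; equal pointers would mean size = 0
  je    no_elements
  movzx eax, word ptr [esi - TYPE array]  ; the last word sits just below the end pointer
  call  WriteDec
  call  Crlf
no_elements:
  exit
main ENDP
END main
```

No length, shift, or add is involved: the end pointer alone locates the last element.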

Since you include Irvine32.inc, this is 32-bit code, so you could also use a scaled index in the addressing mode, like [array + esi*2 - 2], instead of doubling the count with a separate add.
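Putting the last two suggestions together, a function that takes a pointer and a length in registers can use a scaled index directly. This is only a sketch: LastWord and its register convention (esi = pointer, ecx = element count) are assumptions made for illustration, not part of Irvine32.

```asm
INCLUDE Irvine32.inc
.data
  array word 1,2,3,4,5,6
.code
; LastWord (hypothetical helper)
; inputs:  esi = address of the first word, ecx = element count (must be >= 1)
; output:  eax = last element, zero-extended
LastWord PROC
  movzx eax, word ptr [esi + ecx*2 - 2]   ; scaled index: count*2 bytes, minus one element
  ret
LastWord ENDP

main PROC
  mov  esi, OFFSET array
  mov  ecx, LENGTHOF array    ; element count, computed by the assembler
  call LastWord
  call WriteDec               ; prints EAX as an unsigned decimal
  call Crlf
  exit
main ENDP
END main
```

Because the length arrives in a register, the same function works for any word array, including one whose size is only known at run time.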

Answer 2:

I think your way of reaching the last element, ($-val1)/2, is the most efficient, but @zx485 is right that your teacher might consider it cheating. So, among other solutions, you can walk to the last element with a loop and a pointer register:

INCLUDE Irvine32.inc
.data
  val1 word 1,2,3,4,5,6
  val2 = ($-val1)/2
.code
main PROC
; mov ax, 0
; mov ax, val1[val2]

  mov ecx, val2-1        ;COUNTER FOR LOOP (LENGTH-1).
  mov esi, offset val1   ;ESI POINTS TO FIRST WORD IN ARRAY.
next_word:               ;("repeat" IS A RESERVED WORD IN MASM, SO IT CAN'T BE A LABEL.)
  add esi, 2             ;POINT TO NEXT WORD IN ARRAY.
  loop next_word         ;ECX--, IF ECX != 0 REPEAT.

  movzx eax, word ptr [esi]   ;LAST WORD, ZERO-EXTENDED BECAUSE WriteDec PRINTS EAX.

  Call WriteDec
  Call DumpRegs
  exit
main ENDP
END main

A shorter way is to drop the loop and go straight to the last element by adding a byte offset to the pointer (changing val2 just a little):

INCLUDE Irvine32.inc
.data
  val1 dw 1,2,3,4,5,6
  val2 = ($-val1)-2      ;NOW WE GET LENGTH - 2 BYTES.
.code
main PROC
; mov ax, 0
; mov ax, val1[val2]

  mov esi, offset val1   ;ESI POINTS TO FIRST WORD IN ARRAY.
  add esi, val2          ;ESI POINTS TO THE LAST WORD.
  movzx eax, word ptr [esi]   ;LAST WORD, ZERO-EXTENDED BECAUSE WriteDec PRINTS EAX.

  Call WriteDec
  Call DumpRegs
  exit
main ENDP
END main

And yes, you can join these two lines:

  mov esi, offset val1   ;ESI POINTS TO FIRST WORD IN ARRAY.
  add esi, val2          ;ESI POINTS TO THE LAST WORD.

into one; I separated them only so I could comment each one:

  mov esi, offset val1 + val2

If you cannot use val2 = ($-val1)/2, one option would be to choose some terminating character for the array, for example, '$', and loop until it's found:

INCLUDE Irvine32.inc
.data
  val1 word 1,2,3,4,5,6,'$'                ;ARRAY WITH TERMINATING CHARACTER.
  ;val2 = ($-val1)/2
.code
main PROC
  ;mov ax, 0
  ;mov ax, val1[val2]

  mov esi, offset val1   ;ESI POINTS TO VAL1.
  mov ax, '$'            ;TERMINATING CHARACTER.
next_word:               ;("repeat" IS A RESERVED WORD IN MASM, SO IT CAN'T BE A LABEL.)
  cmp [ esi ], ax
  je  dollar_found       ;IF [ ESI ] == '$'
  add esi, 2             ;NEXT WORD IN ARRAY.
  jmp next_word

dollar_found:
  sub esi, 2             ;PREVIOUS WORD.
  movzx eax, word ptr [ esi ]   ;FINAL WORD, ZERO-EXTENDED BECAUSE WriteDec PRINTS EAX.

  Call WriteDec
  Call DumpRegs
  exit
main ENDP
END main

Source: https://stackoverflow.com/questions/37073017/x86-assembly-irvine32-get-last-element-of-an-array

Tags

arrays

assembly

x86

irvine32