Interchange Letters From String in Assembly Language 8086

Submitted by 混江龙づ霸主 on 2019-12-06 02:21:17
rkhb

The buffer of Int 21/AH=0Ah has three parts: size, length, string. The size byte is the maximum size of the string and must be initialized by you before the call.

Change

USER_INPUT_STRING     DB 80 DUP('$')

to

USER_INPUT_STRING     DB 80, 0, 80 DUP('$')

Note that the string itself starts at USER_INPUT_STRING + 2; that is where its first character is. After the input, DOS has stored the length of the entered string at USER_INPUT_STRING + 1, in this case 09h. The last character of the entered string is therefore at USER_INPUT_STRING + 2 + (9 - 1). Use registers to swap the values at those two memory addresses:

.MODEL SMALL
.STACK 100H
.DATA
    INPUT_STRING          DB 13,10,"Enter string: $"
    USER_INPUT_STRING     DB 80, 0, 80 DUP('$')
    BREAKLINE             DB 13, 10, "$"
.CODE
    MOV AX, @DATA
    MOV DS, AX

    LEA DX,INPUT_STRING
    MOV AH,09H
    INT 21H

    LEA DX, USER_INPUT_STRING
    MOV AH, 0AH
    INT 21H

    LEA DX, BREAKLINE
    MOV AH, 09H
    INT 21H

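    MOV AL, USER_INPUT_STRING + 2           ; AL = first character
    XOR CX, CX
    MOV CL, USER_INPUT_STRING + 1           ; CX = length of the entered string
    MOV BX, OFFSET USER_INPUT_STRING + 2
    ADD BX, CX
    DEC BX                                  ; BX -> last character
    MOV AH, [BX]                            ; AH = last character
    MOV [BX], AL                            ; last position  := first character
    MOV USER_INPUT_STRING + 2, AH           ; first position := last character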

    LEA DX, USER_INPUT_STRING + 2
    MOV AH, 09H
    INT 21H

    LEA DX, BREAKLINE
    MOV AH, 09H
    INT 21H

    MOV AX, 4C00H
    INT 21H
END

The only way I see to avoid the square brackets is to use LODSB and STOSB:

.MODEL SMALL
.STACK 100H

.DATA
    INPUT_STRING          DB 13,10,"Enter string: $"
    USER_INPUT_STRING     DB 80, 0, 80 DUP('$')
    BREAKLINE             DB 13, 10, "$"

.CODE
main PROC
    MOV AX, @DATA
    MOV DS, AX
    MOV ES, AX

    LEA DX,INPUT_STRING
    MOV AH,09H
    INT 21H

    LEA DX, USER_INPUT_STRING
    MOV AH, 0AH
    INT 21H

    LEA DX, BREAKLINE
    MOV AH, 09H
    INT 21H

    CALL swap

    LEA DX, USER_INPUT_STRING + 2
    MOV AH, 09H
    INT 21H

    LEA DX, BREAKLINE
    MOV AH, 09H
    INT 21H

    MOV AX, 4C00H
    INT 21H
main ENDP

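swap PROC
    LEA DI, USER_INPUT_STRING + 2   ; DI -> first character
    MOV AL, USER_INPUT_STRING + 1   ; AL = length
    MOV AH, 0
    SUB AL, 1                       ; AX = length - 1
    ADD DI, AX                      ; DI -> last character
    MOV SI, DI
    LODSB                           ; AL = last character (from DS:SI)
    MOV AH, USER_INPUT_STRING + 2   ; AH = first character
    XCHG AL, AH                     ; AL = first, AH = last
    STOSB                           ; store first character at ES:DI (last position)
    MOV USER_INPUT_STRING + 2, AH   ; store last character at the first position
    RET
swap ENDP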

END main

In EMU8086 and in TASM (not in MASM) you can also use a special form of address arithmetic that includes a register: USER_INPUT_STRING + 2 + BX - 1:

.MODEL SMALL
.STACK 100H

.DATA
    INPUT_STRING          DB 13,10,"Enter string: $"
    USER_INPUT_STRING     DB 80, 0, 80 DUP('$')
    BREAKLINE             DB 13, 10, "$"

.CODE
main PROC
    MOV AX, @DATA
    MOV DS, AX
    MOV ES, AX

    LEA DX,INPUT_STRING
    MOV AH, 09H
    INT 21H

    LEA DX, USER_INPUT_STRING
    MOV AH, 0AH
    INT 21H

    LEA DX, BREAKLINE
    MOV AH, 09H
    INT 21H

    CALL swap

    LEA DX, USER_INPUT_STRING + 2
    MOV AH, 09H
    INT 21H

    LEA DX, BREAKLINE
    MOV AH, 09H
    INT 21H

    MOV AX, 4C00H
    INT 21H
main ENDP

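swap PROC
    MOV AH, USER_INPUT_STRING + 2             ; AH = first character
    MOV BL, USER_INPUT_STRING + 1
    MOV BH, 0                                 ; BX = length
    MOV AL, USER_INPUT_STRING + 2 + BX - 1    ; AL = last character
    MOV USER_INPUT_STRING + 2, AL             ; first position := last character
    MOV USER_INPUT_STRING + 2 + BX - 1, AH    ; last position  := first character
    RET
swap ENDP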

END main

All programs change the content of the string. To undo this, you have to call swap a second time. It is up to you to incorporate the second part.
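
As a rough sketch of that second call, the first program could be rearranged like this. The print_line helper, the swap_done label and the JCXZ guard for an empty input are my own additions for illustration, not part of the programs above. It should print the swapped string first and then the original one again:

```asm
.MODEL SMALL
.STACK 100H

.DATA
    INPUT_STRING          DB 13,10,"Enter string: $"
    USER_INPUT_STRING     DB 80, 0, 80 DUP('$')
    BREAKLINE             DB 13, 10, "$"

.CODE
main PROC
    MOV AX, @DATA
    MOV DS, AX

    LEA DX, INPUT_STRING
    MOV AH, 09H
    INT 21H

    LEA DX, USER_INPUT_STRING
    MOV AH, 0AH
    INT 21H

    LEA DX, BREAKLINE
    MOV AH, 09H
    INT 21H

    CALL swap                       ; exchange first and last character
    CALL print_line
    CALL swap                       ; exchange them again: original string
    CALL print_line

    MOV AX, 4C00H
    INT 21H
main ENDP

print_line PROC                     ; hypothetical helper: print input, then CR LF
    LEA DX, USER_INPUT_STRING + 2
    MOV AH, 09H
    INT 21H
    LEA DX, BREAKLINE
    MOV AH, 09H
    INT 21H
    RET
print_line ENDP

swap PROC
    XOR CX, CX
    MOV CL, USER_INPUT_STRING + 1   ; CX = length of the entered string
    JCXZ swap_done                  ; assumption: do nothing for an empty input
    MOV BX, OFFSET USER_INPUT_STRING + 2
    ADD BX, CX
    DEC BX                          ; BX -> last character
    MOV AL, USER_INPUT_STRING + 2   ; AL = first character
    MOV AH, [BX]                    ; AH = last character
    MOV [BX], AL
    MOV USER_INPUT_STRING + 2, AH
swap_done:
    RET
swap ENDP

END main
```

Because swapping the same two positions twice is its own inverse, no copy of the original string is needed.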

Just for fun, here's an optimized version inspired by rkhb's answer. Somewhere between this and rkhb's would be a simpler version that uses fewer instructions without being hard to follow.


DOS int 21h / AH=0Ah takes a pointer to a struct, not just a flat buffer. The first 2 bytes are a buffer-size and length. (The DOS function stores the length instead of returning it in AL for some reason). Documentation: http://spike.scu.edu.au/~barry/interrupts.html#dosbuf

You should make your buffer larger than the max length you specify so it's still $-terminated after a max-length input. Apparently a CR is left in the buffer after the user input, but not CR LF, and I don't know whether that's guaranteed when the input hits the max length instead of ending with the user pressing Return. The input size left in the buffer by DOS excludes the CR. Your version using the pre-filled buffer of $ will end up printing CR CR LF because it uses the terminator, not the length, but that's probably not a problem for output to the screen.

You can make it another 2 bytes larger than that so you have room to append a CR LF yourself, instead of needing to print that with a separate call. Since you get a length that doesn't include the CR, it's easy to overwrite it and leave just CR LF after the user input, before the first $.


Optimizations:

First of all, you could make this a .com executable so all your segment registers are already set appropriately; use .model tiny. (Except then you're pretty much guaranteed to get self-modifying-code pipeline stalls from having the data right next to the code.)

You also don't need LEA to put static addresses in registers. mov dx, OFFSET INPUT_STRING is 1 byte shorter, because LEA needs a ModRM byte plus a 16-bit displacement: there's no [disp8] addressing mode without a register.
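
For comparison, these are the two encodings as I understand them for 8086 (lo and hi stand for the two bytes of the 16-bit offset, which depends on where the assembler and linker place INPUT_STRING):

```asm
    LEA DX, INPUT_STRING          ; 8D 16 lo hi : opcode, ModRM, disp16 = 4 bytes
    MOV DX, OFFSET INPUT_STRING   ; BA lo hi    : opcode, imm16         = 3 bytes
```

Both leave the same offset in DX; LEA only pays off when the address involves a register.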


After restoring the first character back to lower case with either ADD [mem], 20h or OR [mem], 20h, you're back to your original situation. (Assuming that the input character was in fact lower case, not originally upper case.)
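
A minimal sketch of that round trip, assuming SI points at the first character and that it is an ASCII letter (0DFh is just 20h with all bits inverted):

```asm
    ; assumption: SI -> first character of the input, an ASCII letter
    and  byte ptr [si], 0DFh    ; clear bit 5 (20h): 'a' (61h) -> 'A' (41h)
    or   byte ptr [si], 20h     ; set bit 5 again:   'A' (41h) -> 'a' (61h)
```

OR is the safer of the two ways back: ADD 20h only gives the right result if the byte really is upper case at that point, while OR leaves an already lower-case letter alone.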


You want to load the length so you know where the last character is. You're stuck with 8086 so you can't just use movzx cx, byte ptr [bx] to zero-extend it into a 16-bit register; you have to zero a 16-bit register then merge a byte into the low half. (Or other possibilities).
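
Two ways to do that zero-extension with 8086-only instructions, as a sketch that assumes BX points at the length byte (input buffer + 1):

```asm
    ; assumption: BX -> length byte of the DOS input buffer
    xor  cx, cx                 ; CX = 0
    mov  cl, [bx]               ; CL = length, CH is still 0, so CX = length

    ; alternative: load first, then clear the high half
    mov  cl, [bx]
    mov  ch, 0                  ; or: xor ch, ch
```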

You're also going to want the pointer in a register at some point, so you might as well do that early (in SI or DI or BX) so you can use more compact addressing modes, or even use mov dx, bx instead of mov dx, OFFSET USER_INPUT_STRING. Although with addresses only being 16-bit, it's not worth spending extra instructions on that.

Here's the interesting part; starting with reading user input.

.DATA
max_user_len = 80           ; assemble time constant, not stored in memory by this line

    ; +3 extra $ chars means we can append a CRLF and *still* have it $-terminated
    ; after a max-length user input
    input_buf             DB max_user_len, 0,  max_user_len+3 DUP('$')
 ; notice that the end of the buffer is far below 256 bytes into the .data segment
 ; which makes address math with 8-bit registers safe.
 ; this is a hack which can break if you link more things together and have a bigger data segment.

    prompt_string         DB 13,10,"Enter string: $"
    BREAKLINE             DB 13, 10, "$"

.CODE
 main:

    ; ... prompt and stuff same as before
    ; then the interesting part

    mov DX,  OFFSET input_buf
    MOV AH, 0AH                  ; DOS buffered input
    INT 21H

;;; you may need to print a CR LF here, according to Michael Petch's comment
;;; pressing enter doesn't echo the newline

  ;; load from the input
    mov    si,  OFFSET input_buf + 2   ; pointer to first data char
    mov    cx, [si-1]                    ; CL = length,  CH=original first char

  ;; append a CR LF to the end of the buffer.
    mov    bx, si
    add    bl, cl                ; HACK: the whole buffer is in the first 256 bytes of the segment and thus we don't need carry propagation into the high half
     ; BX points to one past the end user input.
    mov  word ptr [bx], 0A0Dh    ; append CR LF = 0D 0A = little-endian 0x0A0D.  Still $-terminated because we have extra padding.

  ;; first output, including a CR LF
                                 ; SI still points at the first char
    and   byte ptr [si], 0DFh    ; clear the ASCII-lowercase bit (0DFh = NOT 20h)
    mov    dx, si
    mov    ah, 09h
    int    21h                   ; DOS print $-terminated string: printing just the text.

  ;; swap first and last
  ;; We still have the original first char already loaded (CH)
  ;; just need to load the last char and then store both to opposite places.

    mov    al, [bx-1]            ; last char before CR LF $
    mov    [bx-1], ch            ; replace it with orig first char
    mov    [si], al              ; store last char

    ; DX = output buffer, AH = 09h  from last time
    ; second output
    int   21h 

    ; exit
    mov   ax, 4c00H
    int   21h

    ret

Untested, there might be an off-by-1 error in there somewhere.

This is mostly "optimized" for 8086, where code size is what matters most. Otherwise I might copy CH, modify the copy in a register and store that, instead of using an AND with a memory destination, which has to reload that byte from cache/memory again.
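
That register-only variant could look roughly like this. It is a sketch under the same assumptions as the code above: CH holds the original first character, SI points at it, and AL is free at that point:

```asm
    ; assumption: CH = original first character, SI -> first character
    mov  al, ch                 ; copy the first character
    and  al, 0DFh               ; clear the lower-case bit in the register copy
    mov  [si], al               ; plain store, no read-modify-write on memory
```

It costs one more instruction than the memory-destination AND, which is why it loses when code size is the priority.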

xchg [bx-1], ch would be even smaller, but the implicit lock prefix can make it slower.
