Question
I have a very simple problem to solve. First you enter a string. The first output should swap the first and last letters of the string: the last letter replaces the first letter, and the first letter replaces the last. The second output should capitalize the first letter of the string. I've already done the second output; my problem now is the first output. Please see the expected result below.
Expected Result
Enter string: jon jones
son jonej
Jon jones
Current Code
.MODEL SMALL
.STACK 100H
.DATA
INPUT_STRING DB 10,13,"Enter string: $"
USER_INPUT_STRING DB 80 DUP('$')
BREAKLINE DB 10, 13, "$"
.CODE
MOV AX, @DATA
MOV DS, AX
LEA DX,INPUT_STRING
MOV AH,09H
INT 21H
LEA DX, USER_INPUT_STRING
MOV AH, 0AH
INT 21H
LEA DX, BREAKLINE
MOV AH, 09H
INT 21H
SUB USER_INPUT_STRING + 2, 32 ;Capitalize
MOV AH, 02H
INT 21H
LEA DX, BREAKLINE
MOV AH, 09H
INT 21H
LEA DX, USER_INPUT_STRING + 2 ;Output of capitalize
MOV AH, 09H
INT 21H
LEA DX, BREAKLINE
MOV AH, 09H
INT 21H
MOV AH, 4CH
INT 21H
END
Allowed commands
mov, lea, int, inc, dec, add, sub, proc, re, db
Answer 1:
The buffer of Int 21/AH=0Ah has three parts: size, length, string. The size is the maximal size of the string and must be initialized.
Change
USER_INPUT_STRING DB 80 DUP('$')
to
USER_INPUT_STRING DB 80, 0, 80 DUP('$')
Note that the string itself starts at USER_INPUT_STRING + 2; that is where its first character is. After you have entered the string, you will find its length at USER_INPUT_STRING + 1, in this case 09h. So the last character of the entered string is at USER_INPUT_STRING + 2 + (9 - 1). Use registers to swap the values at those memory addresses:
.MODEL SMALL
.STACK 100H
.DATA
INPUT_STRING DB 13,10,"Enter string: $"
USER_INPUT_STRING DB 80, 0, 80 DUP('$')
BREAKLINE DB 13, 10, "$"
.CODE
MOV AX, @DATA
MOV DS, AX
LEA DX,INPUT_STRING
MOV AH,09H
INT 21H
LEA DX, USER_INPUT_STRING
MOV AH, 0AH
INT 21H
LEA DX, BREAKLINE
MOV AH, 09H
INT 21H
MOV AL, USER_INPUT_STRING + 2
XOR CX, CX
MOV CL, USER_INPUT_STRING + 1
MOV BX, OFFSET USER_INPUT_STRING + 2
ADD BX, CX
DEC BX
MOV AH, [BX]
MOV [BX], AL
MOV USER_INPUT_STRING + 2, AH
LEA DX, USER_INPUT_STRING + 2
MOV AH, 09H
INT 21H
LEA DX, BREAKLINE
MOV AH, 09H
INT 21H
MOV AX, 4C00H
INT 21H
END
The only way I see to avoid the square brackets is to use LODSB and STOSB (STOSB writes through ES:DI, which is why the program also loads ES):
.MODEL SMALL
.STACK 100H
.DATA
INPUT_STRING DB 13,10,"Enter string: $"
USER_INPUT_STRING DB 80, 0, 80 DUP('$')
BREAKLINE DB 13, 10, "$"
.CODE
main PROC
MOV AX, @DATA
MOV DS, AX
MOV ES, AX
LEA DX,INPUT_STRING
MOV AH,09H
INT 21H
LEA DX, USER_INPUT_STRING
MOV AH, 0AH
INT 21H
LEA DX, BREAKLINE
MOV AH, 09H
INT 21H
CALL swap
LEA DX, USER_INPUT_STRING + 2
MOV AH, 09H
INT 21H
LEA DX, BREAKLINE
MOV AH, 09H
INT 21H
MOV AX, 4C00H
INT 21H
main ENDP
swap PROC
LEA DI, USER_INPUT_STRING + 2
MOV AL, USER_INPUT_STRING + 1
MOV AH, 0
SUB AL, 1
ADD DI, AX
MOV SI, DI
LODSB
MOV AH, USER_INPUT_STRING + 2
XCHG AL, AH
STOSB
MOV USER_INPUT_STRING + 2, AH
RET
swap ENDP
END main
In EMU8086 and in TASM (not in MASM) you can also use the special preprocessor arithmetic USER_INPUT_STRING + 2 + BX - 1:
.MODEL SMALL
.STACK 100H
.DATA
INPUT_STRING DB 13,10,"Enter string: $"
USER_INPUT_STRING DB 80, 0, 80 DUP('$')
BREAKLINE DB 13, 10, "$"
.CODE
main PROC
MOV AX, @DATA
MOV DS, AX
MOV ES, AX
LEA DX,INPUT_STRING
MOV AH, 09H
INT 21H
LEA DX, USER_INPUT_STRING
MOV AH, 0AH
INT 21H
LEA DX, BREAKLINE
MOV AH, 09H
INT 21H
CALL swap
LEA DX, USER_INPUT_STRING + 2
MOV AH, 09H
INT 21H
LEA DX, BREAKLINE
MOV AH, 09H
INT 21H
MOV AX, 4C00H
INT 21H
main ENDP
swap PROC
MOV AH, USER_INPUT_STRING + 2
MOV BL, USER_INPUT_STRING + 1
MOV BH, 0
MOV AL, USER_INPUT_STRING + 2 + BX - 1
MOV USER_INPUT_STRING + 2, AL
MOV USER_INPUT_STRING + 2 + BX - 1, AH
RET
swap ENDP
END main
All programs change the content of the string. To undo this, you have to call swap a second time. It is up to you to incorporate the second part.
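As a rough, untested sketch of how the second part might be incorporated: swap, print, swap back, then capitalize and print. The print_line procedure is a helper name made up for this illustration. The buffer is padded to 82 bytes here (an assumption, not part of the programs above) so that a $ always remains after the input. The code also assumes that at least one character was typed and that the first character is a lower-case letter.

```asm
.MODEL SMALL
.STACK 100H
.DATA
INPUT_STRING      DB 13,10,"Enter string: $"
USER_INPUT_STRING DB 80, 0, 82 DUP('$')   ; size, length, text (+2 spare '$')
BREAKLINE         DB 13, 10, "$"
.CODE
main PROC
    MOV AX, @DATA
    MOV DS, AX
    LEA DX, INPUT_STRING
    MOV AH, 09H
    INT 21H
    LEA DX, USER_INPUT_STRING
    MOV AH, 0AH                     ; buffered input
    INT 21H
    LEA DX, BREAKLINE
    MOV AH, 09H
    INT 21H

    CALL swap                       ; first <-> last
    CALL print_line                 ; first output, e.g. "son jonej"
    CALL swap                       ; undo the swap
    SUB USER_INPUT_STRING + 2, 32   ; capitalize (assumes a lower-case letter)
    CALL print_line                 ; second output, e.g. "Jon jones"

    MOV AX, 4C00H
    INT 21H
main ENDP

; Exchange the first and last characters of the entered string.
; Assumes the stored length is at least 1. Clobbers AX, BX, CX.
swap PROC
    MOV AL, USER_INPUT_STRING + 2   ; AL = first character
    MOV CL, USER_INPUT_STRING + 1   ; CL = length stored by DOS
    MOV CH, 0                       ; CX = length, zero-extended
    MOV BX, OFFSET USER_INPUT_STRING + 2
    ADD BX, CX
    DEC BX                          ; BX -> last character
    MOV AH, [BX]                    ; AH = last character
    MOV [BX], AL                    ; last position  <- first character
    MOV USER_INPUT_STRING + 2, AH   ; first position <- last character
    RET
swap ENDP

; Hypothetical helper: print the entered text, then a line break.
print_line PROC
    LEA DX, USER_INPUT_STRING + 2
    MOV AH, 09H
    INT 21H
    LEA DX, BREAKLINE
    MOV AH, 09H
    INT 21H
    RET
print_line ENDP
END main
```

Calling swap twice restores the original text, so the capitalized line is produced from the unmodified input.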
Answer 2:
Just for fun, here's an optimized version inspired by rkhb's answer. Somewhere between this and rkhb's would be a simpler version with fewer instructions but without making it hard to follow.
DOS int 21h / AH=0Ah takes a pointer to a struct, not just a flat buffer. The first 2 bytes are a buffer size and a length. (The DOS function stores the length there instead of returning it in AL, for some reason.) Documentation: http://spike.scu.edu.au/~barry/interrupts.html#dosbuf
You should make your buffer larger than the max length you specify so it's still $-terminated after a max-length input. Apparently a CR is left in the buffer after the user input, but not CR LF, and I don't know whether that is guaranteed when the input hits the max length instead of ending with the user pressing Return. The input size left in the buffer by DOS excludes the CR. Your version using the pre-filled buffer of $ will end up printing CR CR LF because it uses the terminator, not the length, but that's probably not a problem for output to the screen.
You can make it another 2 bytes larger than that so you have room to append a CR LF yourself, instead of needing to print that with a separate call. Since you get a length that doesn't include the CR, it's easy to overwrite it and leave just CR LF after the user input, before the first $.
Optimizations:
First of all, you could make this a .com executable so all your segment registers are already set appropriately; use .model tiny. (Except then you're pretty much guaranteed to get self-modifying-code pipeline stalls from having the data right next to the code.)
You also don't need LEA to put static addresses in registers. mov dx, OFFSET INPUT_STRING is 1 byte shorter. There's no [disp8] addressing mode with no register.
After restoring the first character back to lower-case with add [mem], 20h or or [mem], 20h, you're back to your original situation. (Assuming that the input character was in fact lower case, not originally upper case.)
You want to load the length so you know where the last character is. You're stuck with 8086 so you can't just use movzx cx, byte ptr [bx] to zero-extend it into a 16-bit register; you have to zero a 16-bit register then merge a byte into the low half. (Or other possibilities.)
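For illustration, two common 8086 ways to do that zero-extension might look like the fragment below. It is a sketch only and assumes BX already points at the length byte that DOS stored (input buffer + 1); the choice of CX and DX as destinations is arbitrary.

```asm
; Assumption: BX points at the length byte (buffer + 1).

; Idiom 1: zero the 16-bit register first, then merge the byte into the low half.
        xor  cx, cx             ; CX = 0
        mov  cl, [bx]           ; CX = length (0..255)

; Idiom 2: load the low half, then clear the high half.
        mov  dl, [bx]           ; DL = length
        mov  dh, 0              ; DX = length (0..255)
```

Either way the 16-bit register can then be added to a pointer without garbage in the high byte.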
You're also going to want the pointer in a register at some point, so you might as well do that early (in SI or DI or BX) so you can use more compact addressing modes, or even use mov dx, bx instead of mov dx, OFFSET USER_INPUT_STRING. Although with addresses only being 16-bit, it's not worth spending extra instructions on that.
Here's the interesting part; starting with reading user input.
.DATA
max_user_len = 80 ; assemble time constant, not stored in memory by this line
; +3 extra $ chars means we can append a CRLF and *still* have it $-terminated
; after a max-length user input
input_buf DB max_user_len, 0, max_user_len+3 DUP('$')
; notice that the end of the buffer is far below 256 bytes into the .data segment
; which makes address math with 8-bit registers safe.
; this is a hack which can break if you link more things together and have a bigger data segment.
prompt_string DB 13,10,"Enter string: $"
BREAKLINE DB 13, 10, "$"
.CODE
main:
; ... prompt and stuff same as before
; then the interesting part
mov DX, OFFSET input_buf
MOV AH, 0AH ; DOS buffered input
INT 21H
;;; you may need to print a CR LF here, according to Michael Petch's comment
;;; pressing enter doesn't echo the newline
;; load from the input
mov si, OFFSET input_buf + 2 ; pointer to first data char
mov cx, [si-1] ; CL = length, CH=original first char
;; append a CR LF to the end of the buffer.
mov bx, si
add bl, cl ; HACK: the whole buffer is in the first 256 bytes of the segment and thus we don't need carry propagation into the high half
; BX points to one past the end user input.
mov word ptr [bx], 0A0Dh ; append CR LF = 0D 0A = little-endian 0x0A0D. Still $-terminated because we have extra padding.
;; first output, including a CR LF
; SI still points at the first char
and byte ptr [si], 0DFh ; clear the ASCII-lowercase bit (0DFh = NOT 20h; "~" is not MASM/TASM syntax)
mov dx, si
mov ah, 09h
int 21h ; DOS print $-terminated string: the text plus the CR LF we appended.
;; swap first and last
;; We still have the original first char already loaded (CH)
;; just need to load the last char and then store both to opposite places.
mov al, [bx-1] ; last char before CR LF $
mov [bx-1], ch ; replace it with orig first char
mov [si], al ; store last char
; DX still points at the text, AH is still 09h from last time
; second output
int 21h
; exit
mov ax, 4c00H
int 21h
ret
Untested, there might be an off-by-1 error in there somewhere.
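If the data segment could grow past 256 bytes, the add bl, cl hack in the listing above would no longer be safe. A possible carry-safe replacement for those two instructions is sketched below. It is equally untested and assumes the same input_buf layout and the same SI and CX setup as the listing.

```asm
; Assumes: input_buf DB max_user_len, 0, max_user_len+3 DUP('$')
        mov  si, OFFSET input_buf + 2   ; pointer to first data char
        mov  cx, [si-1]                 ; CL = length, CH = original first char
        mov  bl, cl
        mov  bh, 0                      ; BX = length, zero-extended
        add  bx, si                     ; full 16-bit add: BX = one past the user input
        mov  word ptr [bx], 0A0Dh       ; overwrite the CR with CR LF
```

This costs one extra instruction compared with the hack, but it no longer depends on where the buffer ends up in the segment.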
This is mostly "optimized" for 8086, where code size matters most. Otherwise I might copy and modify CH and store that, instead of using a memory-destination AND instruction, which has to load that byte from cache/memory again. xchg [bx-1], ch would be even smaller, but the implicit lock prefix can make it slower.
Source: https://stackoverflow.com/questions/57643126/interchange-letters-from-string-in-assembly-language-8086