LDR and EQU in ARM Assembly

Posted by 社会主义新天地 on 2021-01-07 02:33:20

Question


This is my assembly code.

a EQU 0x20000000 
b EQU 0x20000004 
c EQU 0x20000008 
LDR R4, =a 
LDR R0, [R4] 
LDR R4, =b 
LDR R1, [R4] 
LDR R4, =c 

I have two questions. First, after LDR R0, [R4], what goes inside R[4]: 0x20000000 or the contents of the memory at 0x20000000?
Second, after the last line, what goes inside R4: c or the contents of memory at 0x20000008?

I've searched the internet about EQU and LDR. My own guess is that for both questions the answer is the contents of the memory, but I'm confused because my friend told me no, the hex numbers I mentioned above are what get stored in R[4].


Update: Sorry, I meant to ask: after LDR R0, [R4], what goes inside R0? 0x20000000 or the contents of the memory at 0x20000000?

And after the last line, what goes inside R4? 0x20000008 or the contents of memory at 0x20000008?


Answer 1:


First off, there is no need to search the whole internet. Go right to the source: the ARM documentation covers the instruction set. Based on the instructions this is either ARM or Thumb code but not AArch64, so assuming ARM...

The next thing is that EQU is a directive and not an instruction, which should be fairly obvious: a processor has no notion of named constants, so there is no instruction for defining one. The assembler handles EQU itself and no machine code is generated for it.

The next thing is that assembly language is defined by the assembler (the tool you are using), not the target (the instruction set). This looks like it is possibly one of ARM's assemblers. Many folks use the GNU tools; either will work. I have GNU's, so the assembly language changes to this:

.EQU a, 0x20000000
.EQU b, 0x20000004
.EQU c, 0x20000008
LDR R4, =a
LDR R0, [R4]
LDR R4, =b
LDR R1, [R4]
LDR R4, =c

That is the GNU assembler version. I assemble and then disassemble it and get this:

arm-none-eabi-as so.s -o so.o
arm-none-eabi-objdump -D so.o

so.o:     file format elf32-littlearm


Disassembly of section .text:

00000000 <.text>:
   0:   e3a04202    mov r4, #536870912  ; 0x20000000
   4:   e5940000    ldr r0, [r4]
   8:   e3a04242    mov r4, #536870916  ; 0x20000004
   c:   e5941000    ldr r1, [r4]
  10:   e3a04282    mov r4, #536870920  ; 0x20000008

The =a syntax is a pseudo-instruction, not a real instruction, and as a result it varies by assembler. As pointed out elsewhere on this page, support for that feature is tool dependent and some tools handle it differently than others. The GNU assembler seems the most feature rich.

Let me digress. The registers are 32 bits and the instructions are 32 bits, so it is not possible to have a load-immediate instruction with a 32-bit immediate in a 32-bit (fixed-length) instruction; there would be no bits left over for the opcode and the register. Different fixed-length instruction sets solve this in different ways. ARM has its own solution, which is interesting and which varies between the ARM, Thumb and Thumb-2 encodings: for ARM, the immediate is an 8-bit value that can be rotated by an even number of bits. The GNU assembler will do its best to pick mov or mvn if it can; otherwise it creates a pc-relative load and places the value in a nearby pool.

In your case these constants all worked fine, so mov was substituted for the ldr = pseudo-instruction.

   0:   e3a04202    mov r4, #536870912  ; 0x20000000
   4:   e5940000    ldr r0, [r4]
   8:   e3a04242    mov r4, #536870916  ; 0x20000004
   c:   e5941000    ldr r1, [r4]
  10:   e3a04282    mov r4, #536870920  ; 0x20000008
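
The rotated-immediate rule described above can be checked mechanically. The following Python sketch is only a model of the ARM-state rule (an 8-bit value rotated right by an even amount); the function names are made up for illustration, and this is not how any particular assembler is actually implemented.

```python
MASK32 = 0xFFFFFFFF

def rotate_left32(value, amount):
    """Rotate a 32-bit value left by `amount` bits."""
    amount %= 32
    return ((value << amount) | (value >> (32 - amount))) & MASK32

def encode_arm_immediate(value):
    """Return (imm8, rotation) if value equals imm8 rotated right by an
    even rotation, else None. Models the ARM-state immediate operand."""
    value &= MASK32
    for rotation in range(0, 32, 2):
        # Undo a right-rotation by rotating left and see if 8 bits remain.
        imm8 = rotate_left32(value, rotation)
        if imm8 <= 0xFF:
            return imm8, rotation
    return None

def pick_load(value):
    """Hypothetical helper: which form could 'LDR Rd, =value' take?"""
    if encode_arm_immediate(value) is not None:
        return "mov"
    if encode_arm_immediate(~value & MASK32) is not None:
        return "mvn"
    return "ldr from literal pool"

# The three constants from the question, plus two that do not fit a mov.
for constant in (0x20000000, 0x20000004, 0x20000008, 0x20000200, 0xFFFFFF20):
    print(hex(constant), pick_load(constant))
```

For the three constants in the question the sketch finds imm8 values 0x02, 0x42 and 0x82 with a rotation of 4, which lines up with the low 12 bits (0x202, 0x242, 0x282) of the mov encodings in the disassembly, where the rotation field holds the rotation divided by two.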

As you should have already read in the ARM documentation (you simply cannot start learning or reading assembly without the documentation from the processor vendor handy), ldr r0,[r4] means: take the bits in r4 and use them as an address (the prior instruction placed 0x20000000 in r4, so for this instruction 0x20000000 becomes an address), read (load) from that address, and place the result (the bits that come back) in r0.

For demonstration purposes

.EQU a, 0x20000004
.EQU b, 0x20000200
.EQU c, 0x20000008
LDR R4, =a
LDR R0, [R4]
LDR R4, =b
LDR R1, [R4]
LDR R4, =c

gives

00000000 <.text>:
   0:   e3a04242    mov r4, #536870916  ; 0x20000004
   4:   e5940000    ldr r0, [r4]
   8:   e59f4004    ldr r4, [pc, #4]    ; 14 <a-0x1ffffff0>
   c:   e5941000    ldr r1, [r4]
  10:   e3a04282    mov r4, #536870920  ; 0x20000008
  14:   20000200

The 0x20000004 can be encoded: you can rotate 0x00000042 by an even number of bits, and that looks like what they did. But 0x20000200 cannot be created under the rules for the mov or mvn instructions, so a pc-relative load was used with the value 0x20000200 placed nearby. Since this is not complete code, the processor would slam into that data as if it were an instruction and keep going until it ultimately crashed or got lucky and got stuck in a loop. For real code that pool would be placed after an unconditional branch of some sort and/or wherever you tell the assembler to put it, using assembler-specific directives.
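
To see how the disassembler arrives at address 0x14 for ldr r4, [pc, #4]: in ARM state, reading the pc yields the address of the current instruction plus 8. A small Python sketch of that arithmetic (the function name is just illustrative):

```python
def literal_address(instruction_address, offset):
    """Address accessed by 'ldr rX, [pc, #offset]' in ARM state,
    where pc reads as the instruction's own address plus 8."""
    return instruction_address + 8 + offset

# ldr r4, [pc, #4] sits at offset 0x8 in the listing above.
print(hex(literal_address(0x8, 4)))  # 0x14
```

That is the address where the word 20000200 sits in the listing, right after the last instruction.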

Technically an ARM assembler's assembly language does not have to support the ldr r0,[r4] syntax. The assembler authors are free to do whatever they want: ldr r0,(r4), ldr [r4],r0, loadw r0,0(r4), bob pickle,[onion]. So long as it generates the right machine code, it is an assembly language that generates ARM instructions and thus an ARM assembly language. So far all the assemblers I have used support the ldr r0,[r4] syntax in that order, but I don't think all of them support the =address thing, and not all support it in the same way, based on experience here at SO with questions and other folks' posts.

You can do the load yourself, but you also get into assembly language specific differences:

.EQU a, 0x20000000
.EQU b, 0x20000000
.EQU c, 0xFFFFFF20
LDR R4, avalue
LDR R0, [R4]
LDR R4, =b
LDR R1, [R4]
LDR R4, =c
avalue: .word a

Disassembly of section .text:

00000000 <avalue-0x14>:
   0:   e59f400c    ldr r4, [pc, #12]   ; 14 <avalue>
   4:   e5940000    ldr r0, [r4]
   8:   e3a04202    mov r4, #536870912  ; 0x20000000
   c:   e5941000    ldr r1, [r4]
  10:   e3e040df    mvn r4, #223    ; 0xdf

00000014 <avalue>:
  14:   20000000

And this forced the pc-relative load because that is what I asked for. Within the ARM targets some assembly languages want colons to mark labels and others don't. You can also see the EQU differences (EQU is similar to a simple #define in C, #define a 0x20000000, but unlike C you don't use it for more complicated macros; there may be an assembler-specific portion of the language for macros). Most sane assemblers use a semicolon to mark a comment; the GNU assembler for ARM doesn't, it uses @ instead. Note that the GNU assembler's assembly languages don't all have to have common rules or features across different targets. Assume each target was written and maintained by different folks and as a result each language is different with respect to simple things like comments and other non-instruction stuff. If they overlap, so be it.

This is unlinked, so the disassembly starts at 0 for this tool. Once linked, this disassembly output would have different addresses, but being position independent (the pc-relative load is, well, pc relative) the machine code would remain as is (for these examples).


EQU is a directive similar to #define in C. In this case it lets you take a value, 0x20000000, and instead of typing that, type the name a; the assembler substitutes 0x20000000 wherever a is used as a symbol (it is not going to replace the a in add r0,r1,r2, of course).

LDR is a family of instructions, LoaD Register. An address located in a register, possible offsets, and a destination register are provided. LDR R0,[R4] means r4 contains the address (0x20000000 in your case) and r0 is the destination register that holds the result of the load.

ARM makes cores, not chips. There is no reason to assume from what you have provided that 0x20000000 is "memory"; it could be a peripheral. The processor cares not: it is just an address it puts on the bus. The chip vendor knows what that address means, and the chip returns a value on the bus. The processor has no idea whether it was memory or something else.

As written, assuming this is not some strange assembler that mangles things, the first ldr does a load from address 0x20000000 and puts the result in r0. The last line, LDR R4, =c, puts the constant 0x20000008 itself in r4; nothing is read from that address.
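
To make the difference between the two forms of LDR concrete, here is a toy Python trace of the five instructions from the question. It is only a sketch: the memory contents are invented placeholder values (the real ones depend on the chip), and the helper names are hypothetical.

```python
# Hypothetical contents at the three addresses; real values depend on the chip.
memory = {0x20000000: 0x11111111, 0x20000004: 0x22222222, 0x20000008: 0x33333333}
registers = {"R0": 0, "R1": 0, "R4": 0}

a, b, c = 0x20000000, 0x20000004, 0x20000008  # the EQU constants

def load_constant(reg, value):
    """Model of 'LDR reg, =value': the register receives the constant itself."""
    registers[reg] = value

def load_from(reg, address_reg):
    """Model of 'LDR reg, [address_reg]': read the word at the address
    currently held in address_reg."""
    registers[reg] = memory[registers[address_reg]]

load_constant("R4", a)   # LDR R4, =a
load_from("R0", "R4")    # LDR R0, [R4]
load_constant("R4", b)   # LDR R4, =b
load_from("R1", "R4")    # LDR R1, [R4]
load_constant("R4", c)   # LDR R4, =c

for name, value in registers.items():
    print(name, hex(value))
```

At the end R0 and R1 hold whatever was stored at 0x20000000 and 0x20000004, while R4 holds the number 0x20000008 and not the word stored there.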




Answer 2:


The first three lines define addresses. These could be stored in a local pool in memory, but because of how simple they are, the assembler might use a couple of MOV instructions to build them for you upon first use.

The 4th line sets R4 to the value of the symbol a, 0x20000000 (probably via a MOV with a shift, but it may load from a memory pool if the assembler wanted to for some reason).

The 5th line reads from memory at 0x20000000 and places the value read from memory into R0.

The 6th line loads R4 with 0x20000004 (probably a single MOV, since the value fits a rotated 8-bit immediate).

The 7th line reads from memory at 0x20000004 and places the value into R1.

The last line loads R4 with 0x20000008 (again probably a single MOV).

In short, the LDR R4, = instructions only load the register with the addresses defined by the EQU statements. The R0/R1 loads that follow are what load from the addresses they point at.

Now if someone who really understood assembly were writing this, it would probably become just 2 instructions: a MOV of 0x2 with a shift into R4 and an LDM instruction to handle the loads and increments.



Source: https://stackoverflow.com/questions/62563750/ldr-and-equ-in-arm-assembly
