Question
According to "Storage for Short Term", Chapter 8 in "Assembly Language Step by Step" (3rd Edition):
The stack should be considered a place to stash things for the short term. Items stored on the stack have no names, and in general must be taken off the stack in the reverse order in which they were put on. Last in, first out, remember. LIFO!
However, according to my knowledge, C compilers use the stack for basically everything. Does that mean that the stack is the best way of storing variables, both short term and long term? Or is there a better way?
The alternatives that I can think of are:
- Heap, but that's slow.
- Static variables, but that's going to last the entire lifetime of the program, which might waste lots of memory.
Answer 1:
"C compilers use the stack for basically everything." Well, not really. Some popular instruction sets are stack heavy because they do not, or did not, have a lot of registers, so it is in part the design of the instruction set. A sane compiler design is going to have a calling convention: the rules for passing in parameters and for returning information. Some of those calling conventions, whether or not the ISA has a lot of registers, are stack heavy; others use some registers and then rely on the stack when there are many parameters.
Then you get into what programmers are taught in school, that things like globals are bad. Now you have the habits of stack-heavy programmers. Add to that notions that functions should be small, fit on a printed page in 12 point font or fit on your screen, etc. This creates a ton of functions all passing more and more parameters through many nested functions; sometimes it is a pointer to one structure high up in the nesting, or the same value or variations of it passed over and over again. The result is a massive overuse of the stack: some variables not only live a very long time, there may be dozens or hundreds of copies of that variable due to the depth of the nesting of functions and the use of the stack for passing or storing variables. This has absolutely nothing to do with a particular programming language; it is in part educators' opinions (which in some cases have to do with making it easier to grade papers and not necessarily with making better programs) and habits.
If you have enough registers, you allow their use in the calling convention, and you have an optimizer, you can at times greatly reduce the amount of stack usage. The programmer still gets involved here with their habits and can still cause unnecessary stack consumption, and nesting that cannot be inlined can still cause duplicates of items on the stack, or structures or items that remain on the stack in place for the entire life of the program.
Globals and static locals, which I like to call local globals, are in .data, not on the stack. There are programmers who will create variables or structs at the main() level that are passed on down through every level of nesting. That consumes parameter-passing resources that could have been used more efficiently if it is a stack-heavy calling convention; even with pass by reference you are still burning a pointer at every level, where a static global would have been far cheaper. A local global would still have cost you the same amount as the non-static local at that top level. You cannot simply state that globals or static locals cost you more; I would argue they are far less consumption. It depends on your programming habits and choice of variables: if you create a new variable with a new name for every little possible thing, sure, you could get into trouble.
But when you do microcontroller work or other embedded work where you are extremely constrained on resources, using only globals, for example, gives you a far better chance of success. Your memory usage is almost fixed; you still need storage for the return addresses of functions that are nested and do not get inlined. That is a bit extreme; with practice you can use locals that have a pretty good chance of being optimized into registers and not using the stack.
It is very programmer, processor, and compiler dependent whether heavy local use or heavy global use actually consumes less memory. Heavy local use has the potential of being only temporary use, but for constrained systems the analysis required to ensure you do not crash the stack into the program or heap takes a lot more work, because every line of code you add or remove can have dramatic effects on the stack usage when you are heavy on local variables. Any scheme to detect stack usage instantly costs you lots of resources, burning up more of that space without adding any new high-level application code.
Now, you are reading an assembly language book, not a compiler book. Compiler programmers' habits are a bit more, let's say, confined or controlled, or some other word. In order to debug the output and keep your sanity, you see compilers often mess with the stack up front and at the end: a stack frame, basically. You do not often see them adding and removing things all through the function, which would change the offsets for the same item, or else they burn yet another register as a frame pointer so that the stack can be adjusted mid-function. Either way, some local variable x or passed-in variable y remains at the same offset from that stack pointer or frame pointer throughout the function. Assembly language programmers may choose to do that too, but may also choose to just use the stack as a relatively short term solution.
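A minimal C sketch of the duplication described above. The function names are hypothetical, and taking the address of `value` is done here only to force each copy into memory so it can be counted; an optimizer might otherwise keep it in a register. Each active call holds its own copy of the same value, so the number of copies grows with the nesting depth.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEPTH 8

/* Addresses of the per-call copies of `value`, recorded for inspection. */
static const unsigned int *copy_addr[MAX_DEPTH];

/* Counts how many of the recorded copies live at distinct addresses. */
static size_t distinct_copies(void)
{
    size_t count = 0;
    for (size_t i = 0; i < MAX_DEPTH; i++) {
        int seen = 0;
        for (size_t j = 0; j < i; j++)
            if (copy_addr[i] == copy_addr[j])
                seen = 1;
        if (!seen)
            count++;
    }
    return count;
}

/* Hypothetical nested function: hands the same value down unchanged.
   At the deepest level every caller is still active, so all copies of
   `value` exist at the same time and can be counted. */
unsigned int nest(unsigned int value, size_t depth)
{
    assert(depth < MAX_DEPTH);
    copy_addr[depth] = &value;              /* this call's private copy */
    if (depth + 1 < MAX_DEPTH)
        return nest(value, depth + 1);
    return (unsigned int)distinct_copies();
}
```

Calling `nest(42u, 0)` reports one copy per nesting level. The value never changes on the way down, yet every level pays for it again.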
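The two habits can be sketched side by side in C. The struct and function names below are hypothetical, and which variant actually uses less memory still depends on the compiler, the calling convention, and whether the calls get inlined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical state that only the deepest function touches. */
struct state {
    uint32_t ticks;
};

/* Variant 1: the state lives in the caller and a pointer is handed down
   through every level, costing a register or a stack slot per level. */
static void level3_ptr(struct state *s)
{
    assert(s != NULL);
    s->ticks += 1;
}
static void level2_ptr(struct state *s) { level3_ptr(s); }
void level1_ptr(struct state *s)        { level2_ptr(s); }

/* Variant 2: the state is a file-scope static, so it is not on the
   stack; nothing is passed down and the nested calls only need their
   return addresses. */
static struct state g_state;

static void level3_global(void) { g_state.ticks += 1; }
static void level2_global(void) { level3_global(); }
void level1_global(void)        { level2_global(); }

/* Read-only accessor so other files do not touch g_state directly. */
uint32_t global_ticks(void) { return g_state.ticks; }
```

Both variants do the same work. The difference is that variant 1 carries a pointer through every level, while variant 2 has a fixed cost that is known at link time.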
So take this for example, code that is written to force the compiler to use the stack:
unsigned int more_fun ( unsigned int );
unsigned int fun ( unsigned int a )
{
return(more_fun(a)+a+5);
}
creating
00000000 <fun>:
0: e92d4010 push {r4, lr}
4: e1a04000 mov r4, r0
8: ebfffffe bl 0 <more_fun>
c: e2844005 add r4, r4, #5
10: e0840000 add r0, r4, r0
14: e8bd4010 pop {r4, lr}
18: e12fff1e bx lr
The stack frame approach is used, sort of: up front it pushes a register onto the stack, and on the back end it frees it up and restores it, using that register mid-function for local storage. The calling convention here dictates that r4 has to be preserved, so the next function down preserves it, as does all the nesting below, so that when we get back to this function r4 is how we left it. r0, which is what the parameter comes in on and what the result is returned in in this case, is volatile; each function can destroy it.
Although it violates the current convention for this instruction set, you could have instead:
push {lr}
push {r0}
bl more_fun
add r0,r0,#5
pop {r1}
add r0,r0,r1
pop {lr}
bx lr
Is one way cheaper than the other? Sure: the push and pop of two registers at once is cheaper than four individual ones. For this instruction set we cannot get around doing two adds, and we use the same number of registers. The compiler's approach in this case is "cheaper". But what if a function was written that did not have to use the stack for temporary storage (depending on the instruction set)?
unsigned int more_fun ( unsigned int );
unsigned int fun ( unsigned int a )
{
return(more_fun(a)+5);
}
producing
0: e92d4010 push {r4, lr}
4: ebfffffe bl 0 <more_fun>
8: e8bd4010 pop {r4, lr}
c: e2800005 add r0, r0, #5
10: e12fff1e bx lr
And then you tell me: but it did use the stack. Well, that is partly calling convention, and partly because if the bus is 64 bits wide, which it often is for an ARM now, or even if not, that additional register adds one clock to a transaction that takes many to hundreds of clocks, which is not a big cost. If the bus is 64 bits wide, a single-register push and pop actually costs you rather than saving you; likewise, staying aligned on a 64 bit boundary when you have a 64 bit wide bus also saves you a lot. The compiler in this case chose r4. r4 is not being preserved here; it is simply some register the compiler chose to keep the stack aligned, as you see in other Stack Overflow questions related to this. Sometimes the compiler uses r3 or other registers; in this case it chose r4.
But look beyond that stack alignment and convention (I could dig up an older compiler to show r4 not being there, just lr). This code did not require the input parameter to be preserved for math to be done after the nested function call; once it goes into more_fun() the variable a can be discarded.
As an assembly language programmer, you probably want to strive to use registers a lot. I guess it depends on the instruction set and your habits: on an x86 CISC, where you can use memory operands directly in a lot of the instructions, perhaps you develop a habit of that despite the performance cost. But if you strive to use registers as much as you can, you will eventually fall off the cliff, with all the registers used and a need for one more, so you do what the book is telling you to do
push {r0}
ldr r0,[r2]
ldr r1,[r0]
pop {r0}
or something like that: you ran out of registers and needed to do a double indirect. Or maybe you need an intermediate variable and you simply have no register left to spare, so you temporarily use the stack
push {r0}
add r0,r1,r2
str r0,[r3]
pop {r0}
With compiled languages, stack use versus some alternative starts with the processor design. Is the instruction set starved of general purpose registers? Does the instruction set use the stack by design for function call and return instructions and for interrupts and interrupt returns, or does it use a register and let you choose whether you need to save it on the stack? Does the instruction set basically force you into stack usage, or is it an option?
Next, programming habits, be they taught or developed on your own, can result in heavier or lighter stack use. Too many functions, too much nesting: the return addresses alone are going to take a few bytes on the stack for each call. Add heavy local variable use and that can chew up a little more or explode it, depending on function size, the number (and size) of variables, and the code in the functions. If you do not use the optimizer you will get massive stack explosion, but you do not get the fall-off-the-cliff effect, where adding one more line to a function takes it from little or no stack use to a lot of stack use because that line pushed the register usage over the cliff by one or more. Unoptimized, the stack consumption is heavy but more linear.
Using registers is the best way to reduce memory consumption, but it takes lots of practice at coding and looking at the compiler output, and hoping that the next compiler works the same-ish way; they often do, but sometimes they do not. Still, you can write your code to be more conservative of memory use and still get the task done. (Using smaller variables, like a char instead of an int, DOES NOT necessarily save you; for 16, 32 and 64 bit register sized instruction sets it sometimes costs you extra instructions to sign extend or mask off the rest of the register. It depends on the instruction set and your code.)
And then there are globals, which for some reason are frowned upon. Hard to read? That is silly. They have pros and cons. The pro is that your consumption is far more controlled. The con is that, yes, if you use a lot of variables and do not re-use them you will consume a lot more, and they are there for the life of the program; they do not free up like non-static locals. Static locals are just globals with limited scope. Only use them when you want a global but are afraid of being shunned for it, or have a very specific reason, of which there is a short list mostly related to recursion.
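To illustrate "static locals are just globals with limited scope", here is a small C sketch with hypothetical function names. The static version keeps its value between calls because it is stored with the globals; the non-static version starts fresh on every call.

```c
#include <assert.h>
#include <limits.h>

/* A static local: stored with the globals, not on the stack,
   but visible only inside this function. */
unsigned int next_id(void)
{
    static unsigned int last_id = 0;   /* initialized once, persists across calls */
    assert(last_id != UINT_MAX);       /* guard against wrapping back to 0 */
    last_id += 1;
    return last_id;
}

/* A non-static local: created on every call and gone on return. */
unsigned int next_id_local(void)
{
    unsigned int last_id = 0;
    last_id += 1;
    return last_id;
}
```

Successive calls to `next_id()` count upward, while `next_id_local()` gives the same answer every time. The storage for the static exists for the whole life of the program, exactly as it would for a global.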
How is the heap slow? RAM is RAM, generally. Whether your variable is on the stack or on the heap, it takes the same loads and stores to get at it, with the same cache hits and misses; you can try to manipulate those, but they still sometimes hit and sometimes miss. Some processors have special on-chip RAM for the stack, sure, but those are not the kinds of general purpose processors we see today, and those stacks are generally pretty small. In some embedded/bare metal designs you may put the stack in different RAM than .data or the heap, because you want it to have the fastest memory. But take a program on the machine you are reading this on: the program, the stack and the .data/heap are likely in the same slow DRAM space, with some caching trying to make it faster, but not always.
The "heap", which is a compiler/operating system use of memory anyway, has the problem of allocation and freeing, but once allocated the performance is the same as .text, .data and the stack on a lot of the target platforms we use. Using the stack, you are basically doing a malloc and free with less overhead than making a system call. But you could still use the heap in an efficient way, just like the compiler used the stack above, where one instruction pushes or pops two things, saving several to dozens to hundreds of clock cycles: you could malloc and free larger things less often. And folks do that when it does not make sense to use the stack (because of the size of the struct or array or array of structs).
Answer 2:
The stack is usually used for pushing arguments to a function call and for storing the local variables of a function, and it also keeps track of the return address (the instruction where execution resumes after returning from the current function). But how a function call is implemented depends on the compiler implementation and the calling convention.
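A sketch of "malloc and free larger things less often", assuming a hypothetical `struct sample`. The allocation cost is paid once for the whole batch instead of once per element; after that, reading and writing the elements is ordinary memory access.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical record type used only for this illustration. */
struct sample {
    int id;
    double value;
};

/* One allocation for the whole batch instead of one per item.
   Returns NULL on failure; the caller releases the batch with a
   single free(). */
struct sample *make_batch(size_t count)
{
    assert(count > 0);
    /* calloc zeroes the memory and checks count * size for overflow. */
    struct sample *batch = calloc(count, sizeof *batch);
    if (batch == NULL)
        return NULL;
    for (size_t i = 0; i < count; i++)
        batch[i].id = (int)i;
    return batch;
}
```

A caller would do something like `struct sample *b = make_batch(100);`, use `b[0]` through `b[99]`, and finish with one `free(b);`.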
C compilers use the stack for basically everything
That is not true. A C compiler does not put global and static variables on the stack.
Does that mean that the stack is the best way of storing variables, both short term and long term?
The stack should be used for variables that will not be used after the current function returns. Yes, you can use the stack for the long term too: local variables in main() last throughout the lifetime of the program. Also keep in mind that the stack for each program is limited.
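A small C sketch of that rule, with a hypothetical function name and label text. A buffer declared inside the function would disappear when the function returns, so the caller supplies the storage and decides how long it lives.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* WRONG (kept as a comment on purpose): buf would live in this
   function's stack frame and be gone once the function returns.

   char *make_label_bad(void) { char buf[16]; ... return buf; }
*/

/* Safer sketch: the caller owns the storage, so its lifetime is the
   caller's choice (its own stack frame, a static, or the heap).
   Returns 0 on success, -1 if the buffer is too small. */
int make_label(char *out, size_t out_size)
{
    static const char text[] = "sensor-0";   /* hypothetical label */
    assert(out != NULL);
    if (out_size < sizeof text)
        return -1;
    memcpy(out, text, sizeof text);          /* includes the terminating NUL */
    return 0;
}
```

If the caller is main() and the buffer is one of its locals, the label stays valid for the whole run of the program, which is the long term use of the stack mentioned above.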
Heap, but that's slow.
That is because it requires some management at runtime. If you want to allocate on the heap in assembly, you will have to manage the heap yourself. In high level languages like C and C++, the language runtime and the OS manage the heap. You will not have that in assembly.
Source: https://stackoverflow.com/questions/41842938/should-i-use-the-stack-for-long-term-variable-storage
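To give an idea of what "manage the heap yourself" can mean, here is about the simplest possible scheme, sketched in C rather than assembly so the bookkeeping is easy to read. It is a bump allocator over a fixed pool; the names and the pool size are made up for the illustration, and a real allocator also has to handle freeing individual blocks, which is where most of the runtime cost comes from.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SIZE  256u   /* hypothetical pool size in bytes */
#define POOL_ALIGN 8u     /* block sizes are rounded up to a multiple of this */

/* The union members other than bytes exist only to align the pool. */
static union {
    unsigned char bytes[POOL_SIZE];
    long long align_ll;
    double align_d;
    void *align_p;
} pool;

static size_t pool_used;  /* bytes handed out so far */

/* Hands out the next free bytes of the pool, or NULL when it is full. */
void *pool_alloc(size_t size)
{
    size_t rounded = (size + POOL_ALIGN - 1u) & ~(size_t)(POOL_ALIGN - 1u);
    assert(pool_used <= POOL_SIZE);
    if (rounded < size || rounded > POOL_SIZE - pool_used)
        return NULL;      /* size overflowed or not enough room left */
    void *block = &pool.bytes[pool_used];
    pool_used += rounded;
    return block;
}

/* Releases everything at once; single blocks cannot be freed. */
void pool_reset(void)
{
    pool_used = 0;
}
```

Allocation here is one addition and one comparison, which is close to what adjusting the stack pointer costs. The price is that nothing can be given back until `pool_reset()` throws the whole pool away.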