Why did C never implement “stack extension”?

Question

Why did C never implement "stack extension" to allow (dynamically-sized) stack variables of a callee function to be referenced from the caller?

This could work by extending the caller's stack frame to include the "dynamically-returned" variables from the callee's stack frame. (You could, but shouldn't, implement this with alloca from the caller - it may not survive optimisation.)

For example, if I wanted to return the dynamically-sized string "e", the implementation could be:

--+---+-----+
  | a |  b  |
--+---+-----+

callee(d);

--+---+-----+---------+---+
  | a |  b  |  junk   | d |    
--+---+-----+---------+---+

char e[calculated_size];

--+---+-----+---------+---+---------+
  | a |  b  |  junk   | d |    e    |    
--+---+-----+---------+---+---------+

dynamic_return e;

--+---+-----+-------------+---------+
  | a |  b  |    waste    |    e    |
--+---+-----+-------------+---------+

("Junk" contains the return address and other system-specific metadata which is invisible to the program.)

This would waste a little stack space, when used.

The upside is simpler string processing, and simpler code for any other function that currently has to malloc memory, return a pointer, and hope that the caller remembers to free it at the right time.
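For contrast, the existing model described above can be sketched as follows. The function name `make_greeting` is hypothetical and chosen only for illustration; the point is that the callee sizes and allocates the result on the heap, and the caller takes on the duty of calling `free`.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: builds "Hello, <name>!" in heap storage.
 * Returns NULL if allocation fails. On success the caller owns the
 * buffer and must release it with free(). */
char *make_greeting(const char *name)
{
    assert(name != NULL);

    /* Size is only known at run time; +1 for the terminating '\0'. */
    size_t size = strlen("Hello, ") + strlen(name) + strlen("!") + 1;

    char *out = malloc(size);
    if (out == NULL)
        return NULL;

    snprintf(out, size, "Hello, %s!", name);
    return out;
}
```

A caller would write something like `char *g = make_greeting("C");`, use `g`, and then call `free(g)`. Forgetting that last step is exactly the leak the question hopes to avoid.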

Obviously, there is no point in adding such a feature to C at this stage of its life; I'm just interested in why this wasn't a good idea.

Answer 1:

A new object may be returned through many layers of software. So the wasted space may be that from dozens or even hundreds of function calls.

Consider also a routine that performs some iterative task. In each iteration, it gets some newly allocated object from a subroutine, which it inserts into a linked list or other data structure. Such iterative tasks may repeat for hundreds, thousands, or millions of iterations. The stack will overflow with wasted space.
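The iterative case can be sketched with heap allocation, which is how such code is normally written in C today. All names here (`make_node`, `build_list`, and so on) are hypothetical. Each iteration receives one new object from a subroutine and links it into a list. With heap storage nothing is left behind per call, whereas under the proposed scheme every one of these calls would leave its "waste" region on the caller's stack.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct node {
    int value;
    struct node *next;
};

/* Hypothetical subroutine: returns one newly allocated object,
 * or NULL if allocation fails. */
struct node *make_node(int value)
{
    struct node *n = malloc(sizeof *n);
    if (n != NULL) {
        n->value = value;
        n->next = NULL;
    }
    return n;
}

/* Counts the nodes reachable from head. */
size_t list_length(const struct node *head)
{
    size_t len = 0;
    for (; head != NULL; head = head->next)
        len++;
    return len;
}

/* Releases every node in the list. */
void free_list(struct node *head)
{
    while (head != NULL) {
        struct node *next = head->next;
        free(head);
        head = next;
    }
}

/* Builds a list of `count` nodes, one allocation per iteration.
 * Returns NULL if count is 0 or any allocation fails. */
struct node *build_list(size_t count)
{
    struct node *head = NULL;
    for (size_t i = 0; i < count; i++) {
        struct node *n = make_node((int)i);
        if (n == NULL) {
            free_list(head);
            return NULL;
        }
        n->next = head;   /* insert at the front */
        head = n;
    }
    assert(list_length(head) == count);
    return head;
}
```

Calling `build_list` with a large count only grows the heap; the stack depth stays constant across iterations.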

Answer 2:

Here are some objections to your idea. Some have already been mentioned in the comments; others are off the top of my head.

C doesn't have stacks or stack frames. C simply defines scopes and lifetimes, and it is left to each implementation to decide how to realise them. Stacks and stack frames are really just the most popular way to implement some C semantics.
C doesn't have strings. C doesn't really have arrays as such. Well, it does have arrays, but as soon as you mention an array in an expression (e.g. a return expression), the array decays to a pointer to its first element. Returning a "string" or an array on the stack would have a significant impact on well-established areas of the language.
C does have structs. However, you can already return a struct. I can't tell you how it's done, because it is an implementation detail.
A problem with your specific implementation is that the caller has to know how big the "waste" is. Don't forget that the waste will include the stack frame of the callee but also the waste from any functions the callee calls either directly or indirectly. The returning convention will have to include information on the size of the waste and a pointer to the return value.
Stacks, as a rule, are quite limited compared to heap memory, particularly in applications that use threading. At some point the caller will need to move the returned array down into its own stack frame. If the array was merely a pointer to storage in the heap, this would be much more efficient, but then you've got the existing model.
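The point about structs can be illustrated with a short sketch. The type `small_string` and the function `format_id` are hypothetical names used only for this example. Because the array is a member of a struct, it is copied along with the struct on return instead of decaying to a pointer. The capacity has to be fixed at compile time, though, so this does not give the dynamically sized result the question asks for.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical fixed-capacity string. The array lives inside the
 * struct, so returning the struct returns a copy of the characters. */
struct small_string {
    char text[32];
};

/* Formats "id-<n>" and returns it by value. How the copy reaches the
 * caller (registers, hidden pointer, ...) is up to the implementation. */
struct small_string format_id(int id)
{
    struct small_string s = { "" };
    int n = snprintf(s.text, sizeof s.text, "id-%d", id);

    /* "id-" plus any int fits comfortably in 32 bytes. */
    assert(n >= 0 && (size_t)n < sizeof s.text);
    return s;
}
```

A caller simply writes `struct small_string s = format_id(42);` and has nothing to free afterwards.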

Answer 3:

You have to realize that the implementation of the stack is strongly dictated by the CPU and the OS kernel. The language does not have much say in this. Limitations are, for instance:

The ret instruction of the x86 architecture expects the return address at the memory location stored in the stack pointer. Thus, there cannot be anything else on top (semantic top - usually this is the lowest address, as stacks tend to grow down). You could work around this, of course, but that would likely incur additional overhead which C programmers are not going to be willing to pay.
The stack pointer defines what part of the allocated stack memory is actually used. When control flow is changed asynchronously (hardware interrupt), the current CPU's registers are generally immediately stored to memory addresses below the stack pointer by the interrupt handler. This can happen at any time, even throughout most of the kernel code. Any data stored below the place the stack pointer points to would be clobbered by this. (Technically, that's not fully correct: some ABIs define a "red zone" below the stack pointer to which interrupt and signal handlers may not write any data. But here we are getting very firmly into architectural design peculiarities.)
Destroying a stack frame is generally a single addition of a constant to the stack pointer. This is the fastest kind of instruction you can get; it will generally not cost even a single cycle, because it executes in parallel with some memory access. If the stack frame has a dynamic size, the stack frame must be destroyed by loading the stack pointer from memory, and for that a base pointer must have been retained. That's a memory access with significant latency, and another register that must be saved before it can be used. Again, this is overhead that's generally unnecessary.

Your proposal would definitely be implementable, but it would require some workarounds. And these workarounds would generally cost performance. Small bits of performance, but definitely measurable amounts. That's not what compiler/kernel developers want, and for good reason.
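The difference between a fixed-size and a dynamically sized frame can already be seen in standard C. The sketch below uses two hypothetical functions: one with a constant-size local array, and one with a C99 variable-length array (optional since C11, but supported by common compilers). The comments describe what compilers typically do; the exact code generated is implementation-specific and may differ under optimisation.

```c
#include <assert.h>
#include <stddef.h>

/* The frame size is a compile-time constant, so the epilogue can
 * typically release it by adding a constant to the stack pointer. */
int sum_fixed(void)
{
    int buf[16];
    int total = 0;
    for (size_t i = 0; i < 16; i++) {
        buf[i] = (int)i;
        total += buf[i];
    }
    return total;
}

/* The frame size depends on n, so the compiler typically has to keep
 * a frame pointer (or a saved copy of the stack pointer) to restore
 * the stack on return. */
int sum_dynamic(size_t n)
{
    assert(n > 0 && n <= 1024);   /* keep the VLA small and non-empty */

    int buf[n];                   /* variable-length array */
    int total = 0;
    for (size_t i = 0; i < n; i++) {
        buf[i] = (int)i;
        total += buf[i];
    }
    return total;
}
```

Comparing the assembly of the two functions (for example with a compiler's `-S` option) is one way to see the extra bookkeeping on a given platform.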

Source: https://stackoverflow.com/questions/48279268/why-did-c-never-implement-stack-extension

Tags

stack