Explanation of C buffer overflow

Posted by 心已入冬 on 2021-02-09 07:00:17

Question


I am trying to understand buffer overflows. This is my code:

#include <stdio.h>

int main() 
{
    char buf[5] = { 0 };
    char x = 'u';

    printf("Please enter your name: ");
    gets(buf);

    printf("Hello %s!", buf);

    return 0;
}

The buf array has size five and is initialized with zeros. So (allowing for the null terminator) I have space for four characters. If I enter five characters ("stack", for example), I overwrite the null terminator, and printf should print "Hello stacku!" because of the variable x that follows. But this isn't the case. It simply prints "stack". Could someone please explain why?


Answer 1:


Local variables are generally created on the stack. In most implementations, stacks grow downward, not upward, as memory is allocated. So, it is likely that buf is at a higher address than x. That's why, when buf overflows, it does not overwrite x.

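You might be able to confirm this by writing buf[-1]='v';printf("%c\n",x); although that might be affected by padding, and the out-of-bounds write is itself undefined behavior. It may also be instructive to compare the addresses with printf("%td\n", buf - &x); (%td is the conversion for the ptrdiff_t that pointer subtraction yields). If the result is positive, then buf is at a higher address than x. Strictly speaking, subtracting pointers into two different objects is also undefined, so treat the result as a hint about your particular build and nothing more.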

But this is all highly implementation dependent, and can change based on various compiler options. As others have said, you shouldn't rely on any of this.




Answer 2:


The short explanation is, just because you declared 'x' on the source line after 'buf', that doesn't mean the compiler put them next to each other on the stack. With the code shown, 'x' isn't used at all, so it probably didn't get put anywhere. Even if you did use 'x' somehow (and it would have to be a way that prevents it being stuffed into a register), there's a good chance the compiler will sort it below 'buf' precisely so that it does not get overwritten by code overflowing 'buf'.

You can force this program to overwrite 'x' with a struct construct, e.g.

#include <stdio.h>

int main() 
{
    struct {
        char buf[5];
        char x[2];
    } S = { { 0 }, { 'u' } };

    printf("Please enter your name: ");
    gets(S.buf);

    printf("Hello %s!\n", S.buf);
    printf("S.x[0] = %02x\n", S.x[0]);

    return 0;
}

because the fields of a struct are always laid out in memory in the order they appear in the source code.[1] In principle there could be padding between S.buf and S.x, but char must have an alignment requirement of 1, so the ABI probably doesn't require that.

But even if you do that, it won't print 'Hello stacku!', because gets always writes a terminating NUL. Watch:

$ ./a.out 
Please enter your name: stac
Hello stac!
S.x[0] = 75

$ ./a.out 
Please enter your name: stack
Hello stack!
S.x[0] = 00

$ ./a.out 
Please enter your name: stacks
Hello stacks!
S.x[0] = 73

See how it always prints the thing you typed, but x[0] does get overwritten, first with a NUL, and then with an 's'?

(Have you already read Smashing the Stack for Fun and Profit? You should.)


[1] Footnote for pedants: if bit-fields are involved, the order of fields in memory becomes partially implementation-defined. But that's not important for purposes of this question.




Answer 3:


As the other answer pointed out, it's not at all guaranteed that x will sit immediately after buf in memory. But even if it did: gets is going to overwrite it. Remember: gets has no way of knowing how big the destination buffer is. (That's its fatal flaw.) It always writes the entire string it reads, plus the terminating \0. So if x happens to sit immediately after buf, then if you type a five-character string, printf is likely to print it correctly (as you saw), and if you were to inspect x's value afterwards:

printf("x = %d = %c\n", x, x);

it would likely show you that x was 0 now, not 'u'.

Here's how the memory might look initially:

     +---+---+---+---+---+
buf: |   |   |   |   |   |
     +---+---+---+---+---+

     +---+
  x: | u |
     +---+

So after you type "stack", it looks like this:

     +---+---+---+---+---+
buf: | s | t | a | c | k |
     +---+---+---+---+---+

     +---+
  x: |\0 |
     +---+

And if you were to type "elephant" it would look like this:

     +---+---+---+---+---+
buf: | e | l | e | p | h |
     +---+---+---+---+---+

     +---+
  x: | a | n   t  \0
     +---+

Needless to say, those three characters n, t, and \0 are likely to cause even more problems.

This is why people say not to use gets, ever. It cannot be used safely.



Source: https://stackoverflow.com/questions/52003862/explanation-of-c-buffer-overflow
