C optimisation of string literals

Posted by 旧时模样 on 2019-12-17 02:37:51

Question


I've just been inspecting the following in gdb:

char *a[] = {"one","two","three","four"};
char *b[] = {"one","two","three","four"};
char *c[] = {"two","three","four","five"};
char *d[] = {"one","three","four","six"};

and I get the following:

(gdb) p a
$17 = {0x80961a4 "one", 0x80961a8 "two", 0x80961ac "three", 0x80961b2 "four"}
(gdb) p b
$18 = {0x80961a4 "one", 0x80961a8 "two", 0x80961ac "three", 0x80961b2 "four"}
(gdb) p c
$19 = {0x80961a8 "two", 0x80961ac "three", 0x80961b2 "four", 0x80961b7 "five"}
(gdb) p d
$20 = {0x80961a4 "one", 0x80961ac "three", 0x80961b2 "four", 0x80961bc "six"}

I'm really surprised that the string pointers are the same for equivalent words. I would have thought each string would have been allocated its own memory on the stack regardless of whether it was the same as a string in another array.

Is this an example of some sort of compiler optimisation or is it standard behaviour for string declaration of this kind?


Answer 1:


It's called "string pooling". In Microsoft compilers it is optional (controlled by the /GF option), whereas GCC does it by default. If you switch off string pooling in MSVC, the "same" strings in the different arrays are duplicated and get different memory addresses, which takes up an extra (unnecessary) 50 or so bytes of your static data.
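To check for pooling without gdb, the pointers can simply be compared. The following is a minimal sketch, not a definitive test: count_shared and report_pooling are hypothetical helper names, and the number reported depends on the compiler and its options.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Count how many entries of x hold exactly the same address as some entry of y. */
static size_t count_shared(char *const *x, size_t nx, char *const *y, size_t ny)
{
    assert(x != NULL && y != NULL);
    size_t shared = 0;
    for (size_t i = 0; i < nx; i++) {
        for (size_t j = 0; j < ny; j++) {
            if (x[i] == y[j]) {      /* address comparison, not strcmp */
                shared++;
                break;
            }
        }
    }
    return shared;
}

/* Hypothetical helper: reports how many of b's pointers also appear in a. */
static size_t report_pooling(void)
{
    char *a[] = {"one", "two", "three", "four"};
    char *b[] = {"one", "two", "three", "four"};
    size_t shared = count_shared(b, 4, a, 4);

    /* With pooling this is typically 4, without it typically 0. */
    printf("%zu of 4 pointers in b also appear in a\n", shared);
    return shared;
}
```

Calling report_pooling() from main and rebuilding with pooling switched on and off should show the difference, assuming the compiler offers such a switch.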

EDIT: GCC prior to version 4.0 had an option, -fwritable-strings, whose effect was twofold: it allowed string literals to be overwritten, and it disabled string pooling. So, in your code, setting this flag would allow the somewhat dangerous code:

/* Overwrite the first string in a, so that it reads 'xne'.  Does not */ 
/* affect the instances of the string "one" in b or d */
*a[0] = 'x';
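Without such a flag, writing through a pointer to a string literal is undefined behaviour. If a modifiable "one" is really needed, one option is to initialise a char array from the literal and point at that instead. A rough sketch, where patched_first_word is a hypothetical name and the out buffer is assumed to hold at least 4 bytes:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Writes the patched word into out and returns out. */
static char *patched_first_word(char *out, size_t out_size)
{
    char first[] = "one";                 /* writable array, initialised from the literal */
    char *a[] = {first, "two", "three", "four"};

    *a[0] = 'x';                          /* well-defined: modifies first, not a literal */

    assert(out != NULL && out_size >= sizeof first);
    memcpy(out, a[0], sizeof first);      /* copy the word and its '\0' before first goes away */
    return out;
}
```

Here only the private copy changes, so any other use of the literal "one" is unaffected whether or not the compiler pools strings.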



Answer 2:


(I assume that your a, b, c and d are declared as local variables, which is the reason for your stack-related expectations.)

String literals in C have static storage duration. They are never allocated "on the stack". They are always allocated in global/static memory and live "forever", i.e. as long as the program runs.

Your a, b, c and d arrays were allocated on the stack. The pointers stored in these arrays point to static memory. Under these circumstances, there's nothing unusual about pointers for identical words being identical.
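The split between the two kinds of storage can be illustrated with a small sketch (pick_word is a hypothetical name): the local array disappears when the function returns, but the pointer it held remains valid because the characters themselves live in static storage.

```c
#include <assert.h>

/* Returns a pointer to one of four string literals; i must be 0..3. */
static const char *pick_word(unsigned i)
{
    /* words is an automatic (stack) array of pointers and is gone after return.
       The literals it points to have static storage duration and are not. */
    const char *words[] = {"one", "two", "three", "four"};

    assert(i < 4);
    return words[i];   /* safe: returns the address of a literal, not of words */
}
```

Returning words itself (or the address of one of its elements) would be a bug, since that memory does belong to the stack frame.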

Whether a compiler will merge identical literals into one depends on the compiler. Some compilers even have an option that controls this behavior. String literals must be treated as read-only, since modifying one is undefined behavior (which is why it is a better idea to use const char * type for your arrays), so it doesn't make much difference whether they are merged or not, until you begin to rely on actual pointer values.
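In practice that means declaring the arrays with const char * and comparing words by content rather than by address. A minimal sketch, assuming a hypothetical lookup helper named index_of:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* const char * makes an accidental write through these pointers a compile-time error. */
static const char *const words[] = {"one", "two", "three", "four"};

/* Returns the index of word in words, or -1 if it is absent.
   Uses strcmp, so the result does not depend on literal merging. */
static int index_of(const char *word)
{
    assert(word != NULL);
    for (size_t i = 0; i < sizeof words / sizeof words[0]; i++) {
        if (strcmp(words[i], word) == 0)
            return (int)i;
    }
    return -1;
}
```

A lookup written as words[i] == word might appear to work when the argument is a literal and the compiler pools, but it would fail for a string read at run time, which is the kind of reliance on pointer values to avoid.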

P.S. Just out of curiosity: even if these string literals were allocated on the stack, why would you expect identical literals to be "instantiated" more than once?



Source: https://stackoverflow.com/questions/11399682/c-optimisation-of-string-literals
