Why do (only) some compilers use the same address for identical string literals?

旧时模样 提交于 2019-12-02 15:41:44

This is not undefined behavior, but unspecified behavior. For string literals,

The compiler is allowed, but not required, to combine storage for equal or overlapping string literals. That means that identical string literals may or may not compare equal when compared by pointer.

That means the result of A == B might be true or false, on which you shouldn't depend.

From the standard, [lex.string]/16:

Whether all string literals are distinct (that is, are stored in nonoverlapping objects) and whether successive evaluations of a string-literal yield the same or a different object is unspecified.

The other answers explained why you cannot expect the pointer addresses to be different. Yet you can easily rewrite this in a way that guarantees that A and B don't compare equal:

static const char A[] = "same";
static const char B[] = "same";// but different

void f() {
    if (A == B) {
        throw "Hello, string merging!";
    }
}

The difference being that A and B are now arrays of characters. This means that they aren't pointers and their addresses have to be distinct just like those of two integer variables would have to be. C++ confuses this because it makes pointers and arrays seem interchangeable (operator* and operator[] seem to behave the same), but they are really different. E.g. something like const char *A = "foo"; A++; is perfectly legal, but const char A[] = "bar"; A++; isn't.

One way to think about the difference is that char A[] = "..." says "give me a block of memory and fill it with the characters ... followed by \0", whereas char *A= "..." says "give me an address at which I can find the characters ... followed by \0".

Whether or not a compiler chooses to use the same string location for A and B is up to the implementation. Formally you can say that the behaviour of your code is unspecified.

Both choices implement the C++ standard correctly.

It is an optimization to save space, often called "string pooling". Here is the docs for MSVC:

https://msdn.microsoft.com/en-us/library/s0s0asdt.aspx

Therefore if you add /GF to the command line you should see the same behavior with MSVC.

By the way you probably shouldn't be comparing strings via pointers like that, any decent static analysis tool will flag that code as defective. You need to compare what they point to, not the actual pointer values.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!