Forbidden operations on string literals in C

逝去的感伤 2020-12-11 10:00

In the K&R book page 104, I came across this statement:

char amessage[] = "now is the time"; //an array
char *pmessage = "now is the time";


        
5 answers
  •  庸人自扰
    2020-12-11 10:29

    There is nothing inherently wrong with using pointers as arrays, unless those pointers point to constant data, and string literals must be treated as constant data: the C standard makes modifying one undefined behaviour. In the old days of no memory protection, pmessage[0] = 'n'; would actually have executed, but with unpredictable results (e.g. affecting all occurrences of the same literal within the program). On modern operating systems this usually cannot happen because of memory protection. String literals and other constants are put in so-called read-only sections of the executable, and when the executable is loaded into memory to create a process, the memory pages that contain the read-only sections are mapped read-only, i.e. any attempt to change their content typically leads to a segmentation fault.

    char amessage[] = "now is the time";
    

    is really syntactic sugar for the following:

    char amessage[] = { 'n','o','w',' ','i','s',' ','t',
                        'h','e',' ','t','i','m','e','\0' };
    

    i.e. it creates an array of 16 characters and initialises its content with the string "now is the time" (15 characters plus the null terminator).
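    The size claim above can be checked with a small sketch. The helper names amessage_size, amessage_length and check_layout are made up for this illustration and are not part of K&R or the original answer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same initialiser as in the text; the array size is deduced from it. */
char amessage[] = "now is the time";

/* Bytes occupied by the array, terminator included (hypothetical helper). */
size_t amessage_size(void)
{
    return sizeof amessage;
}

/* Characters before the terminator (hypothetical helper). */
size_t amessage_length(void)
{
    return strlen(amessage);
}

/* The array is exactly one byte longer than the text it holds. */
void check_layout(void)
{
    assert(amessage_size() == amessage_length() + 1);
    assert(amessage[amessage_length()] == '\0');
}
```

    Note that sizeof only gives this answer when applied to the array itself; applied to a pointer such as pmessage it would give the size of the pointer.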

    On the other hand

    char *pmessage = "now is the time";
    

    puts the same string data somewhere in the read-only data and assigns its address to the pointer pmessage. It works similarly to this:

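    // This one is in the global scope so the array is not on the stack
    const char _some_unique_name[] = "now is the time";
    
    // The cast is needed only because this stand-in array is const;
    // a real string literal has type char[] and needs no cast
    char *pmessage = (char *)_some_unique_name;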
    

    _some_unique_name is chosen so as to not clash with any other identifier in your program. Usually symbols that are not permitted by the C language, but are ok for the assembler and the linker, are used (e.g. dots like in string.1634).

    You can change the value of a pointer - this will make it point to something else, e.g. to another string. But you cannot change the address behind the name of an array, i.e. amessage will always refer to the same array storage that was allocated for it in the first place.
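    A minimal sketch of that difference, assuming the two declarations from the question. The function names retarget_pmessage and refill_amessage are invented for the example.

```c
#include <string.h>

char amessage[] = "now is the time";   /* array: storage is fixed */
char *pmessage = "now is the time";    /* pointer: can be re-aimed */

/* Make the pointer refer to a different literal (hypothetical helper). */
const char *retarget_pmessage(void)
{
    pmessage = "another string";       /* fine: only the pointer value changes */
    /* amessage = "another string";       would not compile: arrays are not assignable */
    return pmessage;
}

/* The array keeps its storage; new text has to be copied into it. */
const char *refill_amessage(void)
{
    strcpy(amessage, "time is now");   /* 11 characters + '\0' fit in the 16 bytes */
    return amessage;
}
```

    The strcpy call is only safe here because the new text is shorter than the array; a longer string would overflow it.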

    You can refer to individual elements of each string using amessage[i] or pmessage[i], but you can only assign to the elements of amessage, as they are located in read-write memory.
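    One way to act on this, sketched under two assumptions of mine: pmessage is declared const char * (not what K&R writes) so that the compiler rejects accidental writes, and the helper names capitalise_array and capitalise_copy are made up for the example.

```c
#include <stddef.h>
#include <string.h>

char amessage[] = "now is the time";        /* writable array */
const char *pmessage = "now is the time";   /* const is an addition for safety */

/* Writing through the array name is fine (hypothetical helper). */
char *capitalise_array(void)
{
    amessage[0] = 'N';
    return amessage;
}

/* To edit the pointed-to text, copy it into writable storage first.
   Returns NULL if buf cannot hold one character plus the terminator. */
char *capitalise_copy(char *buf, size_t cap)
{
    if (cap < 2)
        return NULL;
    strncpy(buf, pmessage, cap - 1);
    buf[cap - 1] = '\0';    /* strncpy does not terminate on truncation */
    buf[0] = 'N';
    /* pmessage[0] = 'N';      compile error with const, undefined behaviour without it */
    return buf;
}
```

    The literal itself is never written to; only the array and the caller's buffer change.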
