If char*s are read only, why can I overwrite them?

我们两清 submitted on 2019-12-22 01:56:06

Question


My course taught me that char*s are static/read only so I thought that would mean you can't edit them after you have defined them. But when I run:

char* fruit = "banana";
printf("fruit is %s\n", fruit);
fruit = "apple";
printf("fruit is %s\n", fruit);

Then it compiles fine and gives me:

fruit is banana
fruit is apple

Why? Have I misunderstood what it means to be read-only? Sorry if this is obvious but I'm new to coding and I can't find the answer online.


Answer 1:


The presented code snippet does not change the string literals themselves. It only changes the value stored in the pointer fruit.

You can imagine these lines

char* fruit = "banana";
fruit = "apple";

the following way

char unnamed_static_array_banana[] = { 'b', 'a', 'n', 'a', 'n', 'a', '\0' };
char *fruit = &unnamed_static_array_banana[0];
char unnamed_static_array_apple[]  = { 'a', 'p', 'p', 'l', 'e', '\0' };
fruit = &unnamed_static_array_apple[0];

These statements do not change the arrays that correspond to the string literals.
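A minimal sketch of this point, assuming a hypothetical helper named literal_survives_reassignment and a second pointer named first (neither is from the original answer): keeping a pointer to "banana" shows that the literal is untouched after fruit is reassigned.

```c
#include <string.h>

/* Hypothetical helper: returns 1 if the "banana" literal still reads
   "banana" after fruit has been pointed at "apple". */
int literal_survives_reassignment(void)
{
    const char *fruit = "banana";
    const char *first = fruit;   /* remember where "banana" lives */

    fruit = "apple";             /* only the pointer value changes */

    return strcmp(first, "banana") == 0 && strcmp(fruit, "apple") == 0;
}
```

Calling literal_survives_reassignment() from main should yield 1 on any conforming implementation, since nothing ever writes to either array.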

On the other hand if you tried to write

char* fruit = "banana";
printf("fruit is %s\n", fruit);
fruit[0] = 'h';
^^^^^^^^^^^^^^
printf("fruit is %s\n", fruit);

that is, if you tried to change a string literal through a pointer that points to it (to the first character of the string literal), then the program would have undefined behavior.

From the C Standard (6.4.5 String literals)

7 It is unspecified whether these arrays are distinct provided their elements have the appropriate values. If the program attempts to modify such an array, the behavior is undefined.




Answer 2:


In your program, the expression "banana" denotes a string literal object in the program image, a character array. The expression itself has array type (char[7]), but in most contexts, including this initialization, it is converted to a value of type char *, or "pointer to character". The pointer points to the first byte of that array, the character 'b'.

Your char *fruit variable also has type "pointer to character" and takes its initial value from this expression: it is initialized to a copy of the pointer to the data, not the data itself; it merely points to the b.

When you assign "apple" to fruit, you're just replacing its pointer value with another one, so it now points to a different literal array.

To modify the data itself, you need an expression such as:

char *fruit = "banana";
fruit[0] = 'z';  /* try to turn "banana" into "zanana" */

According to the ISO C standard, the behavior of this is not defined. It could be that the "banana" array is read-only, but that is not required.

C implementations can make string literals writable, or make it an option.

(If you are able to modify a string literal, that doesn't mean that all is well. Firstly, your program is still not well defined according to ISO C: it is not portable. Secondly, the C compiler is allowed to merge literals which have common content into the same storage. This means that two occurrences of "banana" in the program could in fact be exactly the same array. Furthermore, the string literal "nana" occurring somewhere in the program could be the suffix of the array "banana" occurring elsewhere; in other words, share the same storage. Modifying a literal can have surprising effects; the modification can appear in other literals.)

Also "static" and "read-only" aren't synonymous. Most static storage in C is in fact modifiable. We can create a modifiable static character array which holds a string like this:

/* at file scope, i.e. outside of any function */
char fruit[] = "banana";

Or:

{
  /* in a function */
  static char fruit[] = "banana";
  /* ... */
}

If we leave out the array size, it is automatically sized from the initializing string literal, and includes space for the null terminating byte. In the function, we need static to put the array into static storage, otherwise we get a local variable.

These arrays can be modified; fruit[0] = 'z' is well-defined behavior.
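As a rough illustration, assuming a hypothetical function named zap_next_letter (not part of the original answer), a static array inside a function is both writable and persistent across calls:

```c
#include <string.h>

/* Hypothetical helper: each call overwrites the next letter with 'z'
   and returns the array. Because fruit is static, the changes made by
   earlier calls are still there on later calls. */
const char *zap_next_letter(void)
{
    static char fruit[] = "banana";  /* writable array of 7 bytes */
    static size_t next = 0;          /* index of the next letter to change */

    if (next < strlen(fruit)) {
        fruit[next] = 'z';           /* well-defined: fruit is our own array */
        next++;
    }
    return fruit;
}
```

The first call returns "zanana" and the second returns "zznana". Without static, fruit would be re-initialized to "banana" on every call, and returning a pointer to it would be invalid once the function returned.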

Also, in these situations, "banana" doesn't denote a character array. The array is the variable fruit; the "banana" expression is just a piece of syntax which indicates the array's initial value:

char *fruit = "banana";  // "banana" is an object in program image
                         // initial value is a pointer to that object

char fruit_array[] = "apple"; // "apple" is syntax giving initial value
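One way to see the difference between the two declarations is sizeof. This is only a sketch, and the function names array_size and pointer_size are made up for illustration:

```c
#include <stddef.h>

/* An array initialized from "apple" holds the 5 letters plus '\0'. */
size_t array_size(void)
{
    char fruit_array[] = "apple";
    return sizeof fruit_array;       /* size of the whole array: 6 */
}

/* A pointer initialized from "banana" is just a pointer. */
size_t pointer_size(void)
{
    const char *fruit = "banana";
    return sizeof fruit;             /* sizeof(char *), platform-dependent */
}
```

array_size() reports the array's own storage, while pointer_size() reports the size of the pointer variable and says nothing about the length of "banana".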



Answer 3:


The fruit object is writable - it can be set to point to a different string literal.

The string literals "banana" and "apple" are not writable. You can modify fruit to point to a string literal, but if you do so then you should not attempt to modify the thing that fruit points to:

char *fruit = "banana"; // fruit points to first character of string literal
fruit = "apple";        // okay, fruit points to first character of different string literal
*fruit = 'A';           // not okay, attempting to modify contents of string literal
fruit[1] = 'P';         // not okay, attempting to modify contents of string literal

Attempting to modify the contents of a string literal results in undefined behavior - your code may work as expected, or you may get a runtime error, or something completely unexpected may happen. For safety's sake, if you're defining a variable to point to a string literal, you should declare it const:

const char *fruit = "banana";  // can also be written char const *

You can still assign fruit to point to different strings:

fruit = "apple";

but if you try to modify what fruit points to, the compiler will yell at you.

If you want to define a pointer that can only point to one specific string literal, then you can const-qualify the pointer as well:

const char * const fruit = "banana"; // can also be written char const * const

This way, if you try to either write to what fruit points to, or try to set fruit to point to a different object, the compiler will yell at you.
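If you do need to change the text, one common approach is to copy the literal into an array you own and modify the copy. The sketch below assumes a hypothetical helper named capitalized_copy; it is an illustration, not something from the original answer.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: copies src into dst (capacity dst_size bytes) and
   upper-cases the first letter of the copy. Returns 0 on success, or -1
   if dst is too small to hold src plus its terminating '\0'. */
int capitalized_copy(char *dst, size_t dst_size, const char *src)
{
    size_t len = strlen(src);

    if (len + 1 > dst_size)
        return -1;

    memcpy(dst, src, len + 1);      /* len + 1 so the '\0' is copied too */
    if (len > 0)
        dst[0] = (char)toupper((unsigned char)dst[0]);
    return 0;
}
```

With char buf[16]; and const char *fruit = "apple";, calling capitalized_copy(buf, sizeof buf, fruit) leaves "Apple" in buf while the literal that fruit points to is never written.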




Answer 4:


Basically, when you perform

char* fruit = "banana";

You set up a pointer fruit to the first letter of "banana". When printing it out, printf basically starts at the 'b' and keeps printing characters until it hits the '\0' null character at the end.

By then saying

fruit = "apple";

You've changed the pointer fruit to now point to the first letter of "apple".
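The "keep going until '\0'" idea can be sketched with a small loop. The function name count_until_nul is hypothetical, and a real printf implementation is more involved than this:

```c
#include <stddef.h>

/* Counts the characters from s up to, but not including, the
   terminating '\0'. This is the same walk that finds the end of
   the string when it is printed with %s. */
size_t count_until_nul(const char *s)
{
    size_t n = 0;

    while (s[n] != '\0')
        n++;
    return n;
}
```

count_until_nul("banana") gives 6 and count_until_nul("apple") gives 5; the pointer only says where to start, and the '\0' says where to stop.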




Answer 5:


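First of all, a char * is not read-only. A char * const is: the pointer itself cannot be reassigned. That in turn is different from a char const *, where the characters cannot be modified through the pointer. And string literals (e.g., "banana") should be read-only, but aren't necessarily.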

char * const  cpfruit = "banana";
cpfruit = "apple";        // error

char const * pcfruit = "banana";
pcfruit[0] = 'x';        // error

char * ncfruit = "banana";
ncfruit[0] = 'x';        // compiler will allow it, but the behavior is undefined and may cause a run-time error.



Answer 6:


You are pointing your variable fruit to a different string. You are only overwriting the address (location). The compiler will see your string literals "banana" and "apple" and store them separately in program memory. Let's say the string "banana" goes to the memory cell located at address 1 and "apple" gets stored at memory address 2. Now when you do:

fruit = "banana";

the compiler will just assign 1 to the variable fruit, which means it points to address 1, which contains the string banana. When you do:

fruit = "apple";

the compiler will assign 2 to the variable fruit, which means it points to address 2, where the string apple is stored.




Answer 7:


What your course has taught you is correct!

When you defined char* fruit = "banana" in the first place, you made fruit a pointer to the first character of a string literal, which has to be treated as constant. The 7 bytes (including the null terminator) of the string typically reside in a read-only section of the object file (for example .rodata on ELF platforms; the section name varies by platform).

When you reset the char pointer fruit to "apple", it simply pointed to another memory location in the read-only section, the one that contains "apple".

Essentially, when you say fruit is constant, you are referring to fruit being a pointer to constant memory; the pointer itself is not constant. If you had instead defined it as a const pointer:
char* const fruit = "banana";
the compiler would have stopped you from resetting it to "apple".




Answer 8:


When you write char *p="banana"; the string banana is stored in a memory location that is typically read-only. When you then write p="apple"; the string apple is stored in some other memory location, and the pointer now points to that new location.

You can confirm this by printing p just after each assignment.

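#include <stdio.h>

int main(void)
{
    char *p = "Banana";
    /* %p expects a void *; its output format is implementation-defined
       and often already includes a 0x prefix */
    printf("p contains address of string constant 'Banana' at %p\n", (void *)p);

    p = "Apple";
    printf("p contains address of string constant 'Apple' at %p\n", (void *)p);

    return 0;
}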


Source: https://stackoverflow.com/questions/44294649/if-chars-are-read-only-why-can-i-overwrite-them
