Why do I need char[k + 1] instead of char[k] for a string with length k? [closed]

Question

So I have a simple set of code:

#include <stdio.h>

int main()
{
  char x[3] = "ABC"; // (*)
  puts(x);
  return 0;
}

It returns a strange output:

ABC¬ a

Using the top answer from this question, I found that when I change x[3] to x[4] everything runs fine.

But why? Why do I get a strange output on x[3], and why is x[4] fine?

Answer 1:

Since you've asked why this yields "ABC¬ a", here's an explanation: your char x[3] = "ABC" isn't suitable for puts. puts expects a string terminated by a zero byte, but your x has no room for one. It is effectively:

char x[3] = {'A', 'B', 'C'};

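As you know, there's no way to get the length of a (dynamically allocated) array from the pointer alone:

#include <stdlib.h>

char * allocate(void){
   // some size between 1 and 1024; plain rand() + 1 could overflow int
   return malloc(rand() % 1024 + 1);
}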

char * mem = allocate(); // how large is mem??

There's no way for you to know how long it is. However, to print a string, which is nothing but a contiguous sequence of characters in memory, a function needs to know where the string (aka the character sequence) ends.

That's why ASCII (the American Standard Code for Information Interchange) and many other character sets contain the null character. It's simply a char with value 0:

char wrong_abc[3]   = {'A', 'B', 'C'};     // when does it end?
char correct_abc[4] = {'A', 'B', 'C', 0 }; // oh, there's a zero!
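One way to see the difference between the two arrays is to search for a zero byte without ever reading past the end of the array. This is only a minimal sketch; the helper name has_terminator is hypothetical and not part of the standard library:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if buf[0..size-1] contains a '\0',
   0 otherwise. memchr inspects at most the size bytes it is given,
   so this is safe even for an unterminated array. */
int has_terminator(const char *buf, size_t size)
{
    return memchr(buf, '\0', size) != NULL;
}
```

With the arrays above, has_terminator(wrong_abc, sizeof wrong_abc) is 0 and has_terminator(correct_abc, sizeof correct_abc) is 1.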

Now functions like puts can simply check for 0:

// Simplified; the actual "puts" checks for errors and returns
// EOF on error or a non-negative int on success.
void puts(const char * str){
   int i = 0;

   while(str[i] != 0){
      putchar(str[i]);
      i++;
   }

   putchar('\n');
}
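If you really do have a character array without a terminator, you can still print it safely as long as you pass the length yourself. A rough sketch, assuming a hypothetical helper named put_bounded:

```c
#include <stdio.h>

/* Hypothetical helper: writes exactly n bytes of buf, then a newline.
   Unlike puts, it never searches for a '\0', so buf does not have to
   be terminated. Returns the number of bytes of buf actually written. */
size_t put_bounded(const char *buf, size_t n)
{
    size_t written = fwrite(buf, 1, n, stdout);
    putchar('\n');
    return written;
}
```

For the array from the question, put_bounded(x, sizeof x) writes the three characters and stops. printf("%.*s\n", (int)n, buf) has a similar effect, because the precision limits how many bytes %s may read.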

And that's why you

- need memory for all characters in the character sequence + 1,
- get undefined behaviour when you forget the 0.
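The "+ 1" rule looks like this for heap memory. This is an illustrative sketch only; copy_string is a hypothetical name:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of src, or NULL if the
   allocation fails. The caller must free() the result. */
char *copy_string(const char *src)
{
    size_t k = strlen(src);      /* k characters, terminator not counted */
    char *dst = malloc(k + 1);   /* k characters + 1 byte for the '\0' */

    if (dst == NULL)
        return NULL;

    memcpy(dst, src, k + 1);     /* the + 1 copies the terminator too */
    return dst;
}
```

Allocating only k bytes here would leave no room for the terminator, which is the same mistake as char x[3] = "ABC".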

The implementation of puts above would not find a 0 inside your array and would run past the memory you own (or read other data), which usually leads to a segfault or other errors (or worse, goes undetected for a long time and then causes critical errors). The actual behaviour in such a situation is undefined.

Note that string literals (e.g. "ABC") automatically have a '\0' at the end. Also, the compiler is smart enough to figure out the length of the literal for you, so you can simply use

char x[] = "ABC";

That way, you don't have to worry if you change the literal later.
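To check that the compiler really reserves the extra byte, compare sizeof with strlen. A small sketch; the names abc, abc_size and abc_length are made up for this example:

```c
#include <stddef.h>
#include <string.h>

static const char abc[] = "ABC";   /* the compiler picks the array size */

/* sizeof counts every byte of the array, including the terminator. */
size_t abc_size(void)
{
    return sizeof abc;
}

/* strlen counts characters up to, but not including, the terminator. */
size_t abc_length(void)
{
    return strlen(abc);
}
```

Here abc_size() returns 4 and abc_length() returns 3: the array is one byte longer than the string it holds.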

Answer 2:

There is no space for the terminating \0. I would have expected compilation to fail in such a case, but C allows it and simply drops the terminator (C++ rejects it).

Try

char x[4] = "ABC";

Or, as @Zeta suggested, just

char x[] = "ABC";

Answer 3:

A string is terminated by the null character - you have not allocated space for it.

char x[] = "ABC";

and let the compiler do the work for you!

Answer 4:

Your string should be terminated with \0.

Use

char x[] = "ABC";

instead.

Using the top answer from this question, I found that when I change x[3] to x[4] everything runs fine.

BUT WHY? What is going on that x[3] is giving such a strange output?

puts() keeps going until it encounters a terminating \0 byte. So if you don't supply one at the end of your string, it keeps reading past the end of your string until it finds a \0 or crashes.

See puts() description:

The function begins copying from the address specified (str) until it reaches the terminating null character ('\0'). This terminating null-character is not copied to the stream.

Answer 5:

Every time you declare a string variable, just add one more character to those you need, for the terminating character '\0', and you'll be fine.
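The same rule applies when you fill a buffer at run time: its capacity has to include the terminator. A hedged sketch using snprintf, with a hypothetical helper named copy_bounded:

```c
#include <stdio.h>

/* Hypothetical helper: copies src into dst, which has room for
   capacity bytes in total, truncating if necessary. When capacity > 0,
   snprintf writes at most capacity - 1 characters and then a '\0'.
   Returns 1 if src fit completely, 0 if it was truncated or on error. */
int copy_bounded(char *dst, size_t capacity, const char *src)
{
    int needed = snprintf(dst, capacity, "%s", src);
    return needed >= 0 && (size_t)needed < capacity;
}
```

With char buf[4], copying "ABC" fits (3 characters plus the terminator), while copying "ABCD" is truncated to "ABC" and the helper returns 0.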

Source: https://stackoverflow.com/questions/34519705/why-do-i-need-chark-1-instead-of-chark-for-a-string-with-length-k

Tags

output