What happens when we do not include '\0' at the end of a string in C?

醉话见心 2020-12-10 23:09

In C, when I initialize my array this way:

char full_name[] = {
    't', 'o', 'a', 'n'
};

and print it with printf("%s", fu

5 answers
  • 2020-12-10 23:48

    Before passing a pointer to a function that expects a C string, you implicitly enter into a legally binding contract with that code block. In the primary section of this contract, both parties agree to refrain from exchanging dedicated string length information and assert that every parameter declared as a string points to a sequence of characters terminated by \0, which gives each party the option to calculate the length.

    If you don't include a terminating \0 you will commit a fundamental breach of contract.

    The OS court will randomly sue your executable with madness or even death.

  • 2020-12-10 23:53

    If you use a non-null-terminated char sequence as a string, C functions will just keep going. It's the '\0' that tells them to stop. So, whatever happens to be in memory after the sequence will be taken as part of the string. This may eventually cross a memory boundary and cause an error, or it may just print gibberish if it happens to find a '\0' somewhere and stop.
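
    The scan for '\0' described above can be sketched as follows. This is a minimal illustration, not code from the original answer: the helper name bounded_length is hypothetical, and it caps the scan at a caller-supplied size so that it stays in bounds even when the terminator is missing. strlen has no such cap, which is why it keeps going.

    ```c
    #include <stddef.h>
    #include <string.h>

    /* Hypothetical helper: returns the number of chars before the first '\0',
       or cap if no '\0' occurs within the first cap bytes of buf.
       Unlike strlen, it never reads past buf[cap - 1]. */
    size_t bounded_length(const char *buf, size_t cap)
    {
        const char *end = memchr(buf, '\0', cap);
        return end ? (size_t)(end - buf) : cap;
    }
    ```

    A result equal to cap means no terminator was found inside the array, so the array should not be handed to functions that expect a C string.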

  • 2020-12-11 00:03

    printf treats the argument that matches "%s" as a standard C string. This means it simply keeps reading characters until it finds a null terminator (\0).

    Often this means the wandering pointer ventures into memory outside the array, which a tool such as Valgrind can report as an error.

    You have to add your own null terminator explicitly when initialising a char array character by character, if you intend to use it as a string at some point.
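
    As a small sketch of that advice, the array from the question can be written with an explicit '\0' as its last element. The function names terminated_name and terminated_name_length are hypothetical wrappers added only for illustration.

    ```c
    #include <stddef.h>
    #include <string.h>

    /* The explicit '\0' makes this 5-element array a valid C string. */
    static const char full_name[] = { 't', 'o', 'a', 'n', '\0' };

    /* Hypothetical accessor: returns a pointer usable with "%s". */
    const char *terminated_name(void)
    {
        return full_name;
    }

    /* Safe: strlen stops at the '\0' stored inside the array. */
    size_t terminated_name_length(void)
    {
        return strlen(full_name);
    }
    ```

    With the terminator in place, printf("%s", terminated_name()) reads only the bytes that belong to the array.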

  • 2020-12-11 00:06

    Since the %s format specifier expects a null-terminated string, the behavior of your code is undefined. The program can produce any output at all, produce no output, crash, and so on. In short, don't do that.

    This is not to say that all arrays of characters must be null-terminated: the rule applies only to arrays of characters intended to be used as C strings, e.g. passed to printf with the %s format specifier, or passed to strlen or other string functions of the Standard C library.

    If you intend to use your char array for something else, it does not need to be null-terminated. For example, this use is fully defined:

    char full_name[] = {
        't', 'o', 'a', 'n'
    };
    for (size_t i = 0 ; i != sizeof(full_name) ; i++) {
        printf("%c", full_name[i]);
    }
    
  • 2020-12-11 00:09

    If you do not provide the '\0' at the end of a comma-separated, brace-enclosed initializer list, then technically full_name is not a string, because the char array is not null-terminated.

    To clarify: unlike a string literal initializer, a comma-separated list does not automatically add the terminating null character to the array or count it in the array size.

    So, in case of a definition like

    char full_name[] = {
        't', 'o', 'a', 'n'
    };
    

    the size of the array is 4 and it contains 't', 'o', 'a', 'n'.

    OTOH, in case of

    char full_name[] = "toan";
    

    full_name will be of size 5 and will contain 't', 'o', 'a', 'n' and '\0'.
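
    The size difference can be checked with sizeof. This is a minimal sketch; the names from_list, from_literal and the three small functions are hypothetical and exist only to expose the values.

    ```c
    #include <stddef.h>

    /* Brace list: exactly the four listed characters, no terminator. */
    static const char from_list[] = { 't', 'o', 'a', 'n' };

    /* String literal: a '\0' is appended, giving five elements. */
    static const char from_literal[] = "toan";

    /* Number of elements in each array (sizeof(char) is 1). */
    size_t list_size(void)    { return sizeof from_list; }
    size_t literal_size(void) { return sizeof from_literal; }

    /* Last element of the literal-initialized array: the terminator. */
    char literal_last(void)   { return from_literal[sizeof from_literal - 1]; }
    ```

    Here list_size() yields 4 and literal_size() yields 5, with the extra element being the '\0' returned by literal_last().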

    When you try to use the former array with any function that operates on strings (i.e., one that expects a null-terminated char array), you'll get undefined behavior, as most string functions will go out of bounds in search of the null terminator.

    In your particular example, for %s format specifier with printf(), quoting the C11 standard, chapter §7.21.6.1, fprintf() function description (emphasis mine)

    s
    If no l length modifier is present, the argument shall be a pointer to the initial element of an array of character type. Characters from the array are written up to (but not including) the terminating null character. If the precision is specified, no more than that many bytes are written. If the precision is not specified or is greater than the size of the array, the array shall contain a null character.

    That means printf() looks for a null terminator to find the end of the array. In your example, the missing null terminator causes printf() to read past the last element of the array (full_name[3]) and access out-of-bounds memory (full_name[4]), which is undefined behavior.
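
    The quoted text also implies a defined way to print an unterminated array: give a precision that does not exceed the array size. The sketch below assumes a hypothetical helper named format_counted and uses snprintf so the result can be inspected; the same "%.*s" format works with printf.

    ```c
    #include <stddef.h>
    #include <stdio.h>

    /* Hypothetical helper: formats at most len chars from buf into out.
       The precision passed through ".*" bounds how many bytes %s may read,
       so buf does not need a terminating '\0' as long as len does not
       exceed the size of the array. Returns the value returned by snprintf. */
    int format_counted(char *out, size_t out_size, const char *buf, size_t len)
    {
        return snprintf(out, out_size, "%.*s", (int)len, buf);
    }
    ```

    For the array from the question, passing sizeof(full_name) as len writes exactly the four characters and never looks at full_name[4].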
