Why do strings in C need to be null terminated?

青春惊慌失措 2020-11-29 04:53

Just wondering why this is the case. I'm eager to know more about low-level languages, but I've only learned the basics of C and this is already confusing me.

Do langu

9 answers
  • 2020-11-29 05:25

    C strings are arrays of chars, and when an array is passed around in C it decays to a pointer to its first element, so only the start location travels with it. The length (or end) of the array must therefore be expressed some other way; for strings, a null terminator is used. Alternatives would be to carry the length of the string alongside the pointer, or to store the length in the first array location. It's just a matter of convention.

    Higher-level languages like Java or PHP store the size information with the array automatically and transparently, so the user needn't worry about it.

  • 2020-11-29 05:28

    From Joel's excellent article on the topic:

    Remember the way strings work in C: they consist of a bunch of bytes followed by a null character, which has the value 0. This has two obvious implications:

    1. There is no way to know where the string ends (that is, the string length) without moving through it, looking for the null character at the end.
    2. Your string can't have any zeros in it. So you can't store an arbitrary binary blob like a JPEG picture in a C string.

    Why do C strings work this way? It's because the PDP-7 microprocessor, on which UNIX and the C programming language were invented, had an ASCIZ string type. ASCIZ meant "ASCII with a Z (zero) at the end."

    Is this the only way to store strings? No, in fact, it's one of the worst ways to store strings. For non-trivial programs, APIs, operating systems, class libraries, you should avoid ASCIZ strings like the plague.

  • 2020-11-29 05:35

    In C, strings are represented by an array of characters allocated in a contiguous block of memory, so there must be either an indicator marking the end of the block (i.e., the null character) or a way of storing the length (like Pascal strings, which are prefixed by a length).

    In languages like PHP, Perl, and C#, strings may be backed by more complex data structures, so you cannot assume they end with a null character. As a contrived example, you could have a language that represents a string like so:

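    class string
    {
       int length;   // number of characters, stored next to the data
       char[] data;  // the characters themselves; no terminator needed
    }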
    

    but you only see it as a regular string with no length field, because the language's runtime environment tracks the length itself and uses it only internally, to allocate and access memory correctly.
