I am confused about how the compiler handles a char variable with multiple characters. I understand that a char is 1 byte and can contain one character, like an ASCII character.
[lex.ccon]/1:

> An ordinary character literal that contains more than one c-char is a multicharacter literal. A multicharacter literal [...] is conditionally-supported, has type `int`, and has an implementation-defined value.
Why does the compiler always take the last character when multiple characters are used? What is the compiler's mechanism in this situation?
Most compilers just shift the character values together in order: That way the last character occupies the least significant byte, the penultimate character occupies the byte next to the least significant one, and so forth.
I.e. `'abc'` would be equivalent to `'c' + (((int)'b') << 8) + (((int)'a') << 16)`.
Converting this `int` back to a `char` will have an implementation-defined value - one that might just emerge from taking the value of the `int` modulo 256. That would simply give you the last character.
Why did I get a "too many characters" error when I put in 5 characters? 2 characters is already more than what a char can handle, so why 5?
Because on your machine an `int` is probably four bytes large. If the above is indeed the way your compiler arranges multicharacter constants, it cannot fit five `char` values into an `int`.