In a previous answer I gave, I responded to the following warning being caused by the fact that '\u0B95' requires three bytes and so is a multicharacter literal.
You are correct: according to the spec, '\u0B95' is a char-typed character literal with a value equal to the character's encoding in the execution character set. You're also right that the spec says nothing about the case where this is not possible because a single char cannot represent that value. The behavior is undefined.
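To make the size problem concrete, here is a minimal sketch that checks how many code units U+0B95 needs in each encoding. The names (ka_utf32, ka_utf16, code_units, ka_utf8_units) are made up for illustration, and the sketch deliberately avoids the plain '\u0B95' literal itself, since whether that even compiles is the point under discussion.

```cpp
#include <cstddef>

// U+0B95 written with prefixes whose code units can always hold it.
constexpr char32_t ka_utf32 = U'\u0B95';  // one 32-bit code unit
constexpr char16_t ka_utf16 = u'\u0B95';  // one 16-bit code unit (BMP character)

// Number of code units in a string literal, excluding the terminating null.
// Templated on the element type so it accepts char (C++11 to C++17) and
// char8_t (C++20) for u8 literals alike.
template <typename CharT, std::size_t N>
constexpr std::size_t code_units(const CharT (&)[N]) {
    return N - 1;
}

// In UTF-8, U+0B95 is encoded as the three bytes 0xE0 0xAE 0x95.
constexpr std::size_t ka_utf8_units = code_units(u8"\u0B95");

static_assert(ka_utf32 == 0x0B95, "a char32_t literal holds the code point itself");
static_assert(ka_utf16 == 0x0B95, "a BMP character fits in one char16_t");
static_assert(ka_utf8_units == 3, "UTF-8 needs three code units for U+0B95");
```

With a UTF-8 execution character set and an 8-bit char, a single char therefore cannot hold the encoding of this character, which is exactly the case the spec leaves unaddressed.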
There are defect reports filed with the committee on this issue, e.g. http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#912
The currently proposed resolution seems to be to specify that these character literals are also ints and have implementation-defined values, just like multichar literals (although the proposed language isn't quite right for that). I'm not a fan of that solution; I think a better one is to say such literals are ill-formed.
Treating such literals as ill-formed is what clang implements: http://coliru.stacked-crooked.com/a/952ce7775dcf7472
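For comparison, the type difference between ordinary and multichar literals that the proposed resolution leans on can be checked directly. This is only a sketch: the constant names are hypothetical, multichar literals are conditionally supported, and compilers such as GCC and clang typically emit a warning for 'ab'.

```cpp
#include <type_traits>

// An ordinary character literal with a single c-char has type char in C++.
constexpr bool ordinary_is_char = std::is_same<decltype('a'), char>::value;

// A multicharacter literal has type int; its value is implementation-defined.
constexpr bool multichar_is_int = std::is_same<decltype('ab'), int>::value;

static_assert(ordinary_is_char, "'a' has type char");
static_assert(multichar_is_int, "'ab' has type int");
```

Under the proposed resolution, a literal like '\u0B95' that does not fit in one char would fall into the second category instead of being rejected.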