How does UTF-8 “variable-width encoding” work?
Question: The Unicode standard has enough code points in it that you need 4 bytes to store any one of them at a fixed width. That's what the UTF-32 encoding does. Yet the UTF-8 encoding somehow squeezes these into much smaller spaces by using something called "variable-width encoding". In fact, it manages to represent all 128 characters of US-ASCII (code points 0 through 127) in just one byte each, which looks exactly like real ASCII, so you can interpret lots of ASCII text as if it were UTF-8 without doing anything to it. Neat trick. So how does it work?
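To make the observation concrete, here is a minimal Python sketch that compares how many bytes a few characters take in UTF-8 versus UTF-32, and checks that ASCII text comes out byte-for-byte identical in both ASCII and UTF-8. The sample characters and variable names are just illustrative choices, not part of any standard.

```python
# Sample characters chosen for illustration: one from each UTF-8 length class.
samples = [
    "A",            # U+0041, plain ASCII
    "\u00e9",       # U+00E9, Latin small letter e with acute
    "\u20ac",       # U+20AC, euro sign
    "\U0001F600",   # U+1F600, an emoji outside the Basic Multilingual Plane
]

for ch in samples:
    utf8 = ch.encode("utf-8")
    utf32 = ch.encode("utf-32-be")  # big-endian variant, so no byte-order mark is added
    bits = " ".join(f"{b:08b}" for b in utf8)
    print(f"U+{ord(ch):04X}: UTF-8 uses {len(utf8)} byte(s) [{bits}], "
          f"UTF-32 uses {len(utf32)} bytes")

# Byte counts per character in each encoding.
utf8_lengths = [len(ch.encode("utf-8")) for ch in samples]
utf32_lengths = [len(ch.encode("utf-32-be")) for ch in samples]

# ASCII text encodes to the same bytes under ASCII and UTF-8.
ascii_text = "plain ASCII text"
ascii_matches_utf8 = ascii_text.encode("ascii") == ascii_text.encode("utf-8")
```

The printed bit patterns hint at the mechanism: the single-byte case starts with a `0` bit, while the bytes of longer sequences start with `1` bits that mark them as part of a multi-byte character.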