I have read Joel's article "The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)" but still don'
Essentially, if a byte begins with a 0 bit, it is a single-byte, 7-bit code point. If it begins with 10, it is a continuation byte of a multi-byte code point. Otherwise, the number of leading 1s tells you how many bytes this code point is encoded as.
The first byte indicates how many bytes encode the code point.
0xxxxxxx 7 bits of code point encoded in 1 byte
110xxxxx 10xxxxxx 11 bits of code point encoded in 2 bytes
1110xxxx 10xxxxxx 10xxxxxx 16 bits of code point encoded in 3 bytes
11110xxx 10xxxxxx 10xxxxxx 10xxxxxx 21 bits of code point encoded in 4 bytes
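
To make the lead-byte rule concrete, here is a minimal Python sketch. The function names `utf8_sequence_length` and `decode_first_code_point` are made up for illustration, and the sketch assumes the modern four-byte limit on UTF-8 sequences. It only checks the bit patterns described above; it does not reject overlong encodings or surrogate code points, which a real decoder should.

```python
def utf8_sequence_length(first_byte: int) -> int:
    """Return how many bytes the sequence starting with first_byte occupies."""
    if first_byte & 0b1000_0000 == 0b0000_0000:   # 0xxxxxxx
        return 1
    if first_byte & 0b1110_0000 == 0b1100_0000:   # 110xxxxx
        return 2
    if first_byte & 0b1111_0000 == 0b1110_0000:   # 1110xxxx
        return 3
    if first_byte & 0b1111_1000 == 0b1111_0000:   # 11110xxx
        return 4
    # 10xxxxxx is a continuation byte; 11111xxx is not a valid lead byte.
    raise ValueError("not a valid UTF-8 lead byte")


def decode_first_code_point(data: bytes) -> int:
    """Decode the first code point in data using only the bit patterns."""
    length = utf8_sequence_length(data[0])
    if len(data) < length:
        raise ValueError("truncated sequence")
    if length == 1:
        return data[0]
    # Keep only the payload bits of the lead byte (5, 4, or 3 of them).
    code_point = data[0] & (0xFF >> (length + 1))
    for byte in data[1:length]:
        if byte & 0b1100_0000 != 0b1000_0000:
            raise ValueError("expected a continuation byte (10xxxxxx)")
        # Each continuation byte contributes 6 more bits.
        code_point = (code_point << 6) | (byte & 0b0011_1111)
    return code_point


if __name__ == "__main__":
    # The euro sign is the three bytes E2 82 AC in UTF-8.
    print(hex(decode_first_code_point(b"\xe2\x82\xac")))
```

In practice you would simply call `bytes.decode("utf-8")`; the point of the sketch is only to show where the 7, 11, 16, and 21 payload bits come from.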