I have read Joel's article "The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)" but still don'
The hint is in this sentence here:
In UTF-8, every code point from 0-127 is stored in a single byte. Only code points 128 and above are stored using 2, 3, in fact, up to 6 bytes.
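Every code point up to 127 has the top bit set to zero. Therefore, the editor knows that if it encounters a byte where the top bit is a 1, that byte belongs to a multi-byte character. The leading bits say which part it is: a byte starting with 11 is the first byte of a sequence, and the number of leading 1 bits gives the sequence length, while a byte starting with 10 is a continuation byte inside a sequence.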
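To make the bit test concrete, here is a minimal Python sketch of how a decoder could find character boundaries from the lead byte alone. The function names `utf8_sequence_length` and `split_characters` are made up for illustration, and the sketch assumes well-formed input (it does not validate continuation bytes). It also assumes the modern definition of UTF-8, which stops at 4 bytes per code point; the 5- and 6-byte forms mentioned in the quote come from the original design and are no longer valid.

```python
def utf8_sequence_length(first_byte: int) -> int:
    """Return the length in bytes of the UTF-8 sequence that starts with first_byte."""
    if (first_byte & 0x80) == 0x00:   # 0xxxxxxx: code points 0-127, one byte
        return 1
    if (first_byte & 0xE0) == 0xC0:   # 110xxxxx: lead byte of a 2-byte sequence
        return 2
    if (first_byte & 0xF0) == 0xE0:   # 1110xxxx: lead byte of a 3-byte sequence
        return 3
    if (first_byte & 0xF8) == 0xF0:   # 11110xxx: lead byte of a 4-byte sequence
        return 4
    # 10xxxxxx is a continuation byte; anything else is not a valid lead byte.
    raise ValueError(f"0x{first_byte:02X} cannot start a UTF-8 sequence")


def split_characters(data: bytes) -> list:
    """Split UTF-8 encoded bytes into one bytes object per code point.

    Assumes data is well-formed UTF-8; continuation bytes are not checked.
    """
    chars = []
    i = 0
    while i < len(data):
        n = utf8_sequence_length(data[i])
        chars.append(data[i:i + n])
        i += n
    return chars


if __name__ == "__main__":
    # "a" takes 1 byte, "é" takes 2, "€" takes 3.
    for chunk in split_characters("aé€".encode("utf-8")):
        print(chunk.hex(" "), "->", chunk.decode("utf-8"))
```

In practice you would let `bytes.decode("utf-8")` do this work; the sketch only shows why the top bits are enough for an editor to tell where each character begins.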