Handle UTF-8 string

慢半拍i 2021-01-13 06:30

As far as I know, Linux uses UTF-8 encoding. This means I can use std::string to handle strings, right? The encoding will just be UTF-8.

Now on UTF-8 we know s

5 answers
  •  醉酒成梦
    2021-01-13 07:10

    There are multiple concepts here:

    1. length of UTF-8 encoding in bytes
    2. number of Unicode code points used (= number of UTF-8 bytes outside the 0x80..0xbf range)
    3. number of grapheme clusters (what users perceive as "characters" in Western languages)
    4. screen space occupied when displaying

    Normally, you are only interested in 1 (for memory requirements) and 4 (for display); the other two seldom have a practical use.
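To make the difference between the first two counts concrete, here is a minimal sketch. It assumes the std::string already holds valid UTF-8 (nothing is validated), and the helper name utf8_code_points is made up for this illustration; it is not a standard library function. It applies the rule from item 2: every byte outside the 0x80..0xbf range starts a new code point.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts Unicode code points in a string that is
// ASSUMED to contain valid UTF-8. No validation is performed.
// Continuation bytes have the bit pattern 10xxxxxx (0x80..0xBF);
// every other byte begins a new code point.
std::size_t utf8_code_points(const std::string& s) {
    std::size_t count = 0;
    for (unsigned char c : s) {
        if ((c & 0xC0) != 0x80) {  // not a continuation byte
            ++count;
        }
    }
    return count;
}
```

For a string holding the two bytes 0xC3 0xA9 (the UTF-8 encoding of U+00E9), size() reports the byte length, 2, while utf8_code_points reports 1. Neither number says anything about grapheme clusters or screen width; those need a Unicode-aware library or the rendering context.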

    The amount of screen space can be queried from the rendering context. Note that this may change depending on context (for example, Arabic letters change shape at the beginning and end of words), so if you are doing text input, you may need to perform additional trickery to give users a consistent experience.
