Efficient way to calculate byte length of a character, depending on the encoding

予麋鹿 2021-02-06 00:48

What's the most efficient way to calculate the byte length of a character, taking the character encoding into account? The encoding is known only at runtime. In UTF-8

4 answers
  •  轻奢々 (original poster)
    2021-02-06 01:18

    An encoding scheme may encode a given character as a variable number of bytes, depending on what comes before and after it in the character sequence. The byte length you get from encoding a single-character String is therefore not the whole answer.

    (For example, you could theoretically receive Baudot / teletype characters packed as 4 characters every 3 bytes, or you could theoretically treat UTF-16 plus a stream compressor as an encoding scheme. Yes, it is all a bit implausible, but ...)
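
A less exotic case of the same effect can be seen with Java's own "UTF-16" charset, which writes a byte-order mark at the start of the output. Below is a minimal sketch, not a definitive implementation: the class name `ByteLengthSketch` and the helpers `isolatedLength` and `lengthInContext` are made up for illustration, and it assumes the charset name arrives at runtime as a command-line argument.

```java
import java.nio.charset.Charset;

public class ByteLengthSketch {

    // Bytes produced when the code point is encoded alone, as a one-character String.
    // String.getBytes(Charset) substitutes the charset's replacement bytes for
    // unmappable characters, so this is the length of whatever actually got written.
    static int isolatedLength(int codePoint, Charset charset) {
        String single = new String(Character.toChars(codePoint));
        return single.getBytes(charset).length;
    }

    // Extra bytes the code point adds when it is appended to the preceding text.
    static int lengthInContext(String before, int codePoint, Charset charset) {
        String combined = before + new String(Character.toChars(codePoint));
        return combined.getBytes(charset).length - before.getBytes(charset).length;
    }

    public static void main(String[] args) {
        // Hypothetical input: the charset name is only known at runtime.
        String name = args.length > 0 ? args[0] : "UTF-16";
        Charset charset = Charset.forName(name);

        // For UTF-16 this prints 4: a 2-byte byte-order mark plus 2 bytes for 'a'.
        System.out.println(isolatedLength('a', charset));
        // For UTF-16 this prints 2: the byte-order mark is already part of the prefix.
        System.out.println(lengthInContext("x", 'a', charset));
    }
}
```

Measuring the difference against a non-empty prefix only removes per-stream overhead such as the byte-order mark. For an encoding whose output truly depends on the surrounding characters, the only reliable number is the length of the whole encoded sequence.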
