Efficient way to calculate byte length of a character, depending on the encoding

予麋鹿 2021-02-06 00:48

What's the most efficient way to calculate the byte length of a character, taking the character encoding into account? The encoding is only known at runtime. In UTF-8

4 answers

轻奢々 (original poster)

2021-02-06 01:18

An encoding scheme may encode a given character as a variable number of bytes, depending on what comes before and after it in the character sequence. The byte length you get by encoding a single-character String is therefore not the whole answer.

(For example, you could theoretically receive Baudot / teletype characters encoded as 4 characters every 3 bytes, or you could theoretically treat UTF-16 plus a stream compressor as an encoding scheme. Yes, it is all a bit implausible, but ...)
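A less exotic illustration of the same point, as a minimal sketch assuming the question is about Java (the thread does not name the language, and the class and method names below are made up for the example). It compares the byte length of a whole sequence with the sum of the lengths obtained by encoding each char as its own one-char String. With Java's "UTF-16" charset every separate encode call emits its own byte order mark, and with UTF-8 a surrogate pair split into two lone chars is replaced by '?' bytes, so the two numbers disagree.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not taken from the original thread.
class ByteLengthDemo {

    // Bytes produced when the whole sequence is encoded in one pass.
    static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    // Bytes produced when every char is encoded as its own one-char String.
    static int sumOfPerCharLengths(String text, Charset charset) {
        int total = 0;
        for (int i = 0; i < text.length(); i++) {
            total += String.valueOf(text.charAt(i)).getBytes(charset).length;
        }
        return total;
    }

    public static void main(String[] args) {
        // The charset is a plain variable here, standing in for a runtime choice.
        Charset utf16 = StandardCharsets.UTF_16;
        // One BOM (2 bytes) plus 2 bytes per char.
        System.out.println(encodedLength("ab", utf16));        // prints 6
        // Each one-char String gets its own BOM: 4 + 4.
        System.out.println(sumOfPerCharLengths("ab", utf16));  // prints 8

        // U+1F600 is one code point but two Java chars (a surrogate pair).
        String emoji = "\uD83D\uDE00";
        Charset utf8 = StandardCharsets.UTF_8;
        // Encoded together, the pair becomes a single 4-byte UTF-8 sequence.
        System.out.println(encodedLength(emoji, utf8));        // prints 4
        // Encoded apart, each lone surrogate is replaced by '?' (1 byte each).
        System.out.println(sumOfPerCharLengths(emoji, utf8));  // prints 2
    }
}
```

If a per-character figure is still needed, iterating by code point rather than by char avoids the surrogate problem, but it does not help with encodings that carry state or a prefix across the sequence.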
