Truncating unicode so it fits a maximum size when encoded for wire transfer

借酒劲吻你 2020-12-29 21:57

Given a Unicode string and these requirements:

The string be encoded into some byte-sequence format (e.g. UTF-8 or JSON unicode escape)
The encoded st

5 answers

一个人的身影 (original poster)

2020-12-29 22:13
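Check the byte just past the cut point. If its top two bits are 10, it is a UTF-8 continuation byte, which means the cut falls inside a multi-byte character, so back up until the first dropped byte starts a new character. (Testing only whether the last kept byte is a lead byte misses cuts that land between two continuation bytes.)
```
mxlen = 255
encoded = toolong.encode("utf8")  # Python 3: indexing bytes yields ints
# Back up while the first dropped byte is a continuation byte (10xxxxxx).
while 0 < mxlen < len(encoded) and (encoded[mxlen] & 0xC0) == 0x80:
    mxlen -= 1

truncated_string = encoded[:mxlen].decode("utf8")
```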
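For reuse, the same idea can be wrapped in a function. This is only a minimal sketch, assuming Python 3 and UTF-8 as the wire encoding; the name `truncate_utf8` and its `max_bytes` parameter are made up for illustration and do not come from the question. It does not cover the JSON unicode escape case the question also mentions.

```python
def truncate_utf8(text: str, max_bytes: int) -> str:
    """Return the longest prefix of text whose UTF-8 encoding fits in max_bytes.

    Hypothetical helper for illustration; assumes text contains no lone
    surrogates, so that encoding to UTF-8 succeeds.
    """
    encoded = text.encode("utf-8")
    if len(encoded) <= max_bytes:
        return text  # already fits, nothing to cut

    cut = max(max_bytes, 0)
    # Continuation bytes have the form 10xxxxxx. If the first dropped byte
    # is one, the cut splits a character, so move the cut to the left.
    while cut > 0 and (encoded[cut] & 0xC0) == 0x80:
        cut -= 1
    return encoded[:cut].decode("utf-8")


if __name__ == "__main__":
    # "€" is three bytes in UTF-8, so a 3-byte budget cannot hold "a€".
    print(repr(truncate_utf8("a€b", 3)))
    print(repr(truncate_utf8("a€b", 4)))
```

If losing the explicit loop is acceptable, `encoded[:max_bytes].decode("utf-8", errors="ignore")` should give the same result for valid input, since the only undecodable bytes after the slice are the incomplete trailing sequence.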