Find out the Unicode script of a character

既然无缘 2020-12-09 16:45

Given a Unicode character, what would be the simplest way to return its script (as "Latin", "Hangul", etc.)? unicodedata doesn't seem to provide this kind of feature.

5 answers

借酒劲吻你 (original poster)

2020-12-09 17:32

The only way I know of is, unfortunately, to get the Unicode code point with ord() and then look it up in your own table (built from http://en.wikipedia.org/wiki/Unicode#Standardized_subsets and other sources). Converting to a normal form first may be in order, so as to handle the fact that a single "written" character can be expressed with different sequences of code points (the unicodedata module helps here).
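A minimal sketch of that approach, for illustration only: the table SCRIPT_RANGES and the function script_of are hypothetical names, and the table is deliberately tiny. Its ranges follow Unicode block boundaries, which only approximate scripts, so a real table would need to be much more complete.

```python
import unicodedata

# Hypothetical, deliberately incomplete table: (first, last, script name).
# Ranges follow Unicode block boundaries, which only approximate scripts.
SCRIPT_RANGES = [
    (0x0041, 0x005A, "Latin"),     # A-Z
    (0x0061, 0x007A, "Latin"),     # a-z
    (0x00C0, 0x024F, "Latin"),     # accented letters; also catches U+00D7 and U+00F7
    (0x0370, 0x03FF, "Greek"),     # Greek and Coptic block
    (0x0400, 0x04FF, "Cyrillic"),
    (0x0590, 0x05FF, "Hebrew"),
    (0x0600, 0x06FF, "Arabic"),
    (0x1100, 0x11FF, "Hangul"),    # Hangul Jamo
    (0x3040, 0x309F, "Hiragana"),
    (0x30A0, 0x30FF, "Katakana"),
    (0x4E00, 0x9FFF, "Han"),       # CJK Unified Ideographs
    (0xAC00, 0xD7A3, "Hangul"),    # precomposed Hangul syllables
]


def script_of(char, default="Unknown"):
    """Return the script name for one written character, per the table above."""
    # NFC composes sequences such as "e" + U+0301 into a single code point.
    composed = unicodedata.normalize("NFC", char)
    if not composed:
        return default
    # If several code points remain, use the first one (the base character).
    code_point = ord(composed[0])
    for first, last, name in SCRIPT_RANGES:
        if first <= code_point <= last:
            return name
    return default


print(script_of("a"))
print(script_of("한"))
print(script_of("e\u0301"))  # "e" followed by a combining acute accent
```

Anything the table does not cover (digits, punctuation, other scripts) falls through to the default value, so the result is only as good as the table you maintain.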
