Find out the unicode script of a character

既然无缘 2020-12-09 16:45

Given a Unicode character, what would be the simplest way to return its script (as "Latin", "Hangul", etc.)? unicodedata doesn't seem to provide this kind of feature.

5 answers
  •  心在旅途
    2020-12-09 17:23

    You can use ord to retrieve the numeric value (the code point) of a character; it works on both unicode and byte strings of length 1.

    The next step, unfortunately, is to test that value against the code point ranges assigned to each script yourself. The data here may help: http://cldr.unicode.org/index/downloads
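
A minimal sketch of that range test, assuming a hand-written and deliberately incomplete table: the names SCRIPT_RANGES and script_of are hypothetical, and the handful of ranges below covers only parts of a few scripts. A complete table would need to be generated from the Unicode data files rather than typed by hand.

```python
# Illustrative, incomplete table of (first, last, script) code point ranges.
# These are only a few well-known ranges; anything missing maps to "Unknown".
SCRIPT_RANGES = [
    (0x0041, 0x005A, "Latin"),     # A-Z
    (0x0061, 0x007A, "Latin"),     # a-z
    (0x00C0, 0x00D6, "Latin"),
    (0x00D8, 0x00F6, "Latin"),
    (0x00F8, 0x02B8, "Latin"),
    (0x0400, 0x0481, "Cyrillic"),
    (0x1100, 0x11FF, "Hangul"),    # Hangul Jamo
    (0x3041, 0x3096, "Hiragana"),
    (0x30A1, 0x30FA, "Katakana"),
    (0x4E00, 0x9FA5, "Han"),       # part of the CJK Unified Ideographs
    (0xAC00, 0xD7A3, "Hangul"),    # Hangul Syllables
]


def script_of(char):
    """Return the script name for a single character, or "Unknown"."""
    if len(char) != 1:
        raise ValueError("expected a single character")
    code_point = ord(char)
    # Linear scan is fine for a small table; a full table could use bisect.
    for first, last, name in SCRIPT_RANGES:
        if first <= code_point <= last:
            return name
    return "Unknown"
```

Calling script_of on a letter returns the name from the first matching range, and any character outside the table (punctuation, digits, scripts not listed) falls through to "Unknown".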
