Python unicode codepoint to unicode character

做~自己de王妃 submitted on 2019-12-28 13:23:16

Question


I'm trying to write some Chinese, Russian, or other non-English character sets out to a flat file for testing purposes. I'm stuck on how to output the character that corresponds to a Unicode hexadecimal or decimal value.

For example, in Python, if you had a hard-coded set of characters like абвгдежзийкл, you would assign value = u"абвгдежзийкл" and have no problem.

If, however, you had a single decimal or hexadecimal value such as 1081 / 0439 stored in a variable and wanted to print its corresponding character (and not just output 0x439), how would this be done? The Unicode decimal/hex value above refers to й.


Answer 1:


Python 2: Use unichr():

>>> print(unichr(1081))
й

Python 3: Use chr():

>>> print(chr(1081))
й
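
For the flat-file use case from the question, the characters also need to be written with an explicit encoding. The following is a minimal sketch for Python 3, assuming UTF-8 output; the helper name `write_codepoints` and the file name are hypothetical, chosen only for illustration.

```python
import os
import tempfile


def write_codepoints(path, codepoints):
    """Write the characters for the given integer code points to path as UTF-8."""
    text = "".join(chr(cp) for cp in codepoints)
    # An explicit encoding avoids depending on the platform default.
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)
    return text


# Hypothetical output file in the system temp directory.
out_path = os.path.join(tempfile.gettempdir(), "codepoints_demo.txt")
written = write_codepoints(out_path, [1081, 0x897F])
```

Here 1081 and 0x897F are the two code points used elsewhere in this thread, so the file should end up containing й followed by 西.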



Answer 2:


So, when the value is a hexadecimal string, the answer to the question is:

  1. convert the hexadecimal value to a decimal integer with int(hex_value, 16)
  2. then get the corresponding string with chr().

To sum up:

>>> print(chr(int('0x897F', 16)))
西
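
The two steps above can be wrapped in a small helper. This is only a sketch for Python 3: the name `codepoint_to_char` is hypothetical, and it assumes that any string argument is hexadecimal (so "1081" would be read as 0x1081, not as decimal 1081).

```python
def codepoint_to_char(value):
    """Return the character for a code point given as an int or a hex string."""
    if isinstance(value, int):
        return chr(value)
    text = value.strip()
    # Accept the common "U+0439" notation in addition to "0439" and "0x439".
    if text.upper().startswith("U+"):
        text = text[2:]
    # int() with base 16 accepts an optional "0x" prefix.
    return chr(int(text, 16))


from_decimal = codepoint_to_char(1081)
from_hex = codepoint_to_char("0439")
from_prefixed = codepoint_to_char("0x897F")
```

Passing decimal values as integers and hexadecimal values as strings keeps the two cases unambiguous.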



Answer 3:


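If you run into this error while trying to convert your hex value with unichr():

ValueError: unichr() arg not in range(0x10000) (narrow Python build)

you can work around it in Python 2 by building a \U escape sequence and decoding it (str.decode() does not exist in Python 3, so this applies to Python 2 only):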

>>> n = int('0001f600', 16)
>>> s = '\\U{:0>8X}'.format(n)
>>> s
'\\U0001F600'
>>> binary = s.decode('unicode-escape')
>>> print(binary)
😀
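
The error arises because narrow Python 2 builds store strings as UTF-16, where a code point above U+FFFF has to be represented as a pair of surrogates. As an illustration only, the sketch below computes that pair by hand; the function name `to_surrogate_pair` is hypothetical, and the code is written for Python 3.3 or later, where chr() accepts the full Unicode range directly and no workaround is needed.

```python
def to_surrogate_pair(code_point):
    """Split a code point above U+FFFF into UTF-16 high and low surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is not in a supplementary plane")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits of the offset
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits of the offset
    return high, low


n = int("0001f600", 16)
pair = to_surrogate_pair(n)

# On Python 3.3+, chr() handles supplementary-plane code points directly.
emoji = chr(n)
```

For U+1F600 the pair works out to 0xD83D and 0xDE00.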


Source: https://stackoverflow.com/questions/10715669/python-unicode-codepoint-to-unicode-character
