Python unicode codepoint to unicode character

Question

I'm trying to write some Chinese, Russian, or other non-English character sets to a flat file for testing purposes. I'm stuck on how to output a Unicode hexadecimal or decimal value as its corresponding character.

For example, in Python, if you had a hard-coded set of characters like абвгдежзийкл, you would assign value = u"абвгдежзийкл" and have no problem.

If, however, you had a single decimal or hexadecimal value like 1081 / 0439 stored in a variable and you wanted to print it as its corresponding character (and not just output 0x439), how would this be done? The Unicode decimal/hex value above refers to й.

Answer 1:

Python 2: Use unichr():

>>> print(unichr(1081))
й

Python 3: Use chr():

>>> print(chr(1081))
й
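
Since the question is about writing these characters to a flat file, here is a minimal sketch of that step, assuming Python 3 and UTF-8 as the file encoding. The helper name write_codepoints and the file name sample.txt are made up for illustration and are not part of the original answer.

```python
import os
import tempfile


def write_codepoints(path, codepoints):
    """Write the characters for the given integer code points to a UTF-8 file."""
    # chr() maps each integer code point to a one-character string (Python 3)
    text = "".join(chr(cp) for cp in codepoints)
    # An explicit encoding avoids depending on the platform default
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(text)
    return text


# Illustrative usage with a file in a temporary directory
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
written = write_codepoints(path, [1081, 0x897F])
```

Passing encoding="utf-8" explicitly matters mostly on systems whose default encoding cannot represent Cyrillic or CJK characters.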

Answer 2:

So the answer to the question is:

convert the hexadecimal string to an integer with int(hex_value, 16)
then get the corresponding string with chr().

To sum up:

>>> print(chr(int('0x897F', 16)))
西
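
The two steps above can be wrapped in a small helper. This is only a sketch, assuming Python 3; the name char_from_hex and the handling of a "U+" prefix are my own additions for illustration. Note that int(..., 16) already accepts an optional "0x" prefix.

```python
def char_from_hex(hex_value):
    """Return the character for a hex code point such as '0439', '0x439' or 'U+0439'."""
    cleaned = hex_value.strip()
    # int() does not understand the "U+" notation, so drop it first
    if cleaned.upper().startswith("U+"):
        cleaned = cleaned[2:]
    # int(..., 16) parses the hex digits; chr() maps the integer to a character
    return chr(int(cleaned, 16))


cyrillic = char_from_hex("0439")
chinese = char_from_hex("0x897F")
```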

Answer 3:

If you run into the following error while trying to convert your hex value with unichr() in Python 2:

ValueError: unichr() arg not in range(0x10000) (narrow Python build)

you can get around it by building a \U escape sequence and decoding it, something like:

>>> n = int('0001f600', 16)
>>> s = '\\U{:0>8X}'.format(n)  # build the escape text \U0001F600
>>> s
'\\U0001F600'
>>> char = s.decode('unicode-escape')  # Python 2: str -> unicode
>>> print(char)
😀
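
In Python 3 this workaround should not be needed, since chr() accepts the full range up to 0x10FFFF. As an alternative sketch that avoids building an escape string, you can pack the code point into four bytes and decode them as UTF-32. The helper name codepoint_to_char is hypothetical, and I have only reasoned through this on Python 3; I would expect a narrow Python 2 build to return a surrogate pair here, but treat that as an assumption.

```python
import struct


def codepoint_to_char(n):
    """Convert an integer code point to a string via a UTF-32 round trip."""
    # "<I" packs n as a 4-byte little-endian unsigned integer,
    # which is exactly how UTF-32-LE stores a single code point
    return struct.pack("<I", n).decode("utf-32-le")


smiley = codepoint_to_char(int("0001f600", 16))
```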

Source: https://stackoverflow.com/questions/10715669/python-unicode-codepoint-to-unicode-character

Tags

python

encoding