Python unicode character conversion for Emoji

有些话、适合烂在心里 submitted on 2019-12-02 00:57:02

Question


I'm having trouble converting a Unicode code point written in U+XXXX notation into the actual character. Python does not print the emoji; it prints the escape sequence as plain text instead. Here's my example.

import re

# these codes come from a json file; this is a representation of one of the codes.
e = 'U+1F600' # smile grin emoji

# not sure how to clean this, so here's a basic attempt using regex.
b = re.compile(r'U\+', re.DOTALL).sub('\U000', e)

print unicode(b) # output should be '\U0001F600'

For whatever reason this doesn't print an emoji character.

However, if you type out the same string as a literal with the u prefix, everything works as expected.

print u'\U0001F600'

What am I doing wrong here? I thought the unicode function would convert my string to the working equivalent, but apparently it does not.

I'm using Python 2.7


Answer 1:


I think decode is what you are looking for:

>>> b = '\U0001F600'
>>> print b.decode('unicode-escape')
😀

or

>>> print unicode(b, 'unicode-escape')
😀

The issue with

print unicode(b)

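is that unicode() called without an encoding argument decodes the byte string with the default ASCII codec. The backslash stays a literal character, so the result is u'\\U0001F600': the same ten characters of text, not the emoji. Passing unicode-escape as the encoding makes Python interpret the \U0001F600 escape sequence instead.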
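The difference between the string built at runtime and the literal can be checked directly. The following is a minimal sketch with illustrative variable names, written so that it should run on both Python 2.7 and Python 3:

```python
# What the regex substitution produces: a backslash, 'U', and eight hex digits.
runtime = '\\U0001F600'

# What the parser produces when the escape appears in a unicode literal.
literal = u'\U0001F600'

print(len(runtime))  # 10

# Interpreting the escape sequence yields the same value as the literal.
decoded = runtime.encode('ascii').decode('unicode-escape')
print(decoded == literal)  # True
```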



Source: https://stackoverflow.com/questions/41604811/python-unicode-character-conversion-for-emoji
