How do I treat an ASCII string as Unicode and unescape the escaped characters in it in Python?

生来不讨喜 2020-11-30 03:13

For example, if I have a unicode string, I can encode it as an ASCII string like so:

>>> u'\u003cfoo/\u003e'.encode('ascii')


        
5 Answers
  •  情话喂你
    2020-11-30 03:47

    At some point you will run into errors when a string you want to decode contains non-ASCII characters such as Chinese characters or emoji, for example:

    UnicodeEncodeError: 'ascii' codec can't encode characters in position 109-123: ordinal not in range(128)
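
As a minimal sketch of where this error comes from (written for Python 3; the sample text and the helper name `try_ascii` are made up for illustration), encoding a string that contains non-ASCII characters with the `ascii` codec fails, while UTF-8 can represent the same text:

```python
# Hypothetical sample text: contains characters outside the ASCII range.
text = "caf\u00e9 \u4e2d\u6587"

def try_ascii(s):
    """Return the ASCII bytes of s, or the codec's failure reason as a string."""
    try:
        return s.encode("ascii")
    except UnicodeEncodeError as exc:
        # exc.reason holds the short explanation shown at the end of the traceback.
        return exc.reason

reason = try_ascii(text)

# UTF-8 can encode every Unicode character, so this does not raise.
utf8_bytes = text.encode("utf-8")
```

Pure ASCII input passes through `try_ascii` unchanged as bytes; only characters above code point 127 trigger the error.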
    

    In my case (processing Twitter data), I decoded as follows, which let me see all characters without errors:

    >>> # Python 2 only: byte strings have no .decode() in Python 3
    >>> s = '\u003cfoo\u003e'
    >>> s.decode('unicode-escape').encode('utf-8')
    '<foo>'
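
The snippet above relies on Python 2, where `str` is a byte string. A rough Python 3 equivalent, assuming the input is a pure-ASCII `str` holding literal `\uXXXX` escapes (the function name `unescape_ascii` is only illustrative), could look like this:

```python
def unescape_ascii(s: str) -> str:
    """Turn literal backslash escapes (e.g. \\u003c) in an ASCII str into characters."""
    # Encode to bytes first; the 'ascii' codec raises if s is not pure ASCII.
    raw_bytes = s.encode("ascii")
    # The 'unicode_escape' codec interprets \uXXXX, \xXX, \n, etc. in the bytes.
    return raw_bytes.decode("unicode_escape")

# Raw string literal, so the backslashes stay literal, as in data read from a file.
escaped = r"\u003cfoo/\u003e"
unescaped = unescape_ascii(escaped)
print(unescaped)
```

Encoding with `ascii` first is a deliberate choice in this sketch: `unicode_escape` treats raw bytes as Latin-1, so input that already contains non-ASCII characters would be silently mangled rather than rejected.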
    
