Convert GBK to UTF-8 string in Python

面向向阳花 2021-01-07 04:16

I have a string.

s = u\"

        
5 Answers
  •  误落风尘
    2021-01-07 04:46

    You are mixing apples and oranges. The GBK-encoded string is a sequence of bytes, not a Unicode string, and hence should not be written as a u'...' literal.

    This is the correct way to do it in Python 2.

    g = '\xc7\xeb\xca\xe4\xc8\xeb\xd5\xfd\xc8\xb7\xd1\xe9\xd6\xa4\xc2\xeb,' \
        '\xd0\xbb\xd0\xbb!'.decode('gbk')
    s = u""
    

    Notice that the initializer for g, which is passed to .decode('gbk'), is written as a plain byte string rather than as a Unicode string.
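    For readers on Python 3, the same idea can be sketched roughly as follows. This is not part of the original answer: the variable names are illustrative, and the "mistyped" case is an assumed scenario in which the GBK byte values were written into a text string by mistake. In Python 3 the byte string needs an explicit b prefix, and getting UTF-8 is a separate .encode('utf-8') step after decoding.

```python
# Python 3 sketch; all variable names here are illustrative assumptions.

# The GBK payload as a bytes literal (note the b prefix, not u).
gbk_bytes = (b'\xc7\xeb\xca\xe4\xc8\xeb\xd5\xfd\xc8\xb7\xd1\xe9\xd6\xa4\xc2\xeb,'
             b'\xd0\xbb\xd0\xbb!')

# Step 1: decode the GBK bytes into a str (a sequence of Unicode code points).
text = gbk_bytes.decode('gbk')

# Step 2: encode that str as UTF-8 to get the UTF-8 byte representation.
utf8_bytes = text.encode('utf-8')

# Assumed scenario: the same byte values were mistakenly placed in a text
# string, so each byte became one code point in the range U+0000..U+00FF.
# Latin-1 maps those code points back to identical byte values, which can
# then be decoded as GBK.
mistyped = gbk_bytes.decode('latin-1')
recovered = mistyped.encode('latin-1').decode('gbk')

print(len(gbk_bytes), len(text), len(utf8_bytes))  # 22 12 32
print(recovered == text)  # True
```

    The Latin-1 round trip only helps when every character of the broken string is in the range U+0000 to U+00FF; if the text was damaged in some other way, this step raises UnicodeEncodeError instead of silently producing wrong output.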

    See also http://nedbatchelder.com/text/unipain.html
