Python: how to convert utf-8 code string back to string?

遥遥无期 2020-12-15 01:52

I am using Python, and my code needs to convert a string that holds the UTF-8 code of a string back into the original string, for example:

UTF-8 code string t

3 answers
  •  没有蜡笔的小新
    2020-12-15 02:21

    If I understand the question, we have a plain Python 2 byte string (`str`) that contains literal `\uXXXX` Unicode escape sequences, something like this:

    a = '\u6b22\u8fce\u63d0\u4ea4\u5fae\u535a\u641c\u7d22\u4f7f\u7528\u53cd\u9988\uff0c\u8bf7\u76f4\u63a5'
    
    In [122]: a
    Out[122]: '\\u6b22\\u8fce\\u63d0\\u4ea4\\u5fae\\u535a\\u641c\\u7d22\\u4f7f\\u7528\\u53cd\\u9988\\uff0c\\u8bf7\\u76f4\\u63a5'
    

    One way to decode it is to parse each escape by hand and convert its hexadecimal code point to a character:

    \u6b22 => unichr(0x6b22) # 欢
    

    or, for the whole string (each escape is exactly six characters long):

    print "".join([unichr(int('0x'+a[i+2:i+6], 16)) for i in range(0, len(a), 6)])
    欢迎提交微博搜索使用反馈，请直接
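
In Python 3, `unichr` and the `print` statement no longer exist, so the same idea needs a slightly different form. The following is only a minimal sketch under a few assumptions: the input is a `str` holding literal `\uXXXX` sequences with exactly four hex digits each, and `decode_u_escapes` is a hypothetical helper name, not something from the original answer. Characters outside the Basic Multilingual Plane that are written as surrogate pairs are not combined by this sketch.

```python
import re

# Matches a literal backslash, 'u', and exactly four hex digits.
_ESCAPE = re.compile(r'\\u([0-9a-fA-F]{4})')


def decode_u_escapes(text):
    """Replace each literal \\uXXXX sequence in text with its character.

    Hypothetical helper for illustration; other text is left unchanged.
    """
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)


# A raw string keeps the backslashes literal, like the byte string above.
a = r'\u6b22\u8fce\u63d0\u4ea4'

decoded = decode_u_escapes(a)
print(decoded)  # 欢迎提交

# For ASCII-only input, the built-in codec gives the same result.
builtin = a.encode('ascii').decode('unicode_escape')
print(builtin == decoded)  # True
```

The regular expression version leaves any text that is not an escape untouched, which the fixed-stride slicing above does not. The `unicode_escape` codec is shorter, but it also interprets other backslash escapes such as `\n`, which may or may not be what you want.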
    
