How to unquote a URL-encoded Unicode string in Python?

时光说笑 2020-11-29 00:23

I have a Unicode string like "Tanım" which is somehow encoded as "Tan%u0131m". How can I convert this encoded string back to the original Unicode? Apparently urllib.unquote …

5 answers
  •  执念已碎
    2020-11-29 00:44

    There is a bug in the above version: it sometimes fails when the string contains both ordinary percent-encoded (%XX) characters and %uXXXX-encoded characters. I think it happens specifically when there are characters from the upper 128 range, like '\xab', in addition to the %u escapes.

    E.g. "%5B%AB%u03E1%BB%5D" causes this error.

    I found that if you decode the %u escapes first, the problem goes away:

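    from urllib import unquote  # Python 2

    def unquote_u(source):
        result = source
        # Decode the non-standard %uXXXX escapes first, so that unquote()
        # only has plain %XX escapes left to handle afterwards.
        if '%u' in result:
            result = result.replace('%u', '\\u').decode('unicode_escape')
        result = unquote(result)
        return result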
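The function above is Python 2 code (`str.decode` and `urllib.unquote` do not exist in Python 3). For anyone on Python 3, the same "decode %u first, then unquote" idea could be sketched roughly as below. This is only a sketch under some assumptions: the `byte_encoding` parameter is a name made up for this example, and defaulting it to Latin-1 is a guess that matches the '\xab' example above; if the plain %XX bytes in your data are actually UTF-8, pass "utf-8" instead.

```python
import re
from urllib.parse import unquote

# Matches the non-standard %uXXXX escape (exactly four hex digits).
_PERCENT_U = re.compile(r"%u([0-9a-fA-F]{4})")

def unquote_u(source, byte_encoding="latin-1"):
    """Decode %uXXXX escapes first, then ordinary %XX escapes.

    byte_encoding is a hypothetical parameter for this sketch: it says how
    the remaining %XX bytes should be interpreted (an assumption, not
    something the %u format itself defines).
    """
    # Step 1: replace each %uXXXX with the character it names.
    result = _PERCENT_U.sub(lambda m: chr(int(m.group(1), 16)), source)
    # Step 2: unquote() leaves non-ASCII characters untouched and only
    # decodes the remaining %XX escapes using byte_encoding.
    return unquote(result, encoding=byte_encoding)

print(unquote_u("Tan%u0131m"))
print(unquote_u("%5B%AB%u03E1%BB%5D"))
```

Note that this sketch does not combine surrogate pairs (e.g. two consecutive %uD83D%uDE00 escapes), so characters outside the Basic Multilingual Plane would need extra handling.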
    
