Unescape Python Strings From HTTP

僤鯓⒐⒋嵵緔 submitted on 2019-11-26 08:28:26

Question


I've got a string from an HTTP header, but it's been escaped. What function can I use to unescape it?

myemail%40gmail.com -> myemail@gmail.com

Would urllib.unquote() be the way to go?


Answer 1:


I am pretty sure that urllib's unquote is the common way of doing this.

>>> import urllib
>>> urllib.unquote("myemail%40gmail.com")
'myemail@gmail.com'

There's also unquote_plus:

Like unquote(), but also replaces plus signs by spaces, as required for unquoting HTML form values.




Answer 2:


Yes, it appears that urllib.unquote() accomplishes that task. (I tested it against your example on codepad.)




Answer 3:


In Python 3, these functions are urllib.parse.unquote and urllib.parse.unquote_plus.

The latter is used, for example, for query strings in HTTP URLs, where the space character ( ) is traditionally encoded as a plus character (+), and a literal + is percent-encoded as %2B.
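As a rough illustration of the difference in Python 3, the sketch below decodes the address from the question and a made-up query value with both functions. The variable names and the sample query value are assumptions chosen for the example, not part of the original question.

```python
from urllib.parse import unquote, unquote_plus

# Sample values: the first comes from the question, the second is
# a hypothetical query-string value chosen only for illustration.
header_value = "myemail%40gmail.com"
query_value = "a+b%2Bc%20d"

# unquote() only decodes %XX escapes and leaves '+' untouched.
email = unquote(header_value)
plain = unquote(query_value)

# unquote_plus() first turns '+' into a space, then decodes %XX escapes,
# so an encoded %2B still comes out as a literal '+'.
form = unquote_plus(query_value)

print(email)
print(plain)
print(form)
```

Which function fits depends on where the value came from: form-encoded query strings usually call for unquote_plus, while other percent-encoded values usually call for unquote.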

In addition, there is unquote_to_bytes, which converts the given encoded string to bytes. It is useful when the encoding is not known or the encoded data is binary. However, there is no unquote_plus_to_bytes. If you need one, you can write it yourself:

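from urllib.parse import unquote_to_bytes

def unquote_plus_to_bytes(s):
    # Turn '+' into a space first, then decode the %XX escapes to bytes.
    if isinstance(s, bytes):
        s = s.replace(b'+', b' ')
    else:
        s = s.replace('+', ' ')
    return unquote_to_bytes(s)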
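When the encoding is not known, one possible approach is to decode to bytes first and then try a few candidate encodings. The following is a minimal sketch under that assumption. The helper name decode_with_fallback and the choice of UTF-8 followed by Latin-1 are hypothetical, not something urllib provides.

```python
from urllib.parse import unquote_to_bytes

def decode_with_fallback(value, encodings=("utf-8", "latin-1")):
    """Percent-decode value to bytes, then try each encoding in order.

    Hypothetical helper: the candidate encodings are an assumption and
    should be adjusted to whatever the sending side is likely to use.
    """
    raw = unquote_to_bytes(value)
    for enc in encodings:
        try:
            return raw.decode(enc)
        except UnicodeDecodeError:
            continue
    # Only reached if every candidate failed; keep what can be decoded.
    return raw.decode("utf-8", errors="replace")

utf8_text = decode_with_fallback("caf%C3%A9")   # valid UTF-8 bytes
latin1_text = decode_with_fallback("caf%E9")    # invalid UTF-8, falls back
print(utf8_text, latin1_text)
```

Because Latin-1 can decode any byte sequence, it works as a last resort here but may produce the wrong characters if the data was actually in another encoding.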

More information on whether to use unquote or unquote_plus is available in the Stack Overflow question "URL encoding the space character: + or %20".



Source: https://stackoverflow.com/questions/780334/unescape-python-strings-from-http
