Is there a unicode-ready substitute I can use for urllib.quote and urllib.unquote in Python 2.6.5?

清歌不尽 2020-12-05 13:18

Python's urllib.quote and urllib.unquote do not handle Unicode correctly in Python 2.6.5. This is what happens:

In [5]: print urll         


        
4 answers
  •  野趣味 (original poster)
    2020-12-05 13:27

    I ran into the same problem and wrote a helper that makes non-ASCII input safe for urllib.urlencode (which calls quote_plus internally):

    import urllib

    def utf8_urlencode(params):
        # urllib.urlencode() is not unicode-safe in Python 2, so every text
        # key and value is encoded as UTF-8 before the pairs are handed over.
        # To decode again: urllib.unquote_plus(s.encode('utf-8')).decode('utf-8')
        encoded = {}  # build a new dict instead of mutating the caller's
        for k, v in params.items():
            if isinstance(k, unicode):
                k = k.encode('utf-8')
            if isinstance(v, unicode):
                v = v.encode('utf-8')
            # Numbers and byte strings pass through; urlencode() calls str() on them.
            encoded[k] = v
        # The percent-encoded result is pure ASCII, so decoding it is always safe.
        return urllib.urlencode(encoded.items()).decode('utf-8')
    

    Adapted from "Unicode URL encode / decode with Python".
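
For the quote/unquote pair the question asks about, the same idea can be applied directly: convert to UTF-8 bytes before quoting, and decode from UTF-8 only after unquoting. The following is a minimal sketch under that assumption, not a drop-in replacement for urllib. The names quote_unicode and unquote_unicode are hypothetical, and the try/except import is there only so the sketch also runs on Python 3.

```python
try:
    # Python 2: quote/unquote operate on byte strings.
    from urllib import quote, unquote as unquote_to_bytes
except ImportError:
    # Python 3: unquote_to_bytes returns the raw bytes before decoding.
    from urllib.parse import quote, unquote_to_bytes


def quote_unicode(text, safe='/'):
    """Percent-encode text after converting it to UTF-8 bytes."""
    if not isinstance(text, bytes):
        text = text.encode('utf-8')
    return quote(text, safe)


def unquote_unicode(quoted):
    """Undo percent-encoding, then decode the resulting bytes as UTF-8."""
    if not isinstance(quoted, bytes):
        # Work on bytes so each %XX escape becomes a byte, not a code point.
        quoted = quoted.encode('utf-8')
    return unquote_to_bytes(quoted).decode('utf-8')


# Round trip with a non-ASCII character (escaped so no coding header is needed).
original = u'/El Ni\xf1o/'
encoded = quote_unicode(original)
print(encoded)
print(unquote_unicode(encoded) == original)
```

Doing the conversion at the boundary keeps the escapes byte-oriented, so a multi-byte UTF-8 sequence such as %C3%B1 decodes back to one character rather than two.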
