Python requests.get() returns improperly decoded text instead of UTF-8?

野性不改 2020-11-29 22:57

When the server's content type is 'Content-Type:text/html', requests.get() returns improperly decoded text.

However, if

4 answers
  •  时光取名叫无心
    2020-11-29 23:46

    From the Requests documentation:

    When you make a request, Requests makes educated guesses about the encoding of the response based on the HTTP headers. The text encoding guessed by Requests is used when you access r.text. You can find out what encoding Requests is using, and change it, using the r.encoding property.

    >>> r.encoding
    'utf-8'
    >>> r.encoding = 'ISO-8859-1'
    

    Check which encoding Requests used for your page and, if it is not the right one, force the one you need by assigning to r.encoding before reading r.text.
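
A minimal sketch of that check-and-override step. The helper name `text_with_charset_fallback` and its `fallback` parameter are made up for illustration; it relies only on the `headers`, `encoding` and `text` attributes that a Requests response exposes, and it assumes the page really is UTF-8 whenever the server names no charset.

```python
def text_with_charset_fallback(response, fallback="utf-8"):
    """Return response.text, forcing `fallback` when the header names no charset.

    `response` is expected to behave like a requests.Response (it needs
    .headers, .encoding and .text). Hypothetical helper, not part of Requests.
    """
    content_type = response.headers.get("Content-Type", "")
    if "charset=" not in content_type.lower():
        # No charset declared, so whatever encoding was guessed may be wrong.
        # Assumption: the page is really encoded as `fallback`.
        response.encoding = fallback
    return response.text
```

With a real response this would be called as `text_with_charset_fallback(requests.get(url))`, where `url` is whatever page you are fetching. If you do not know the real encoding, assigning `r.apparent_encoding` (detected from the body) may be a better choice than a hard-coded value.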

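    Regarding the differences between requests and urllib.urlopen: they handle encoding differently. Requests decodes r.text using the charset named in the Content-Type header and, when a text/* response names none, falls back to ISO-8859-1. urllib.urlopen does not decode the body at all, so UTF-8 bytes come through untouched. That's all.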
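
To see what a wrong guess looks like, here is a small standard-library sketch that needs no network access. The sample string and variable names are invented; the ISO-8859-1 decode stands in for the default that Requests, as far as I know, applies to `text/*` responses that declare no charset.

```python
# Bytes as a UTF-8 server would send them (the sample text is made up).
body = "naïve café".encode("utf-8")

# Decoding as ISO-8859-1 never raises, it just turns every non-ASCII
# character into two unrelated characters.
wrong = body.decode("iso-8859-1")

# Decoding with the codec the server actually used gives the original text.
right = body.decode("utf-8")

# Text that was garbled this way can be repaired by reversing the wrong decode.
repaired = wrong.encode("iso-8859-1").decode("utf-8")

print(len(right), len(wrong))  # the garbled string is longer
print(repaired == right)
```

The same round trip works on `r.text` that was decoded with the wrong codec, though setting `r.encoding` before reading `r.text` avoids the problem in the first place.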
