I want to fetch the HTML content of a URL and parse it with regular expressions. The HTML contains multibyte characters, so I ran into the error described in
Combining the above answers, I found the following code works very well.
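    import requests

    # .content returns the raw response body as bytes
    r = requests.get("https://www.example.com/").content
    # Decode the bytes explicitly as UTF-8 to get a str
    str_content = r.decode('utf-8')
    # Write with the same encoding so multibyte characters survive
    with open("contents.txt", "w", encoding='utf-8') as fp:
        fp.write(str_content)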
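
For the regular-expression step mentioned at the top, here is a minimal sketch of how the decoded string might be used. The helper name `extract_title`, the `<title>` pattern, and the sample bytes are made up for illustration and are not part of the code above; it also assumes the page really is UTF-8. The point is only that the bytes are decoded to `str` before a `str` pattern is applied, since Python 3 raises a `TypeError` when a `str` pattern is matched against `bytes`.

```python
import re

# Hypothetical helper (not from the original answer): decode first, then match.
def extract_title(raw: bytes, encoding: str = "utf-8"):
    # bytes -> str, so that a str pattern can be applied
    text = raw.decode(encoding)
    match = re.search(r"<title>(.*?)</title>", text, re.IGNORECASE | re.DOTALL)
    return match.group(1).strip() if match else None

# Sample bytes standing in for requests.get(url).content
sample = "<html><head><title>日本語のページ</title></head></html>".encode("utf-8")
print(extract_title(sample))
```

With a real response, passing `requests.get(url).content` in place of `sample` should behave the same way, as long as the server actually sends UTF-8.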