I want to fetch the HTML content of a URL and parse it with a regular expression. The HTML contains some multibyte characters, so I hit the error described in
Try
open(file, 'r', encoding='utf-8')
instead of
open(file, 'r')
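
As a rough illustration of the suggestion above, here is a minimal sketch that writes a small HTML file containing multibyte characters and reads it back with an explicit encoding before applying a regular expression. The helper names `find_titles` and `decode_html`, the sample markup, and the choice of UTF-8 are all assumptions made for the example; the real page may use a different encoding, in which case that name would go in place of 'utf-8'.

```python
import re
import tempfile
from pathlib import Path


def find_titles(path):
    """Hypothetical helper: read an HTML file as UTF-8 and return its <title> texts."""
    # Passing encoding explicitly avoids relying on the platform default,
    # which may not be able to decode multibyte characters.
    with open(path, 'r', encoding='utf-8') as f:
        html = f.read()
    return re.findall(r'<title>(.*?)</title>', html, flags=re.DOTALL)


def decode_html(raw, encoding='utf-8'):
    """Hypothetical helper: turn downloaded bytes into str before regex matching."""
    return raw.decode(encoding)


# Sample page with multibyte characters (made up for this example).
sample = '<html><head><title>日本語のページ</title></head></html>'

with tempfile.TemporaryDirectory() as tmp:
    page = Path(tmp) / 'page.html'
    page.write_text(sample, encoding='utf-8')
    titles = find_titles(page)

# The same idea applied to bytes, as a download would typically produce.
decoded = decode_html(sample.encode('utf-8'))
```

If the content is read straight from the network rather than from a file, it usually arrives as bytes, so decoding it first (as `decode_html` does here) plays the same role as the `encoding` argument to `open`.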