I want to build a search engine and am following a tutorial I found on the web. I want to test parsing HTML:
from bs4 import BeautifulSoup
def parse_html(filename):
    ""
In Python 3, open() in text mode decodes the file to Unicode for you, so you don't need to tell BeautifulSoup which codec to decode from.
If decoding of the data fails, that's because you didn't tell the open()
call which codec to use when reading the file; add the correct codec with an encoding
argument:
with open(filename, encoding='utf8') as infile:
    html = BeautifulSoup(infile, "html.parser")
Otherwise, the file is opened with your system's default codec, which is OS dependent.
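To see the effect of the encoding argument without involving BeautifulSoup, here is a minimal standard-library sketch. The helper name `read_text` and the sample markup are made up for illustration, and `locale.getpreferredencoding(False)` is used as an approximation of the codec open() falls back to when no encoding is given.

```python
import locale
import os
import tempfile


def read_text(path, encoding=None):
    """Read a file as text; encoding=None lets open() pick the platform default."""
    with open(path, encoding=encoding) as infile:
        return infile.read()


# Write a small UTF-8 file containing a non-ASCII character (hypothetical sample).
sample = "<p>caf\u00e9</p>"
with tempfile.NamedTemporaryFile(
    "w", encoding="utf8", suffix=".html", delete=False
) as tmp:
    tmp.write(sample)
    path = tmp.name

# Approximately the codec open() uses when no encoding argument is passed.
default_codec = locale.getpreferredencoding(False)

# Naming the codec explicitly gives the same text on every platform.
decoded = read_text(path, encoding="utf8")

# A codec that cannot decode the file's bytes fails while reading.
try:
    read_text(path, encoding="ascii")
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

os.remove(path)
```

If `default_codec` is not the codec the file was saved with, reading without an encoding argument can fail or silently produce wrong characters, which is why passing it explicitly is the safer habit.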