How to determine these elements of HTML?
Question

In this answer, @Andrej Kesely uses the following code to remove unnecessary elements (ads, large blank spaces, ...) from the HTML of this URL.

    import requests
    from bs4 import BeautifulSoup

    url = 'https://www.collinsdictionary.com/dictionary/french-english/aimer'
    headers = {'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:78.0) Gecko/20100101 Firefox/78.0'}
    soup = BeautifulSoup(requests.get(url, headers=headers).content, 'html.parser')

    for script in soup.select('script, .hcdcrt, #ad_contentslot_1,