Question
I'm extracting content from this URL:
import requests
from bs4 import BeautifulSoup
url = 'https://www.collinsdictionary.com/dictionary/french-english/aimer'
headers = {'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:78.0) Gecko/20100101 Firefox/78.0'}
soup = BeautifulSoup(requests.get(url, headers = headers).content, 'html.parser')
for script in soup.select('script, .hcdcrt, #ad_contentslot_1, #ad_contentslot_2'):
    script.extract()
entry_name = soup.h2.text
content1 = ''.join(map(str, soup.select_one('.cB cB-def dictionary biling').contents))
Then I got this error:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-84-e9cb11cd6b5d> in <module>
10
11 entry_name = soup.h2.text
---> 12 content1 = ''.join(map(str, soup.select_one('.cB cB-def dictionary biling').contents))
AttributeError: 'NoneType' object has no attribute 'contents'
On the other hand, if I replace cB cB-def dictionary biling with hom, i.e. content1 = ''.join(map(str, soup.select_one('.hom').contents)), then the code runs fine. From the structure of the HTML, cB cB-def dictionary biling and hom look very similar to me.
Could you please explain why this problem arises and how to solve it?
Answer 1:
Try this:
import requests
from bs4 import BeautifulSoup
url = 'https://www.collinsdictionary.com/dictionary/french-english/aimer'
headers = {'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:78.0) Gecko/20100101 Firefox/78.0'}
soup = BeautifulSoup(requests.get(url, headers = headers).content, 'html.parser')
for script in soup.select('script, .hcdcrt, #ad_contentslot_1, #ad_contentslot_2'):
    script.extract()
entry_name = soup.h2.text
content1 = ''.join(map(str, soup.select_one('.cB.cB-def.dictionary.biling').contents))
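When the class attribute of a tag contains spaces, replace each space with a dot (.) in the selector and put a dot in front of the first name.
cB, cB-def, dictionary and biling are four separate classes on the same tag. If you leave the spaces in, each space is read as a descendant combinator, and only the first name has a leading dot. The selector then looks for a biling tag inside a dictionary tag inside a cB-def tag inside an element with class cB. Nothing matches, so select_one returns None, and accessing .contents on None raises the AttributeError.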
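To make the difference concrete, here is a minimal sketch. The helper name classes_to_selector and the HTML snippet are made up for illustration and are not taken from the Collins page; the only library calls used are BeautifulSoup(...) and select_one(...), as in the question.

```python
def classes_to_selector(class_attr: str) -> str:
    """Turn a class attribute value such as 'cB cB-def dictionary biling'
    into a compound CSS selector such as '.cB.cB-def.dictionary.biling'.

    Hypothetical helper for illustration; it is not part of BeautifulSoup.
    """
    return "".join("." + name for name in class_attr.split())


def demo():
    # Imported here so the helper above can be used without bs4 installed.
    from bs4 import BeautifulSoup

    # Made-up markup that only borrows the class names from the question.
    html = """
    <div class="cB cB-def dictionary biling">
      <div class="hom">aimer: to love</div>
    </div>
    """
    soup = BeautifulSoup(html, "html.parser")

    # Spaces are descendant combinators, so this does not match the <div>.
    wrong = soup.select_one(".cB cB-def dictionary biling")

    # No spaces: one element that carries all four classes.
    right = soup.select_one(classes_to_selector("cB cB-def dictionary biling"))

    # select_one returns None when nothing matches, so check before using it.
    for label, node in (("with spaces", wrong), ("with dots", right)):
        if node is None:
            print(label, "-> no match")
        else:
            print(label, "->", node["class"])


# Call demo() to run the comparison (requires bs4).
```

With bs4 installed, calling demo() should report no match for the selector with spaces and print the four class names for the selector with dots. Checking for None before touching .contents also keeps the script from crashing if the page layout changes.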
Source: https://stackoverflow.com/questions/63112871/why-does-error-nonetype-object-has-no-attribute-contents-occur-with-only-one