问题
I was web-scraping weather-searched Google with bs4, and Python can't find a <span> tag when there is one. How can I solve this problem?
I tried to find this <span> with the class and the id, but both failed.
<div id="wob_dcp">
<span class="vk_gy vk_sh" id="wob_dc">Clear with periodic clouds</span>
</div>
Above is the HTML code I was trying to scrape in the page:
Sorry, I can't post images because of the reputation^^;
response = requests.get('https://www.google.com/search?hl=ja&ei=coGHXPWEIouUr7wPo9ixoAg&q=%EC%9D%BC%EB%B3%B8+%E6%A1%9C%E5%B7%9D%E5%B8%82%E7%9C%9F%E5%A3%81%E7%94%BA%E5%8F%A4%E5%9F%8E+%EB%82%B4%EC%9D%BC+%EB%82%A0%EC%94%A8&oq=%EC%9D%BC%EB%B3%B8+%E6%A1%9C%E5%B7%9D%E5%B8%82%E7%9C%9F%E5%A3%81%E7%94%BA%E5%8F%A4%E5%9F%8E+%EB%82%B4%EC%9D%BC+%EB%82%A0%EC%94%A8&gs_l=psy-ab.3...232674.234409..234575...0.0..0.251.929.0j6j1......0....1..gws-wiz.......35i39.yu0YE6lnCms')
soup = BeautifulSoup(response.content, 'html.parser')
tomorrow_weather = soup.find('span', {'id': 'wob_dc'}).text
But failed with this code, the error is:
Traceback (most recent call last):
File "C:\Users\sungn_000\Desktop\weather.py", line 23, in <module>
tomorrow_weather = soup.find('span', {'id': 'wob_dc'}).text
AttributeError: 'NoneType' object has no attribute 'text'
Please solve this error.
回答1:
This is because the weather section is rendered by the browser via JavaScript. So when you use requests you only get the HTML content of the page which doesn't have what you need.
You should use for example selenium (or requests-html) if you want to parse page with elements rendered by web browser.
from bs4 import BeautifulSoup
from requests_html import HTMLSession
session = HTMLSession()
response = session.get('https://www.google.com/search?hl=en&ei=coGHXPWEIouUr7wPo9ixoAg&q=%EC%9D%BC%EB%B3%B8%20%E6%A1%9C%E5%B7%9D%E5%B8%82%E7%9C%9F%E5%A3%81%E7%94%BA%E5%8F%A4%E5%9F%8E%20%EB%82%B4%EC%9D%BC%20%EB%82%A0%EC%94%A8&oq=%EC%9D%BC%EB%B3%B8%20%E6%A1%9C%E5%B7%9D%E5%B8%82%E7%9C%9F%E5%A3%81%E7%94%BA%E5%8F%A4%E5%9F%8E%20%EB%82%B4%EC%9D%BC%20%EB%82%A0%EC%94%A8&gs_l=psy-ab.3...232674.234409..234575...0.0..0.251.929.0j6j1......0....1..gws-wiz.......35i39.yu0YE6lnCms')
soup = BeautifulSoup(response.content, 'html.parser')
tomorrow_weather = soup.find('span', {'id': 'wob_dc'}).text
print(tomorrow_weather)
Output:
pawel@pawel-XPS-15-9570:~$ python test.py
Clear with periodic clouds
回答2:
>>> from bs4 import BeautifulSoup
>>> soup = BeautifulSoup(a)
>>> a
'<div id="wob_dcp">\n <span class="vk_gy vk_sh" id="wob_dc">Clear with periodic clouds</span> \n</div>'
>>> soup.find("span", id="wob_dc").text
'Clear with periodic clouds'
Try this out.
来源:https://stackoverflow.com/questions/55351871/beautifulsoup-attributeerror-nonetype-object-has-no-attribute-text