Wrong number of results in Google Scrape with Python

前端未结

关注

 2  1190

情歌与酒 2021-01-03 16:48

I was trying to learn web scraping and I am facing a freaky issue... My task is to search Google for news on a topic in a certain date range and count the number of results.

2条回答

长情又很酷 (楼主)

2021-01-03 17:15

There are a couple of things that is causing this issue. First, it wants day and month parts of date in 2 digits and it is also expecting a user-agent string of some popular browser. Following code should work:

import requests,  bs4

headers = {
    "User-Agent":
        "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.157 Safari/537.36"
}
payload = {'as_epq': 'James Clark', 'tbs':'cdr:1,cd_min:01/01/2015,cd_max:01/01/2015', 'tbm':'nws'}
r = requests.get("https://www.google.com/search", params=payload, headers=headers)

soup = bs4.BeautifulSoup(r.content, 'html5lib')
print soup.find(id='resultStats').text

0 讨论(0)

查看其它2个回答