web scraping google news with python

前端未结

关注

 3  1359

渐次进展 2020-12-08 22:59

I am creating a web scraper for different news outlets, for Nytimes and the Guardian it was easy since they have their own API.

Now, I want to scrape results from th

3条回答

独厮守ぢ (楼主)

2020-12-08 23:23

You can use awesome requests library:

import requests

URL = 'https://www.google.com/search?pz=1&cf=all&ned=us&hl=en&tbm=nws&gl=us&as_q={query}&as_occt=any&as_drrb=b&as_mindate={month}%2F%{from_day}%2F{year}&as_maxdate={month}%2F{to_day}%2F{year}&tbs=cdr%3A1%2Ccd_min%3A3%2F1%2F13%2Ccd_max%3A3%2F2%2F13&as_nsrc=Gulf%20Times&authuser=0'


def run(**params):
    response = requests.get(URL.format(**params))
    print response.content, response.status_code


run(query="Egypt", month=3, from_day=2, to_day=2, year=13)

And you'll get status_code=200.

And, btw, take a look at scrapy project. Nothing makes web-scraping more simple than this tool.

0 讨论(0)

查看其它3个回答