Extract the number of results from a Google search

Question

I am writing a web scraper to extract the number of results for a Google search, which appears at the top left of the results page. I have written the code below, but I do not understand why phrase_extract is None. I want to extract the phrase "About 12,010,000,000 results". Where am I making a mistake? Am I perhaps parsing the HTML incorrectly?

import requests
from bs4 import BeautifulSoup

def pyGoogleSearch(word):   
    address='http://www.google.com/#q='
    newword=address+word
    #webbrowser.open(newword)
    page=requests.get(newword)
    soup = BeautifulSoup(page.content, 'html.parser')
    phrase_extract=soup.find(id="resultStats")
    print(phrase_extract)

pyGoogleSearch('world')


Answer 1:

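You're using the wrong URL to query Google's search engine. In http://www.google.com/#q=<query>, everything after the # is a URL fragment, which is handled by the browser and never sent to the server, so requests only fetches the Google home page, where no search results are present. You should be using http://www.google.com/search?q=<query>.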

So it'd look like this:

import requests
from bs4 import BeautifulSoup

def pyGoogleSearch(word):
    # Use the /search endpoint with a query string, not a '#q=' fragment.
    address = 'http://www.google.com/search?q='
    newword = address + word
    page = requests.get(newword)
    soup = BeautifulSoup(page.content, 'html.parser')
    phrase_extract = soup.find(id="resultStats")
    print(phrase_extract)
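
To see the difference between the two URLs without making any network request, you can split each one into its parts. This is only a small sketch using the standard library's urllib.parse; the helper name describe_url is made up for illustration and is not part of the original code.

```python
from urllib.parse import urlsplit, urlencode

def describe_url(url):
    """Return (query, fragment) for a URL; only the query is sent to the server."""
    parts = urlsplit(url)
    return parts.query, parts.fragment

# Original URL: the search term sits in the fragment, which stays in the browser.
print(describe_url('http://www.google.com/#q=world'))        # ('', 'q=world')
# Corrected URL: the search term is in the query string.
print(describe_url('http://www.google.com/search?q=world'))  # ('q=world', '')

# urlencode escapes spaces and special characters in multi-word queries.
print('http://www.google.com/search?' + urlencode({'q': 'hello world'}))
```

For multi-word searches, building the query with urlencode (or passing params={'q': word} to requests.get) avoids having to escape the term by hand.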

You probably also want just the text of that element, not the element itself, so you can do something like:

phrase_text = phrase_extract.text

or to get the actual value as an integer:

val = int(phrase_extract.text.split(' ')[1].replace(',',''))
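
The one-liner above assumes the text has the number as its second word (as in "About 12,010,000,000 results") and that the element was found; soup.find returns None when nothing matches. A slightly more defensive variant might look like the sketch below. The function name parse_result_count is hypothetical, and the sample strings are illustrative inputs rather than guaranteed Google output.

```python
import re

def parse_result_count(stats_text):
    """Return the result count as an int, or None if no number is found."""
    if not stats_text:
        return None
    # Match the first run of digits with optional thousands separators.
    match = re.search(r'\d[\d,]*', stats_text)
    if match is None:
        return None
    return int(match.group().replace(',', ''))

print(parse_result_count('About 12,010,000,000 results'))  # 12010000000
print(parse_result_count('1 result'))                      # 1
print(parse_result_count(None))                            # None
```

With the scraper above, this could be called as parse_result_count(phrase_extract.text if phrase_extract else None), so a missing element yields None instead of raising an AttributeError.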

Source: https://stackoverflow.com/questions/53177265/extract-the-number-of-results-from-google-search

Tags

python

web-scraping

beautifulsoup