extract the number of results from google search

我的未来我决定 提交于 2021-01-29 13:56:48

问题


I am writing a web scraper to extract the number of results of searching in a google search which appears on the top left of the page of search results. I have written the code below but I do not understand why phrase_extract is None. I want to extract the phrase "About 12,010,000,000 results". which part I am making a mistake? may be parsing the HTML incorrectly?

import requests
from bs4 import BeautifulSoup

def pyGoogleSearch(word):   
    address='http://www.google.com/#q='
    newword=address+word
    #webbrowser.open(newword)
    page=requests.get(newword)
    soup = BeautifulSoup(page.content, 'html.parser')
    phrase_extract=soup.find(id="resultStats")
    print(phrase_extract)

pyGoogleSearch('world')

example


回答1:


You're actually using the wrong url to query google's search engine. You should be using http://www.google.com/search?q=<query>.

So it'd look like this:

def pyGoogleSearch(word):
    address = 'http://www.google.com/search?q='
    newword = address + word
    page = requests.get(newword)
    soup = BeautifulSoup(page.content, 'html.parser')
    phrase_extract = soup.find(id="resultStats")
    print(phrase_extract)

You also probably just want the text of that element, not the element itself, so you can do something like

phrase_text = phrase_extract.text

or to get the actual value as an integer:

val = int(phrase_extract.text.split(' ')[1].replace(',',''))


来源:https://stackoverflow.com/questions/53177265/extract-the-number-of-results-from-google-search

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!