Google search with the Python requests library

被撕碎了的回忆 2020-11-29 07:47

(I've tried looking but all of the other answers seem to be using urllib2)

I've just started trying to use requests, but I'm still not very clear on how to send o

3 Answers
  • 2020-11-29 08:10

    input:

    import requests
    
    def googleSearch(query):
        # Send the request through the session so it is closed when the block exits
        with requests.Session() as session:
            url = 'https://www.google.co.in'
            params = {'q': query}
            response = session.get(url, params=params)
            print(response.url)
    
    googleSearch('Linkin Park')
    

    output:

    https://www.google.co.in/?q=Linkin+Park
    
  • 2020-11-29 08:20

    Request Overview

    The Google search request is a standard HTTP GET command. It includes a collection of parameters relevant to your queries. These parameters are included in the request URL as name=value pairs separated by ampersand (&) characters. Parameters include data like the search query and a unique CSE ID (cx) that identifies the CSE that is making the HTTP request. The WebSearch or Image Search service returns XML results in response to your HTTP requests.

    First, get your CSE ID (the cx parameter) from the Custom Search Engine control panel.

    Then see the official Google Developers site for Custom Search.

    There are many examples like this:

    http://www.google.com/search?
      start=0
      &num=10
      &q=red+sox
      &cr=countryCA
      &lr=lang_fr
      &client=google-csbe
      &output=xml_no_dtd
      &cx=00255077836266642015:u-scht7a-8i
    

    The list of parameters you can use is explained there as well.
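
    As a rough sketch, the example request above can be assembled from name=value pairs in Python rather than by hand. The helper name build_search_url and its default arguments are assumptions made for illustration; the parameter names and values are copied from the example. It uses only the standard library so the result can be checked without sending a request. Passing the same dictionary to requests.get(url, params=...) lets requests do the encoding for you.

```python
from urllib.parse import urlencode

BASE_URL = 'http://www.google.com/search'


def build_search_url(query, cx, start=0, num=10):
    """Build the search URL from name=value pairs joined by '&'."""
    params = {
        'start': start,
        'num': num,
        'q': query,
        'cr': 'countryCA',        # values below are taken from the example request
        'lr': 'lang_fr',
        'client': 'google-csbe',
        'output': 'xml_no_dtd',
        'cx': cx,
    }
    # safe=':' leaves the colon inside the cx value unescaped
    return BASE_URL + '?' + urlencode(params, safe=':')


url = build_search_url('red sox', '00255077836266642015:u-scht7a-8i')
print(url)
```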

  • 2020-11-29 08:33
    import requests 
    from bs4 import BeautifulSoup
    
    headers_Get = {
            'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:49.0) Gecko/20100101 Firefox/49.0',
            'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
            'Accept-Language': 'en-US,en;q=0.5',
            'Accept-Encoding': 'gzip, deflate',
            'DNT': '1',
            'Connection': 'keep-alive',
            'Upgrade-Insecure-Requests': '1'
        }
    
    
    def google(q):
        s = requests.Session()
        # Let requests URL-encode the query instead of joining the words with '+' by hand
        params = {'q': q, 'ie': 'utf-8', 'oe': 'utf-8'}
        r = s.get('https://www.google.com/search', params=params, headers=headers_Get)
    
        soup = BeautifulSoup(r.text, "html.parser")
        output = []
        # This selector may change in the future based on Google's page structure
        for searchWrapper in soup.find_all('h3', {'class': 'r'}):
            link = searchWrapper.find('a')
            if link is None or not link.has_attr('href'):
                continue  # skip headings that have no usable link
            result = {'text': link.text.strip(), 'url': link['href']}
            output.append(result)
    
        return output
    

    This returns a list of Google results in {'text': text, 'url': url} format. The top result's URL is google('search query')[0]['url'].
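
    Indexing with [0] raises IndexError when the list is empty, which can happen if the page structure changes and no headings match. A minimal sketch of a safer lookup follows; the helper name top_result_url and the sample data are hypothetical, and the data only imitates the shape that google() returns.

```python
def top_result_url(results):
    """Return the URL of the first result, or None if there are no results."""
    return results[0]['url'] if results else None


# Hypothetical data in the {'text': ..., 'url': ...} shape described above
sample = [
    {'text': 'First hit', 'url': 'https://example.com/one'},
    {'text': 'Second hit', 'url': 'https://example.com/two'},
]

print(top_result_url(sample))
print(top_result_url([]))
```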
