For loop doesn't work for web scraping Google search in python

混江龙づ霸主 submitted on 2021-02-05 12:21:26

Question


I'm scraping Google search results for a list of keywords. The nested for loop that scrapes a single page works well. However, the outer for loop that iterates over the keywords in the list does not work as I intended, which is to scrape the data for each search result. The output does not contain the results for the first two keywords; it contains only the result for the last keyword.

Here is the code:

import pandas as pd
from selenium import webdriver

browser = webdriver.Chrome(r"C:\...\chromedriver.exe")

df = pd.DataFrame(columns = ['ceo', 'value'])

baseUrl = 'https://www.google.com/search?q='
ceo_list = ["Bill Gates", "Elon Musk", "Warren Buffet"]
values =[]


for ceo in ceo_list:
    browser.get(baseUrl + ceo)
    table = browser.find_elements_by_css_selector('div.ifM9O') 

    for row in table:
        ceo = str(([c.text for c in row.find_elements_by_css_selector('div.kno-ecr-pt.PZPZlf.gsmt.i8lZMc')])).strip('[]').strip("''")
        value = str(([c.text for c in row.find_elements_by_css_selector('div.Z1hOCe')])).strip('[]').strip("''")

    ceo = pd.Series(ceo) 
    value = pd.Series(value)

    df = df.assign(**{'ceo': ceo, 'value': value}) 


print(df)

browser.close()

This is the output:

              ceo                                              value
0  Warren Buffett  Born: August 30, 1930 (age 89 years), Omaha, N...

What I'm expecting is this:

              ceo                                              value
0  Bill Gates      Born:..........
1  Elon Musk       Born:...........
2  Warren Buffett  Born: August 30, 1930 (age 89 years), Omaha, N...

Not sure which part was missing.


Answer 1:


You need to create ceo as a list and append to it inside the for loop, so that each iteration adds to the results instead of overwriting them.
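
As a rough sketch of that idea (not a tested drop-in fix), the accumulation can be separated from the scraping. The names collect_rows, fetch_one and fetch_from_google are hypothetical helpers introduced here for illustration; the CSS selectors and the Selenium 3 style find_elements_by_css_selector calls are copied from the question and are assumed to still match the page (newer Selenium releases use find_elements(By.CSS_SELECTOR, ...) instead).

```python
def collect_rows(keywords, fetch_one):
    """Build one row per keyword.

    fetch_one is a hypothetical callable: keyword -> (ceo, value).
    """
    rows = []  # created once, before the loop
    for keyword in keywords:
        ceo, value = fetch_one(keyword)
        # Append a new row each time instead of reassigning the same variables.
        rows.append({"ceo": ceo, "value": value})
    return rows


def fetch_from_google(browser, keyword):
    """Scrape one result page using the selectors from the question.

    Assumes a Selenium 3 style browser object, as in the question.
    """
    browser.get("https://www.google.com/search?q=" + keyword)
    ceo, value = "", ""
    for row in browser.find_elements_by_css_selector("div.ifM9O"):
        ceo = ", ".join(
            c.text
            for c in row.find_elements_by_css_selector(
                "div.kno-ecr-pt.PZPZlf.gsmt.i8lZMc"
            )
        )
        value = ", ".join(
            c.text for c in row.find_elements_by_css_selector("div.Z1hOCe")
        )
    return ceo, value
```

With the browser and ceo_list from the question, pd.DataFrame(collect_rows(ceo_list, lambda k: fetch_from_google(browser, k)), columns=['ceo', 'value']) should then give one row per keyword, since the DataFrame is built once from the full list rather than reassigned on every iteration.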



Source: https://stackoverflow.com/questions/60643795/for-loop-doesnt-work-for-web-scraping-google-search-in-python
