Getting ‘wrong’ page source when calling a URL from Python

攒了一身酷 2020-12-18 00:06

When I try to retrieve the page source of a website from Python, I get a completely different (and shorter) text than when I view the same page source in a web browser.

htt

3 answers
  •  猫巷女王i
    2020-12-18 00:19

    Below is one way to get around this issue. The first time you run the script, you might have to type in the captcha in the window opened by the webdriver, but after that you should be good to go. You can then use BeautifulSoup to navigate the response.

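    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    def get_page_source(n):
        """Return the rendered HTML of the live page for match id n."""
        # Path to the local chromedriver binary; adjust it for your machine.
        # Passing it through Service is the Selenium 4 style.
        wd = webdriver.Chrome(service=Service("/Users/karlanka/Downloads/Chromedriver"))
        url = 'https://www.whoscored.com/Matches/' + str(n) + '/live'

        try:
            wd.get(url)  # loads the page in a real browser window
            return wd.page_source
        finally:
            wd.quit()  # always close the browser, even if loading fails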
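
As a rough illustration of the BeautifulSoup step mentioned above, the sketch below parses an HTML string and pulls out the page title and link targets. The helper name `extract_title_and_links` and the sample markup are made up for this example; in practice you would pass in the page source captured by `get_page_source`, and which tags you actually need depends on the page.

```python
from bs4 import BeautifulSoup


def extract_title_and_links(html_page):
    """Return (title, links) for an HTML string (hypothetical helper)."""
    # "html.parser" is the parser bundled with Python, so no extra install is needed.
    soup = BeautifulSoup(html_page, "html.parser")
    # soup.title is None when the document has no <title> tag.
    title = soup.title.get_text(strip=True) if soup.title else None
    # Keep only anchors that actually carry an href attribute.
    links = [a["href"] for a in soup.find_all("a", href=True)]
    return title, links


# Stand-in markup for illustration only; not real output from the site.
sample_html = """
<html><head><title>Match Centre</title></head>
<body><a href="/Teams/1">Home</a><a href="/Teams/2">Away</a><a>no href</a></body></html>
"""

title, links = extract_title_and_links(sample_html)
print(title, links)
```

From here, `soup.find_all` and `soup.select` can be used in the same way to reach whichever elements of the match page you are after.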
    
