Python 3: using requests does not get the full content of a web page

甜味超标 2020-12-14 12:27

I am testing the requests module to fetch the content of a web page. But when I inspect the result, I see that it does not contain the full content of the page.

2 answers
  • 2020-12-14 12:46

    A request is not the same as getting the page source or the visual elements of a web page. Viewing the source also does not give you access to everything behind the page, such as database requests and other back-end work. Either your question is not clear enough, or you have misunderstood how web browsing works.

  • 2020-12-14 12:54

    The page is rendered with JavaScript, which makes additional requests to fetch more data. You can fetch the complete page with selenium.

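    from bs4 import BeautifulSoup
    from selenium import webdriver

    url = "https://shop.nordstrom.com/c/womens-dresses-shop?origin=topnav&cm_sp=Top%20Navigation-_-Women-_-Dresses&offset=11&page=3&top=72"

    # Start a real Chrome browser so the page's JavaScript actually runs.
    driver = webdriver.Chrome()
    try:
        driver.get(url)
        # page_source is the HTML of the page as loaded in the browser.
        soup = BeautifulSoup(driver.page_source, 'html.parser')
    finally:
        # Close the browser even if loading or parsing fails.
        driver.quit()
    print(soup.prettify())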
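
    To see why requests alone comes up short, here is a minimal offline sketch. The RAW_HTML string, the /api/products endpoint, and the StaticSummary and summarize names are all made up for illustration and are not taken from the page in the question. The string stands in for what a server might send: the list is empty in the markup and would only be filled in by the script once a browser runs it.

```python
from html.parser import HTMLParser

# Hypothetical stand-in for a server response: the product list is empty in
# the HTML itself and would only be filled in by the script in a browser.
RAW_HTML = """
<html><body>
  <h1>Dresses</h1>
  <ul id="products"></ul>
  <script>
    fetch("/api/products").then(r => r.json()).then(items => {
      const list = document.getElementById("products");
      for (const item of items) {
        const li = document.createElement("li");
        li.textContent = item.name;
        list.appendChild(li);
      }
    });
  </script>
</body></html>
"""


class StaticSummary(HTMLParser):
    """Count <li> elements and <script> tags present in the raw markup."""

    def __init__(self):
        super().__init__()
        self.list_items = 0
        self.scripts = 0

    def handle_starttag(self, tag, attrs):
        if tag == "li":
            self.list_items += 1
        elif tag == "script":
            self.scripts += 1


def summarize(html):
    """Return (number of <li> tags, number of <script> tags) in the markup."""
    parser = StaticSummary()
    parser.feed(html)
    parser.close()
    return parser.list_items, parser.scripts


items, scripts = summarize(RAW_HTML)
print(f"list items in raw HTML: {items}, script tags: {scripts}")
```

    A parser that only reads the markup finds the script tag but no list items, because nothing has executed the script. A text fetched with requests is in the same position, which is why a browser driven by selenium is one way to get the rendered page.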
    

    For other solutions see my answer to Scraping Google Finance (BeautifulSoup)
