BeautifulSoup - only returning first table

执笔经年 2020-12-20 06:24

I've been working with BeautifulSoup lately. I'm trying to get the data from https://www.pro-football-reference.com/teams/mia/2000_roster.htm site. Specifically all I want

1 answer
  • 2020-12-20 06:53

    To get the content of that table you need a browser simulator, because that portion of the page is generated dynamically and is not part of the table markup in the plain response. The data in the first table is easily accessible without one, though. I used Selenium in this case:

    from bs4 import BeautifulSoup
    from selenium import webdriver
    
    driver = webdriver.Chrome()
    page_url = "https://www.pro-football-reference.com/teams/mia/2000_roster.htm"
    driver.get(page_url)
    soup = BeautifulSoup(driver.page_source, "lxml")
    # Index 1 picks the second table container on the rendered page
    table = soup.select(".table_outer_container")[1]
    for items in table.select("tr"):
        player = items.select("[data-stat='player']")[0].text
        gs = items.select("[data-stat='gs']")[0].text
        print(player,gs)
    
    driver.quit()
    

    Partial output:

    Player  GS
    Trace Armstrong* 0
    John Bock 1
    Tim Bowens 15
    Lorenzo Bromell 0
    Autry Denson 0
    Mark Dixon 15
    Kevin Donnalley 16
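
As a possible alternative to driving a browser: some sites ship their later tables inside HTML comments and uncomment them with JavaScript. Whether this page does so is an assumption you would need to check by viewing the raw source. If it does, a minimal sketch like the one below can pull tables out of comments with BeautifulSoup alone. `SAMPLE_HTML` and `tables_including_commented` are hypothetical names used for illustration, and the sample markup is a stand-in, not the real page.

```python
from bs4 import BeautifulSoup, Comment

# Hypothetical stand-in for a static response: the second table is
# wrapped in an HTML comment, so a plain parse only sees the first one.
SAMPLE_HTML = """
<div class="table_outer_container">
  <table id="first"><tr><td data-stat="player">A</td></tr></table>
</div>
<!--
<div class="table_outer_container">
  <table id="second">
    <tr><th data-stat="player">Player</th><th data-stat="gs">GS</th></tr>
    <tr><td data-stat="player">John Bock</td><td data-stat="gs">1</td></tr>
  </table>
</div>
-->
"""


def tables_including_commented(html):
    """Return all <table> tags, including those hidden inside HTML comments."""
    soup = BeautifulSoup(html, "html.parser")
    tables = list(soup.find_all("table"))
    # Comment nodes are strings; re-parse any that contain table markup.
    for comment in soup.find_all(string=lambda s: isinstance(s, Comment)):
        if "<table" in comment:
            inner = BeautifulSoup(comment, "html.parser")
            tables.extend(inner.find_all("table"))
    return tables


tables = tables_including_commented(SAMPLE_HTML)
table_ids = [t.get("id") for t in tables]
print(table_ids)
```

If the table you want is not inside a comment in the raw response, this approach will not find it and the Selenium route is the one to use.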
    

    If for some reason you run into an error with the code above (for example, an IndexError on a row that lacks one of these cells), this version avoids it by falling back to an empty string:

    from bs4 import BeautifulSoup
    from selenium import webdriver
    
    driver = webdriver.Chrome()
    page_url = "https://www.pro-football-reference.com/teams/mia/2000_roster.htm"
    driver.get(page_url)
    soup = BeautifulSoup(driver.page_source, "lxml")
    table = soup.select(".table_outer_container")[1]
    for items in table.select("tr"):
        # select_one returns None when the row has no such cell
        player_cell = items.select_one("[data-stat='player']")
        gs_cell = items.select_one("[data-stat='gs']")
        player = player_cell.text if player_cell is not None else ""
        gs = gs_cell.text if gs_cell is not None else ""
        print(player,gs)
    
    driver.quit()
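
To try the guarded row extraction without launching a browser, here is a small sketch that runs on inline HTML. The helper `cell_text` and the `SAMPLE_HTML` rows are hypothetical and only mimic the `data-stat` attributes used above; the middle row deliberately has no `gs` cell.

```python
from bs4 import BeautifulSoup


def cell_text(row, stat):
    """Return the text of the cell with the given data-stat, or "" if absent."""
    cell = row.select_one(f"[data-stat='{stat}']")
    return cell.get_text(strip=True) if cell is not None else ""


# Hypothetical sample rows; the second row is missing its 'gs' cell.
SAMPLE_HTML = """
<table>
  <tr><th data-stat="player">Player</th><th data-stat="gs">GS</th></tr>
  <tr><td data-stat="player">John Bock</td></tr>
  <tr><td data-stat="player">Tim Bowens</td><td data-stat="gs">15</td></tr>
</table>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
rows = [(cell_text(tr, "player"), cell_text(tr, "gs")) for tr in soup.select("tr")]
print(rows)
```

The same helper should work unchanged on the `soup` built from `driver.page_source`, since it only depends on the `data-stat` attributes.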
    