Scraping data from a Wikipedia table

庸人自扰 2020-12-18 17:24

I'm trying to scrape data from a Wikipedia table into a pandas DataFrame.

I need to reproduce the three columns: "Postcode", "Borough", "Neighbourhood".



        
3 answers
  •  一个人的身影
    2020-12-18 18:05

    You may be overthinking the problem if you only want the script to pull one table from the page. One import, one line, no loops:

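    import pandas as pd

    url = 'https://en.wikipedia.org/wiki/List_of_postal_codes_of_Canada:_M'

    # read_html returns a list of DataFrames, one per <table> on the page;
    # header=0 uses the first row as column names, [0] takes the first table
    df = pd.read_html(url, header=0)[0]

    df.head()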
    
        Postcode  Borough           Neighbourhood
    0   M1A       Not assigned      Not assigned
    1   M2A       Not assigned      Not assigned
    2   M3A       North York        Parkwoods
    3   M4A       North York        Victoria Village
    4   M5A       Downtown Toronto  Harbourfront
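
Taking element [0] works only while the wanted table is the first one on the page. As a hedged sketch, the list that pd.read_html returns could instead be searched for the table that actually has the three columns from the question. The helper name pick_table and the WANTED constant are hypothetical, not part of pandas, and the small hand-built DataFrames below stand in for the scraped list so the example runs without network access:

```python
import pandas as pd

# Column names taken from the question; adjust if the page uses other headers.
WANTED = ["Postcode", "Borough", "Neighbourhood"]


def pick_table(tables, wanted=WANTED):
    """Hypothetical helper: return the wanted columns from the first
    DataFrame in `tables` that contains all of them."""
    for table in tables:
        if set(wanted).issubset(table.columns):
            return table[wanted]
    raise ValueError(f"no table with columns {wanted}")


# Stand-ins for the list of DataFrames that pd.read_html(url, header=0) returns.
tables = [
    pd.DataFrame({"Unrelated": [1, 2]}),
    pd.DataFrame({
        "Postcode": ["M3A", "M4A"],
        "Borough": ["North York", "North York"],
        "Neighbourhood": ["Parkwoods", "Victoria Village"],
        "Extra": ["x", "y"],
    }),
]

df = pick_table(tables)
print(df)
```

Against the live page, the same helper would be called as pick_table(pd.read_html(url, header=0)), assuming the page still uses those exact header names.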
    
