HTML table to pandas table: Info inside html tags

轮回少年 2021-01-04 21:02

I have a large table from the web, accessed via requests and parsed with BeautifulSoup. Part of it looks something like this:


3 Answers
  •  借酒劲吻你
    2021-01-04 21:53

    You could simply parse the table manually like this:

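    from bs4 import BeautifulSoup
    import pandas as pd

    # Sample table: three cells per row, the second cell wraps a link
    TABLE = """
    <table>
    <tr><td>265</td><td><a href="/j/jones03.shtml">JonesBlue</a></td><td>29</td></tr>
    <tr><td>266</td><td><a href="/s/smith01.shtml">Smith</a></td><td>34</td></tr>
    </table>
    """

    table = BeautifulSoup(TABLE, "html.parser")
    records = []
    for tr in table.find_all("tr"):
        tds = tr.find_all("td")
        record = []
        record.append(tds[0].text)       # first cell: plain text
        record.append(tds[1].a["href"])  # second cell: the link target, not the link text
        record.append(tds[2].text)       # third cell: plain text
        records.append(record)

    df = pd.DataFrame(data=records)
    print(df)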

which gives you

     0                 1   2
0  265  /j/jones03.shtml  29
1  266  /s/smith01.shtml  34
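
If the real table has more columns, or links in more than one cell, the same idea can be generalized. The following is only a sketch under assumptions: the helper name `table_to_frame`, the `colN` / `colN_href` column names, and the sample markup are made up for illustration and do not come from the question. It keeps the text of every cell and adds an extra column for the `href` of any cell that contains a link.

```python
from bs4 import BeautifulSoup
import pandas as pd

# Sample markup, assumed for illustration only.
HTML = """
<table>
<tr><td>265</td><td><a href="/j/jones03.shtml">Jones</a></td><td>29</td></tr>
<tr><td>266</td><td><a href="/s/smith01.shtml">Smith</a></td><td>34</td></tr>
</table>
"""


def table_to_frame(html):
    """Hypothetical helper: one column per cell, plus one per link found."""
    soup = BeautifulSoup(html, "html.parser")
    records = []
    for tr in soup.find_all("tr"):
        record = {}
        for i, td in enumerate(tr.find_all("td")):
            # Always keep the visible text of the cell.
            record[f"col{i}"] = td.get_text(strip=True)
            # If the cell contains a link, keep its target as well.
            link = td.find("a", href=True)
            if link is not None:
                record[f"col{i}_href"] = link["href"]
        if record:  # skip rows without <td> cells, e.g. header rows
            records.append(record)
    return pd.DataFrame(records)


df = table_to_frame(HTML)
print(df)
```

Because each row is collected as a dict, pandas lines the columns up by name, so rows where a cell has no link simply get a missing value in the corresponding `_href` column.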
