I have a large table from the web, accessed via requests and parsed with BeautifulSoup. Part of it looks something like this:
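You could use a regular expression to rewrite the HTML first, replacing each anchor tag with its href followed by its link text:
import re
from io import StringIO
import pandas as pd
tbl = """
<table>
<tr><td>265</td>
<td><a href="/j/jones03.shtml">Jones</a>Blue</td>
<td>29</td></tr>
<tr><td>266</td>
<td><a href="/s/smith01.shtml">Smith</a></td>
<td>34</td></tr>
</table>
"""
# Replace <a href="URL">text</a> with "URL text" so the href survives parsing
tbl = re.sub(r'<a.*?href="(.*?)">(.*?)</a>', r'\1 \2', tbl)
# Newer pandas versions expect a file-like object rather than a literal string
pd.read_html(StringIO(tbl))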
which gives you
[     0                           1   2
0  265  /j/jones03.shtml JonesBlue  29
1  266      /s/smith01.shtml Smith  34]
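If the real page has anchors with extra attributes or single-quoted hrefs, the substitution can be wrapped in a small helper. This is only a sketch: the names `ANCHOR_RE` and `inline_hrefs` are made up for illustration, and it assumes every href is quoted and that the anchors contain no nested anchors.

```python
import re

# Matches an anchor tag and captures its href value and its inner text.
# Assumption: the href is wrapped in single or double quotes.
ANCHOR_RE = re.compile(
    r'<a\b[^>]*?\bhref=(["\'])(?P<href>.*?)\1[^>]*>(?P<text>.*?)</a>',
    re.IGNORECASE | re.DOTALL,
)


def inline_hrefs(html: str, sep: str = " ") -> str:
    """Replace each <a href="URL">text</a> with 'URL<sep>text'."""
    # A function replacement avoids backslash handling in the replacement string.
    return ANCHOR_RE.sub(
        lambda m: f"{m.group('href')}{sep}{m.group('text')}", html
    )


# Hypothetical cell with an extra attribute and a single-quoted href.
sample = "<td><a class=\"x\" href='/j/jones03.shtml'>Jones</a>Blue</td>"
print(inline_hrefs(sample))
```

The returned string can then be passed to `pd.read_html` as in the answer. If your pandas is 1.5 or newer, `pd.read_html(..., extract_links="body")` may be a simpler route, since it returns each body cell as a `(text, href)` tuple without any regex preprocessing.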