Python scrape table from website?

问题

I'd like to scrape every treasury yield rate that is available on treasury.gov website.

https://www.treasury.gov/resource-center/data-chart-center/interest-rates/Pages/TextView.aspx?data=yieldAll

How would I go about taking this information? I'm assuming that I'd have to use BeautifulSoup or Selenium or something like that (preferably BS4). I'd eventually like to put this data in a Pandas DataFrame.

回答1:

Here's one way you can grab the data in a table using requests and beautifulsoup

import pandas as pd
import requests
from bs4 import BeautifulSoup

url = 'https://www.treasury.gov/resource-center/data-chart-center/interest-rates/Pages/TextView.aspx?data=yieldAll'

r = requests.get(url)
html = r.text

soup = BeautifulSoup(html)
table = soup.find('table', {"class": "t-chart"})
rows = table.find_all('tr')
data = []
for row in rows[1:]:
    cols = row.find_all('td')
    cols = [ele.text.strip() for ele in cols]
    data.append([ele for ele in cols if ele])

result = pd.DataFrame(data, columns=['Date', '1 Mo', '2 Mo', '3 Mo', '6 Mo', '1 Yr', '2 Yr', '3 Yr', '5 Yr', '7 Yr', '10 Yr', '20 Yr', '30 Yr'])

print(result)

来源：https://stackoverflow.com/questions/44704099/python-scrape-table-from-website

标签

python

pandas

selenium

dataframe

beautifulsoup

易学教程内所有资源均来自网络或用户发布的内容，如有违反法律规定的内容欢迎反馈！
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!