Scraping graph data from a website using Python

…衆ロ難τιáo~ submitted on 2019-12-22 01:34:42

Question


Is it possible to capture the graph data from a website? For example, the website here has a number of plots. Is it possible to capture these data using Python code?


Answer 1:


Looking at the page source of the link you provided, the chart data is available directly, in a JSON-like format, at this link: http://www.fbatoolkit.com/chart_data/1414978499.87

So your scraper might want to do something like this:

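import ast
import re
import requests

# Fetch the landing page and find the relative chart_data path in its HTML.
r = requests.get('http://www.fbatoolkit.com')
data_link = 'http://www.fbatoolkit.com/' + re.search(r'chart_data/[^"]*', r.text).group()
# Download the script and strip the JavaScript assignment around the literal.
data_string = requests.get(data_link).content.decode('utf-8')
literal = data_string.replace('window.chart_data =', '').replace(';\n', '').strip()
# literal_eval only parses literals, so it cannot run arbitrary code the way eval can.
chart_data = ast.literal_eval(literal)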
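The final parsing step can be tried without touching the network. The following is a minimal sketch, assuming the script body has the form `window.chart_data = <literal>;`. The helper name `parse_chart_js` and the sample payload are invented for illustration; the real file's structure may differ.

```python
import ast
import re

def parse_chart_js(js_text):
    """Strip a leading `window.chart_data =` and a trailing `;`, then parse the literal."""
    match = re.match(r'\s*window\.chart_data\s*=\s*(.*?);?\s*$', js_text, re.DOTALL)
    if match is None:
        raise ValueError('unexpected payload format')
    # literal_eval accepts only literals (lists, dicts, numbers, strings).
    return ast.literal_eval(match.group(1))

# Hypothetical payload; the shape of the real chart data is an assumption.
sample = 'window.chart_data = [{"label": "demo", "data": [[1, 2.5], [2, 3.0]]}];\n'
parsed = parse_chart_js(sample)
print(parsed[0]["data"])
```

If the payload turns out to be strict JSON (for example, it contains `true` or `null`), `json.loads` on the captured group would be the more appropriate parser.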

(Edit to explain my process for finding the link.) When I approach a problem like this, the first thing I do is view the page source (Ctrl+U in Chrome for Windows). I searched for anything related to drawing the charts until I found the following JavaScript:

  function make_containers(i){
        var chart = chart_data[i];

I then searched the source for the place where the variable chart_data is defined. I couldn't find the definition itself, but I did find the line

<script type="text/javascript" src="/chart_data/1414978499.87"></script>

Following this link (you can just click on it in the view-source page in Chrome), I could see that it was a one-line piece of JavaScript that defines this variable. (Notice that in the last line of my example code I had to make a small change to this file's contents to get it to evaluate in Python.)
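The manual search for the script tag can also be approximated in code. This is a rough sketch using the standard library's `html.parser` on a made-up HTML snippet. The class name `ScriptSrcCollector` and the `/static/jquery.js` entry are illustrative assumptions; only the `chart_data` script tag comes from the page described above.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class ScriptSrcCollector(HTMLParser):
    """Collect the src attribute of every <script> tag in a document."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == 'script':
            src = dict(attrs).get('src')
            if src:
                self.sources.append(src)

# Invented sample page; the first script tag is a placeholder.
sample_html = '''<html><head>
<script type="text/javascript" src="/static/jquery.js"></script>
<script type="text/javascript" src="/chart_data/1414978499.87"></script>
</head><body></body></html>'''

collector = ScriptSrcCollector()
collector.feed(sample_html)
# Keep only the scripts whose path mentions chart_data and make them absolute.
chart_links = [urljoin('http://www.fbatoolkit.com', s)
               for s in collector.sources if 'chart_data' in s]
print(chart_links)
```

Against the live site, `sample_html` would be replaced by the text of the fetched page.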



Source: https://stackoverflow.com/questions/30497537/scraping-graph-data-from-a-website-using-python
