How to work with data from NBA.com?

Posted by 流过昼夜 on 2019-12-06 11:18:16

Evidently the data structure has changed since Greg Reda wrote that post. Before exploring the data, I recommend that you save it to a file via pickling. That way you don't have to keep hitting the NBA server and waiting for a download each time you modify and rerun the script.

The following script checks for the existence of the pickled data to avoid unnecessary downloading:

import requests

url = 'http://stats.nba.com/stats/leaguedashteamshotlocations?Conference=&DateFr' + \
      'om=&DateTo=&DistanceRange=By+Zone&Division=&GameScope=&GameSegment=&LastN' + \
      'Games=0&LeagueID=00&Location=&MeasureType=Opponent&Month=0&OpponentTeamID' + \
      '=0&Outcome=&PORound=0&PaceAdjust=N&PerMode=PerGame&Period=0&PlayerExperien' + \
      'ce=&PlayerPosition=&PlusMinus=N&Rank=N&Season=2014-15&SeasonSegment=&Seas' + \
      'onType=Regular+Season&ShotClockRange=&StarterBench=&TeamID=0&VsConference=&VsDivision='
print(url)

import os, pickle
file_name = 'result_sets.pickled'

if os.path.isfile(file_name):
    # Reuse the cached copy instead of downloading again.
    with open(file_name, 'rb') as f:
        result_sets = pickle.load(f)
else:
    # First run: download the data, then cache it for later runs.
    response = requests.get(url)
    response.raise_for_status()
    result_sets = response.json()['resultSets']
    with open(file_name, 'wb') as f:
        pickle.dump(result_sets, f)

print(result_sets.keys())
print(result_sets['headers'][1])
print(result_sets['rowSet'][0])
print(len(result_sets['rowSet']))
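
If you end up querying several endpoints, the same load-or-download pattern can be wrapped in a small helper. What follows is only a sketch: load_or_fetch and its fetch argument are hypothetical names I made up, not part of requests or pickle, and the demonstration uses a stand-in function with dummy data instead of a real download.

```python
import os
import pickle
import tempfile


def load_or_fetch(cache_path, fetch):
    """Return cached data if cache_path exists; otherwise call fetch() and cache the result.

    `fetch` is any zero-argument callable, for example a function wrapping
    requests.get(url).json()['resultSets'].
    """
    if os.path.isfile(cache_path):
        with open(cache_path, 'rb') as f:
            return pickle.load(f)
    data = fetch()
    with open(cache_path, 'wb') as f:
        pickle.dump(data, f)
    return data


# Demonstration with a stand-in fetcher (dummy data, no network access).
calls = []


def fake_fetch():
    calls.append(1)
    return {'name': 'demo', 'rowSet': [[1, 2], [3, 4]]}


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'demo.pickled')
    first = load_or_fetch(path, fake_fetch)   # calls fake_fetch and writes the cache
    second = load_or_fetch(path, fake_fetch)  # reads the cache; fake_fetch is not called again
```

Passing the download step in as a callable keeps the caching logic separate from any particular URL. Delete the cache file whenever you want fresh data.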

Once you have result_sets in hand, you can examine the data. If you print it, you'll see that it's a dictionary. You can extract the dictionary keys:

print(result_sets.keys())

Currently the keys are 'headers', 'rowSet', and 'name'. You can inspect the headers:

print(result_sets['headers'])
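
Printing a nested structure in full can be hard to read. As an alternative, here is a rough sketch of a helper that summarizes only the types and sizes. The name describe and the sample dictionary are my own inventions for illustration; with the real data you would call describe(result_sets) instead.

```python
def describe(obj, depth=2, indent=0):
    """Return lines summarizing the type and size of nested dicts and lists."""
    pad = '  ' * indent
    if isinstance(obj, dict):
        lines = [f'{pad}dict with {len(obj)} keys']
        if depth > 0:
            for key, value in obj.items():
                lines.append(f'{pad}  {key!r}:')
                lines.extend(describe(value, depth - 1, indent + 2))
        return lines
    if isinstance(obj, list):
        lines = [f'{pad}list of {len(obj)} items']
        if depth > 0 and obj:
            # Only the first element is described; the rest are assumed similar.
            lines.extend(describe(obj[0], depth - 1, indent + 1))
        return lines
    return [f'{pad}{type(obj).__name__}']


# Hypothetical stand-in data, not the actual NBA response.
sample = {'name': 'demo', 'rowSet': [[1, 'x'], [2, 'y']]}
summary = describe(sample)
print('\n'.join(summary))
```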

I probably know less about these statistics than you do. However, by looking at the data, I've been able to figure out that result_sets['rowSet'] contains 30 rows of 23 elements each. The 23 columns are identified by result_sets['headers'][1]. Try this:

print(result_sets['headers'][1])

That will show you the 23 column names. Now take a look at the first row of team data:

print(result_sets['rowSet'][0])

Now you see the 23 values reported for the Atlanta Hawks. You can iterate over the rows in result_sets['rowSet'] to extract whatever values interest you and to compute aggregate information such as totals and averages.
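
Iterating over the rows could look something like the sketch below. It assumes you have already pulled a flat list of column names out of result_sets['headers'][1] and that each row has one value per column; I haven't verified the exact layout of that header entry. The function names, column names, and numbers are placeholders I made up, not real NBA statistics.

```python
def label_rows(column_names, rows):
    """Pair each row's values with the column names, giving one dict per row."""
    return [dict(zip(column_names, row)) for row in rows]


def column_average(records, column):
    """Average a numeric column across records, skipping missing (None) values."""
    values = [r[column] for r in records if r[column] is not None]
    return sum(values) / len(values) if values else None


# Placeholder data standing in for the column names and result_sets['rowSet'].
column_names = ['TEAM', 'MADE', 'ATTEMPTED']
rows = [['Team A', 10, 20], ['Team B', 14, 28], ['Team C', None, 30]]

records = label_rows(column_names, rows)
avg_made = column_average(records, 'MADE')               # averages 10 and 14
total_attempted = sum(r['ATTEMPTED'] for r in records)   # 20 + 28 + 30
print(records[0], avg_made, total_attempted)
```

Turning each row into a dictionary lets you refer to values by column name instead of by position, which is easier to keep straight with 23 columns.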
