Hi, I'm going through Python for Data Analysis and I'd like to analyze the data he goes through in the book. In chapter 9, he uses the data below. However, I'm having a di
You should be able to just use the URL of the raw version (the page you linked has a button that leads to the raw version) and then read it into a DataFrame directly using read_csv:
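import pandas as pd
# URL of the raw file, not the GitHub HTML page
url = 'https://raw.githubusercontent.com/pydata/pydata-book/master/ch09/stock_px.csv'
# Use the first column as the index and parse it as dates
df = pd.read_csv(url, index_col=0, parse_dates=[0])
print(df.head(5))  # on Python 2, "print df.head(5)" also works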
            AAPL   MSFT    XOM     SPX
2003-01-02  7.40  21.11  29.22  909.03
2003-01-03  7.45  21.14  29.24  908.59
2003-01-06  7.45  21.52  29.96  929.01
2003-01-07  7.43  21.93  28.95  922.93
2003-01-08  7.28  21.31  28.83  909.93
Edit: a brief explanation about the options I used to read in the file:
df = pd.read_csv(url,index_col=0,parse_dates=[0])
The first column (column 0) of the file holds dates, and because it has no column name it looked like it was meant to be the index. index_col=0 makes that column the index, and parse_dates=[0] tells read_csv to parse column 0 (the first column) as dates.
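To see what the two options change without downloading anything, here is a minimal sketch that reads a small in-memory sample. The sample text is an assumption: it copies the first three rows of the output above and assumes the file's header starts with an empty field for the unnamed date column. The names sample, plain and indexed are only for illustration.

```python
import io
import pandas as pd

# Hypothetical in-memory stand-in for the CSV file. The empty first
# header field is an assumption about how the unnamed date column appears.
sample = io.StringIO(
    ",AAPL,MSFT,XOM,SPX\n"
    "2003-01-02,7.40,21.11,29.22,909.03\n"
    "2003-01-03,7.45,21.14,29.24,908.59\n"
    "2003-01-06,7.45,21.52,29.96,929.01\n"
)

# Without the options: the dates land in an ordinary column that
# pandas names "Unnamed: 0", and the index is a plain 0..n-1 range.
plain = pd.read_csv(sample)
print(plain.columns[0])

# Rewind the buffer so the same text can be read again.
sample.seek(0)

# With the options: column 0 becomes the index and is parsed as dates.
indexed = pd.read_csv(sample, index_col=0, parse_dates=[0])
print(type(indexed.index))

# A date-based index allows lookups by timestamp.
print(indexed.loc[pd.Timestamp("2003-01-03"), "AAPL"])
```

With a parsed date index you can then select rows by date instead of by position, which is most likely what you want for time series work like the examples in that chapter.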