How to read CSV file from GitHub using pandas

一世执手 提交于 2020-12-05 05:01:45

问题


Im trying to read CSV file thats on github with Python using pandas> i have looked all over the web, and I tried some solution that I found on this website, but they do not work. What am I doing wrong?

I have tried this:

import pandas as pd

url = 'https://github.com/lukes/ISO-3166-Countries-with-Regional-Codes/blob/master/all/all.csv'
df = pd.read_csv(url,index_col=0)
#df = pd.read_csv(url)

print(df.head(5))

回答1:


You should provide URL to raw content. Try using this:

import pandas as pd

url = 'https://raw.githubusercontent.com/lukes/ISO-3166-Countries-with-Regional-Codes/master/all/all.csv'
df = pd.read_csv(url, index_col=0)
print(df.head(5))

Output:

               alpha-2           ...            intermediate-region-code
name                             ...                                    
Afghanistan         AF           ...                                 NaN
Åland Islands       AX           ...                                 NaN
Albania             AL           ...                                 NaN
Algeria             DZ           ...                                 NaN
American Samoa      AS           ...                                 NaN



回答2:


Add ?raw=true at the end of the GitHub URL to get the raw file link.

In your case,

import pandas as pd
url = 'https://github.com/lukes/ISO-3166-Countries-with-Regional-Codes/blob/master/all/all.csv?raw=true'
df = pd.read_csv(url,index_col=0)
#df = pd.read_csv(url)

print(df.head(5))

Note: This works only with GitHub links and not with GitLab or Bitbucket links.




回答3:


I recommend to either use pandas as you tried to and others here have explained, or depending on the application, the python csv-handler CommaSeperatedPython, which is a minimalistic wrapper for the native csv-library.

The library returns the contents of a file as a 2-Dimensional String-Array. It's is in its very early stage though, so if you want to do large scale data-analysis, I would suggest Pandas.




回答4:


You can copy/paste the url and change 2 things:

  1. Remove "blob"
  2. Replace github.com by raw.githubusercontent.com

For instance this link:

https://github.com/mwaskom/seaborn-data/blob/master/iris.csv

Works this way:

import pandas as pd

pd.read_csv('https://raw.githubusercontent.com/mwaskom/seaborn-data/master/iris.csv')


来源:https://stackoverflow.com/questions/55240330/how-to-read-csv-file-from-github-using-pandas

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!