How to read data in chunks in Python dataframe?

Posted by 送分小仙女 on 2020-01-13 03:41:10

Question


I want to read the file f in chunks into a dataframe. Here is part of the code I used.

for i in range(0, maxline, chunksize):
    df = pandas.read_csv(f, sep=',', nrows=chunksize, skiprows=i)
    df.to_sql(member, engine, if_exists='append', index=False, index_label=None, chunksize=chunksize)

I get the error:

pandas.io.common.EmptyDataError: No columns to parse from file

The code works only when chunksize >= maxline (where maxline is the total number of lines in file f). In my case, however, chunksize <= maxline.

Please advise the fix.


Answer 1:


I think it is better to use the chunksize parameter of read_csv, which returns an iterator over the chunks. Then use concat with ignore_index=True so that the combined dataframe has no duplicate index values:

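import pandas as pd

chunksize = 5
# With chunksize set, read_csv returns an iterator of DataFrames
TextFileReader = pd.read_csv(f, chunksize=chunksize)

# Concatenate all chunks into a single DataFrame
df = pd.concat(TextFileReader, ignore_index=True)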

See pandas docs.
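
If the goal is to write each chunk to the database, as in the question, rather than to build one large dataframe, the chunks can be consumed one at a time. The following is only a minimal sketch under stated assumptions: the in-memory CSV, the SQLite connection and the table name "member" are hypothetical stand-ins for the question's f, engine and member.

```python
import io
import sqlite3

import pandas as pd

# Hypothetical stand-in for the file f: a header line plus 12 data rows.
csv_text = "id,value\n" + "\n".join(f"{i},{i * 10}" for i in range(12))
f = io.StringIO(csv_text)

# Assumption: an in-memory SQLite connection replaces the question's engine.
conn = sqlite3.connect(":memory:")
member = "member"  # hypothetical table name
chunksize = 5

rows_written = 0
# read_csv with chunksize yields DataFrames of at most chunksize rows;
# the header is parsed once, so every chunk has the same column names.
for chunk in pd.read_csv(f, sep=",", chunksize=chunksize):
    chunk.to_sql(member, conn, if_exists="append", index=False)
    rows_written += len(chunk)

# Count the rows that actually landed in the table.
rows_in_table = conn.execute(f"SELECT COUNT(*) FROM {member}").fetchone()[0]
```

This keeps only one chunk in memory at a time. It also avoids a pitfall of the loop in the question: an integer skiprows skips the header line along with the data lines, so later chunks would no longer see the original column names.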



Source: https://stackoverflow.com/questions/39384539/how-to-read-data-in-chunks-in-python-dataframe
