How to read only a chunk of a CSV file fast?

Submitted by 柔情痞子 on 2021-02-10 15:13:05

Question


I'm using this answer on how to read only a chunk of a CSV file with pandas.

The suggestion to use pd.read_csv('./input/test.csv', iterator=True, chunksize=1000) works well, but it returns a <class 'pandas.io.parsers.TextFileReader'>. I convert that to a DataFrame with pd.concat(pd.read_csv('./input/test.csv', iterator=True, chunksize=25)), but that takes as much time as reading the whole file in the first place!

Any suggestions on how to read only a chunk of the file fast?


Answer 1:


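pd.read_csv('./input/test.csv', iterator=True, chunksize=1000) returns an iterator that yields one DataFrame per chunk. Passing it to pd.concat consumes every chunk, which is why that takes as long as reading the whole file. To get a single chunk, call the built-in next function on the iterator: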

import pandas as pd

reader = pd.read_csv('./input/test.csv', iterator=True, chunksize=1000)

# Parses only the next 1000 rows and returns them as a DataFrame
next(reader)
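As a minimal sketch of the idea above, the following reads just the first chunk and stops. The in-memory CSV, its column names `a` and `b`, and the variable names are assumptions made so the example runs on its own; in practice the path from the question would replace the `io.StringIO` object. It also shows `nrows`, a `read_csv` parameter that may be simpler when only the first N rows are ever needed.

```python
import io

import pandas as pd

# Hypothetical in-memory CSV standing in for './input/test.csv'.
csv_text = "a,b\n" + "\n".join(f"{i},{i * 2}" for i in range(10_000))

# With chunksize set, read_csv returns a lazy reader instead of a DataFrame.
reader = pd.read_csv(io.StringIO(csv_text), chunksize=1000)

# next() parses only the first 1000 rows and returns a regular DataFrame.
first_chunk = next(reader)
reader.close()  # stop here; the remaining rows are never parsed

# Alternative when only the first N rows are needed: ask for them directly.
first_rows = pd.read_csv(io.StringIO(csv_text), nrows=1000)

print(type(first_chunk).__name__, first_chunk.shape, first_rows.shape)
```

Both approaches avoid pd.concat over the full reader, which is the step that forces the whole file to be parsed.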

The reader is often used in a for loop to process one chunk at a time:

for df in pd.read_csv('./input/test.csv', iterator=True, chunksize=1000):
    pass  # process each 1000-row DataFrame here
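To make the loop above concrete, here is a small sketch that aggregates a column chunk by chunk, so only one chunk is held in memory at a time. The in-memory CSV and the `value` column are assumptions for illustration, not part of the original question.

```python
import io

import pandas as pd

# Hypothetical in-memory CSV with a single numeric column named 'value'.
csv_text = "value\n" + "\n".join(str(i) for i in range(5000))

total = 0
rows_seen = 0

# Each iteration yields a DataFrame of up to 1000 rows.
for df in pd.read_csv(io.StringIO(csv_text), chunksize=1000):
    total += int(df["value"].sum())
    rows_seen += len(df)

print(rows_seen, total)
```

This still reads the whole file, so it is not faster than a single read_csv call; the benefit is lower peak memory. Adding a `break` inside the loop would stop parsing early once enough rows have been processed.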


Source: https://stackoverflow.com/questions/50473327/how-to-read-only-a-chunk-of-csv-file-fast
