Reading a portion of a large xlsx file with Python

野性不改 2020-12-06 21:40

I have a large .xlsx file with 1 million rows. I don't want to open the whole file in one go. I was wondering if I can read a chunk of the file, process it and then read th

2 Answers
  •  感情败类
    2020-12-06 22:38

    UPDATE: 2019-09-05

    The chunksize parameter has been deprecated because pd.read_excel() never actually used it: due to the nature of the XLSX file format, the whole file is read into memory during parsing.

    There are more details about that in this great SO answer...
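
    Since chunksize is not an option, one possible workaround (not part of the original answer) is to bypass pandas and stream rows with openpyxl's read-only mode. The following is a minimal sketch, assuming openpyxl is installed; the helper names iter_chunks and iter_xlsx_chunks and their parameters are hypothetical, chosen only for illustration. Note that openpyxl still loads the workbook's shared-strings table up front, so the memory savings depend on the file.

```python
from itertools import islice


def iter_chunks(rows, chunk_size):
    """Yield lists of at most `chunk_size` items from any row iterator."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, chunk_size))
        if not chunk:
            return
        yield chunk


def iter_xlsx_chunks(path, chunk_size=10**5, sheet_name=None):
    """Stream one worksheet as chunks of row tuples (hypothetical helper)."""
    # Imported lazily so iter_chunks above stays usable without openpyxl.
    from openpyxl import load_workbook

    # read_only=True asks openpyxl to read rows lazily instead of
    # building the whole worksheet in memory.
    wb = load_workbook(path, read_only=True)
    try:
        ws = wb[sheet_name] if sheet_name is not None else wb.active
        yield from iter_chunks(ws.iter_rows(values_only=True), chunk_size)
    finally:
        # Read-only workbooks keep the file handle open until closed.
        wb.close()
```

    Usage would look like `for chunk in iter_xlsx_chunks(filename): ...`, where each chunk is a list of row tuples. If a DataFrame is needed, a chunk could be passed to pd.DataFrame, treating the first row of the first chunk as the header.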


    OLD answer:

    You can use the read_excel() method:

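    import pandas as pd

    chunksize = 10**5
    # Kept as originally posted; per the UPDATE, read_excel() does not support chunksize.
    for chunk in pd.read_excel(filename, chunksize=chunksize):
        ...  # process `chunk` DF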
    

    If your Excel file has multiple sheets, take a look at bpachev's solution.
