Pandas 0.22.0: IndexError: list index out of range when reading xls

Submitted by て烟熏妆下的殇ゞ on 2019-12-11 01:47:08

Question


I'm trying to load a 282 MB (65536 rows x 138 columns) .xls file into a pandas DataFrame:

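import os

import pandas as pd

filename = r'invoicing.xls'
# Resolve the workbook to an absolute path (relative to the current working directory).
# The variable is named base_dir to avoid shadowing the built-in dir().
base_dir = os.path.dirname(os.path.abspath(filename))
excelFile = os.path.join(base_dir, filename)
invoicing_info = pd.read_excel(excelFile)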

But I'm getting this error:

Traceback (most recent call last):
  File "/Users/juanda/conda_envs/Hyperion_contracts_env/lib/python3.6/site-packages/xlrd/sheet.py", line 698, in put_cell_unragged
    self._cell_types[rowx][colx] = ctype
IndexError: list index out of range

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/juanda/PycharmProjects/Hyperion_info/load_info.py", line 11, in <module>
    invoicing_info = pd.read_excel(excelFile, sheet_name=0)
  File "/Users/juanda/conda_envs/Hyperion_contracts_env/lib/python3.6/site-packages/pandas/util/_decorators.py", line 118, in wrapper
    return func(*args, **kwargs)
  File "/Users/juanda/conda_envs/Hyperion_contracts_env/lib/python3.6/site-packages/pandas/io/excel.py", line 230, in read_excel
    io = ExcelFile(io, engine=engine)
  File "/Users/juanda/conda_envs/Hyperion_contracts_env/lib/python3.6/site-packages/pandas/io/excel.py", line 294, in __init__
    self.book = xlrd.open_workbook(self._io)
  File "/Users/juanda/conda_envs/Hyperion_contracts_env/lib/python3.6/site-packages/xlrd/__init__.py", line 162, in open_workbook
    ragged_rows=ragged_rows,
  File "/Users/juanda/conda_envs/Hyperion_contracts_env/lib/python3.6/site-packages/xlrd/book.py", line 107, in open_workbook_xls
    bk.fake_globals_get_sheet()
  File "/Users/juanda/conda_envs/Hyperion_contracts_env/lib/python3.6/site-packages/xlrd/book.py", line 728, in fake_globals_get_sheet
    self.get_sheets()
  File "/Users/juanda/conda_envs/Hyperion_contracts_env/lib/python3.6/site-packages/xlrd/book.py", line 719, in get_sheets
    self.get_sheet(sheetno)
  File "/Users/juanda/conda_envs/Hyperion_contracts_env/lib/python3.6/site-packages/xlrd/book.py", line 710, in get_sheet
    sh.read(self)
  File "/Users/juanda/conda_envs/Hyperion_contracts_env/lib/python3.6/site-packages/xlrd/sheet.py", line 1361, in read
    self_put_cell(rowx, colx, None, d, self.fixed_BIFF2_xfindex(cell_attr, rowx, colx))
  File "/Users/juanda/conda_envs/Hyperion_contracts_env/lib/python3.6/site-packages/xlrd/sheet.py", line 709, in put_cell_unragged
    assert 1 <= nr <= self.utter_max_rows
AssertionError

I think this is a problem with the .xls format, but I can't modify the file before loading it. How can I load this .xls file reliably?


Answer 1:


I was having the same issue. After I copied and pasted "values only" into a new sheet and moved the sheet around (changed the sheet order), it started working. This is annoying.
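
If re-saving the workbook by hand is not an option, it may help to first check what the file actually contains, since the question suspects the .xls extension itself. The sketch below is a hypothetical helper (sniff_spreadsheet_format is not part of pandas or xlrd) that looks only at the first bytes of the file. It assumes the usual signatures: an OLE2 compound file for ordinary .xls workbooks, a zip container for .xlsx, and a bare BOF record for old BIFF2/3/4 streams. It is a rough guess, not a validator.

```python
# Hypothetical helper (not part of pandas or xlrd): guess the real on-disk
# format of a file from its leading bytes, regardless of its extension.

OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # compound file: ordinary .xls
ZIP_MAGIC = b"PK\x03\x04"                       # zip container: .xlsx and similar
# BOF record ids (little-endian) of bare BIFF streams without an OLE2 wrapper
RAW_BIFF_BOF = {
    b"\x09\x00": "raw BIFF2 stream",
    b"\x09\x02": "raw BIFF3 stream",
    b"\x09\x04": "raw BIFF4 stream",
}


def sniff_spreadsheet_format(path):
    """Return a rough label for the format of the file at `path`."""
    with open(path, "rb") as fh:
        head = fh.read(8)
    if head.startswith(OLE2_MAGIC):
        return "OLE2 compound file"
    if head.startswith(ZIP_MAGIC):
        return "zip container"
    if head[:2] in RAW_BIFF_BOF:
        return RAW_BIFF_BOF[head[:2]]
    if head.lstrip().startswith(b"<"):
        return "markup (HTML or XML)"
    return "unknown (possibly delimited text)"


if __name__ == "__main__":
    import sys

    # Usage: python sniff.py invoicing.xls
    for name in sys.argv[1:]:
        print(name, "->", sniff_spreadsheet_format(name))
```

A "raw BIFF" result would be consistent with the fake_globals_get_sheet and fixed_BIFF2_xfindex frames in the traceback, which suggest xlrd is treating the file as a very old Excel format. That is an inference from the traceback, not something confirmed for this file. If the result is markup or delimited text, pd.read_html or pd.read_csv may be a better fit than pd.read_excel.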



Source: https://stackoverflow.com/questions/48669253/pandas-0-22-0-indexerror-list-index-out-of-range-when-reading-xls
