Can't read Excel files using openpyxl

Anonymous (unverified), submitted 2019-12-03 01:45:01

Question:

I have a set of Excel files whose last rows share the same layout. The last row holds private information about a client (first name, surname, phone), and each Excel file corresponds to one client. I need to build a single Excel file with the data for every client. I decided to automate this, so I turned to the openpyxl library. I wrote the following code, but it doesn't work correctly.

    import glob
    import os

    from openpyxl import Workbook, load_workbook

    path_kit = 'prize_input/kit'

    # Create the single output workbook
    prize_info = Workbook()
    prize_sheet = prize_info.active

    # Collect every client workbook in the input folder
    file_array_reciever = glob.glob(os.path.join(path_kit, '*.xlsx'))

    row_num = 1
    for f in file_array_reciever:
        f1 = load_workbook(filename=f)
        sheet = f1.active
        # Copy the last row, from column 3 through the last column
        for col_num in range(3, sheet.max_column + 1):
            prize_sheet.cell(row=row_num, column=col_num).value = \
                sheet.cell(row=sheet.max_row, column=col_num).value
        row_num += 1  # the next client goes on the next output row

    prize_info.save("Ex.xlsx")

I get this error:

    Traceback (most recent call last):
      File "/Users/zkid18/PycharmProjects/untitled/excel_test.py", line 43, in <module>
        f1 = load_workbook(filename=f)
      File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/openpyxl/reader/excel.py", line 183, in load_workbook
        wb.active = read_workbook_settings(archive.read(ARC_WORKBOOK)) or 0
      File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/zipfile.py", line 1229, in read
        with self.open(name, "r", pwd) as fp:
      File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/zipfile.py", line 1252, in open
        zinfo = self.getinfo(name)
      File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/zipfile.py", line 1196, in getinfo
        'There is no item named %r in the archive' % name)
    KeyError: "There is no item named 'xl/workbook.xml' in the archive"

It looks like a problem with reading the file.
I don't understand why it is looking for an item named 'xl/workbook.xml' in the archive.

Answer 1:

Depending on which version you are using, this could be a bug in openpyxl. For example, version 1.6.1 introduced a bug with exactly this behavior, and reverting to 1.5.8 fixed it. According to this openpyxl ticket, a fix was committed in early 2013, though the ticket doesn't say in which release it was delivered. I upgraded to 1.6.2 and the error went away.



Answer 2:

You can use the xlrd library.

This script converts the data in an Excel sheet into a list of dictionaries.

    import xlrd

    # on_demand=True loads each sheet only when it is first accessed
    workbook = xlrd.open_workbook('your_file.xlsx', on_demand=True)
    worksheet = workbook.sheet_by_index(0)

    # The first row holds the column names
    first_row = []
    for col in range(worksheet.ncols):
        first_row.append(worksheet.cell_value(0, col))

    # Transform the remaining rows into a list of dictionaries
    data = []
    for row in range(1, worksheet.nrows):
        elm = {}
        for col in range(worksheet.ncols):
            elm[first_row[col]] = worksheet.cell_value(row, col)
        data.append(elm)

    print(data)
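
The header-to-dictionary step does not depend on xlrd itself, which matters because xlrd 2.0 and later read only .xls files. As a minimal sketch, the hypothetical helper `rows_to_dicts` below accepts any iterable of rows, for example the tuples that openpyxl yields from `worksheet.iter_rows(values_only=True)`. The sample rows are made up for illustration.

```python
def rows_to_dicts(rows):
    """Treat the first row as column names and map every later row to a dict."""
    rows = iter(rows)
    try:
        header = next(rows)
    except StopIteration:
        return []  # empty sheet: nothing to convert
    return [dict(zip(header, row)) for row in rows]


# Made-up sample data standing in for the rows of a worksheet
sample = [
    ('name', 'surname', 'phone'),
    ('Ann', 'Lee', '555-0100'),
    ('Bob', 'Ray', '555-0101'),
]
print(rows_to_dicts(sample))
```

Keeping the conversion separate from the reading library means the same function can be reused whichever package ends up opening the file.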


Answer 3:

I guess your file was originally saved in .xls format. You can use

    try:
        f1 = load_workbook(filename=f)
    except Exception:
        print(f)

to find which file causes this error, then reopen it in Excel and save it as .xlsx.
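
A similar check can be done with the standard library alone. An .xlsx file is a ZIP archive, and the KeyError in the question is raised by `zipfile` when the archive has no `xl/workbook.xml` member. The sketch below is only a rough diagnostic under that assumption; `find_unreadable_xlsx` is a hypothetical helper name, and the folder path is the one from the question.

```python
import glob
import os
import zipfile


def find_unreadable_xlsx(folder):
    """Return (path, reason) pairs for .xlsx files that are unlikely to load."""
    problems = []
    for path in sorted(glob.glob(os.path.join(folder, '*.xlsx'))):
        if not zipfile.is_zipfile(path):
            # A real .xlsx is always a ZIP archive; an old .xls file is not.
            problems.append((path, 'not a ZIP archive (possibly a renamed .xls)'))
            continue
        with zipfile.ZipFile(path) as archive:
            if 'xl/workbook.xml' not in archive.namelist():
                problems.append((path, "no 'xl/workbook.xml' in the archive"))
    return problems


# Folder name taken from the question; prints nothing if it has no bad files.
for path, reason in find_unreadable_xlsx('prize_input/kit'):
    print(path, '->', reason)
```

Any file this reports can then be reopened in Excel and saved again as .xlsx, as suggested above.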



Answer 4:

I found this post while searching for a solution to a similar issue ("There is no item named '[Content_Types].xml' in the archive").

The error message didn't make any sense in terms of my script or the file. My script adds one sheet to an existing Excel document and updates five more. While it was running, I realized I had an error in my code, so I canceled it mid-run.

After I canceled it, the existing Excel file started producing this error. Perhaps you corrupted your Excel file while working out bugs in your script?

To guard against this, I'm thinking of having the script create a temporary restore copy of the file in case an error occurs while using openpyxl.
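
One way the restore-file idea could look, as a minimal sketch using only the standard library: copy the workbook before touching it, and put the copy back if the update fails or is interrupted. The name `run_with_backup`, the `.bak` suffix, and the `modify` callback are all assumptions made for illustration, not part of openpyxl.

```python
import os
import shutil


def run_with_backup(path, modify):
    """Call modify(path); if it raises, restore the original file."""
    backup = path + '.bak'
    shutil.copy2(path, backup)  # snapshot before the workbook is touched
    try:
        modify(path)
    except BaseException:
        # BaseException also covers Ctrl+C (KeyboardInterrupt).
        os.replace(backup, path)  # overwrite the damaged file with the copy
        raise
    else:
        os.remove(backup)  # update succeeded, the copy is no longer needed
```

Here `modify` would be a function that loads the workbook, changes it, and saves it back to `path`. A hard kill of the process cannot be caught this way, but in that case the `.bak` copy is still left on disk for a manual restore.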



Answer 5:

I had the same issue. Make sure the file you're trying to read isn't already open in Excel.
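
A related pitfall, offered as an assumption rather than a confirmed cause: while a workbook is open, Excel usually keeps a small owner file next to it whose name starts with `~$`. A `*.xlsx` glob like the one in the question would pick that file up as well, and it is not a real workbook. The sketch below filters such names out; `list_workbooks` is a hypothetical helper.

```python
import glob
import os


def list_workbooks(folder):
    """Return the .xlsx paths in folder, skipping Excel's '~$' owner files."""
    paths = glob.glob(os.path.join(folder, '*.xlsx'))
    return sorted(
        p for p in paths
        if not os.path.basename(p).startswith('~$')
    )
```

The result could replace the plain `glob.glob(...)` call when collecting the input files.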



Answer 6:

If openpyxl still doesn't work, pandas can read the file instead.

$ pip install pandas xlrd

And this code works:

    import pandas as pd

    file_path = 'your_file.xlsx'  # placeholder: path to the workbook to read
    df = pd.read_excel(file_path)

