Question
I have a folder ('rogaikopyta') with 1,000 files in xlsx format. I need to extract the data from each of these files (cells B2 and D2) and write them, in order, to a single xlsx file. The original code is:
import openpyxl
import os
import pathlib
from openpyxl import load_workbook, Workbook
path = 'C:/Users/User/Documents/Visual Studio 2017/DjangoWebProject1/DjangoWebProject1/app/rogaikopyta'
for file in os.listdir(path):
    wb=load_workbook(os.path.join(path,file), read_only=True)
    ws=wb.active
    wb2 = Workbook(write_only=True)
    ws2 = wb2.create_sheet()
    for row in ws.iter_rows(min_col=2, max_col=4, min_row=2, max_row=2, values_only=True):
        ws2.append([row[0], row[-1]])
wb2.save("output.xlsx")
It gives a lot of errors, I suppose one for each of the xlsx files.
Like this:
Exception ignored in:
Traceback (most recent call last):
  File "C:\Users\User\PycharmProjects\untitled3\venv\lib\site-packages\openpyxl\worksheet_write_only.py", line 74, in _write_rows
    pass
  File "C:\Users\User\AppData\Local\Programs\Python\Python37-32\lib\contextlib.py", line 119, in __exit__
    next(self.gen)
  File "C:\Users\User\PycharmProjects\untitled3\venv\lib\site-packages\et_xmlfile\xmlfile.py", line 50, in element
    self._write_element(el)
  File "C:\Users\User\PycharmProjects\untitled3\venv\lib\site-packages\et_xmlfile\xmlfile.py", line 50, in element
    self._write_element(el)
  File "C:\Users\User\PycharmProjects\untitled3\venv\lib\site-packages\et_xmlfile\xmlfile.py", line 78, in _write_element
    self._file.write(xml)
ValueError: write to closed file
And it bothers me a lot! What is going on with this code?
Answer 1:
You should probably read up on the os module and the pathlib module, and check that your code is suitable for the more recent versions of openpyxl.
os.listdir(path) returns a list of the files in a folder, so the following should see you on your way:
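import os
from openpyxl import load_workbook, Workbook

# path is the folder with the source .xlsx files, as defined in the question
target_wb = Workbook()
target_ws = target_wb.active
for file in os.listdir(path):
    wb = load_workbook(os.path.join(path, file), read_only=True)
    ws = wb.active
    # Row 2, columns B to D: keep the first value (B2) and the last (D2)
    for row in ws.iter_rows(min_col=2, max_col=4, min_row=2, max_row=2, values_only=True):
        target_ws.append([row[0], row[-1]])
# Save once, after every file has been read
target_wb.save("output.xlsx")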
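For the pathlib route mentioned above, something along these lines might work. It is only a sketch under a few assumptions: the helper names list_xlsx_files and collect_b2_d2 are made up for illustration, "ordered" is taken to mean sorted by file name, and files whose names start with ~$ (the lock files Excel leaves next to an open workbook) are skipped.

```python
from pathlib import Path


def list_xlsx_files(folder):
    """Return the .xlsx files in folder, sorted by name (hypothetical helper)."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file()
        and p.suffix.lower() == ".xlsx"
        and not p.name.startswith("~$")  # assumption: skip Excel lock files
    )


def collect_b2_d2(folder, output="output.xlsx"):
    """Write B2 and D2 of every workbook in folder as one row each of output."""
    # Imported here so list_xlsx_files stays usable without openpyxl installed.
    from openpyxl import Workbook, load_workbook

    target_wb = Workbook()        # created once, before the loop
    target_ws = target_wb.active
    for xlsx in list_xlsx_files(folder):
        wb = load_workbook(str(xlsx), read_only=True)
        ws = wb.active
        for row in ws.iter_rows(min_col=2, max_col=4, min_row=2, max_row=2,
                                values_only=True):
            target_ws.append([row[0], row[-1]])
        wb.close()                # read-only workbooks keep the file open
    target_wb.save(output)        # saved once, after the loop
```

Calling collect_b2_d2(path) with the folder from the question would be the intended use. It is probably wise to save the output outside that folder, otherwise a second run would pick up output.xlsx as one of the inputs.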
Answer 2:
Your code now looks like:
import openpyxl
import os
import pathlib
from openpyxl import load_workbook, Workbook
path = 'C:/Users/User/Documents/Visual Studio 2017/DjangoWebProject1/DjangoWebProject1/app/rogaikopyta'
for file in os.listdir(path):
    wb=load_workbook(os.path.join(path,file), read_only=True)
    ws=wb.active
    wb2 = Workbook(write_only=True)
    ws2 = wb2.create_sheet()
    for row in ws.iter_rows(min_col=2, max_col=4, min_row=2, max_row=2, values_only=True):
        ws2.append([row[0],row[-1]])
wb2.save("output.xlsx")
But it still gives an access error, referring to the 1000 files:
Exception ignored in:
Traceback (most recent call last):
  File "C:\Users\User\PycharmProjects\untitled3\venv\lib\site-packages\openpyxl\worksheet_write_only.py", line 74, in _write_rows
    pass
  File "C:\Users\User\AppData\Local\Programs\Python\Python37-32\lib\contextlib.py", line 119, in __exit__
    next(self.gen)
  File "C:\Users\User\PycharmProjects\untitled3\venv\lib\site-packages\et_xmlfile\xmlfile.py", line 50, in element
    self._write_element(el)
  File "C:\Users\User\PycharmProjects\untitled3\venv\lib\site-packages\et_xmlfile\xmlfile.py", line 78, in _write_element
    self._file.write(xml)
ValueError: write to closed file
Exception ignored in:
Traceback (most recent call last):
  File "C:\Users\User\PycharmProjects\untitled3\venv\lib\site-packages\openpyxl\worksheet_write_only.py", line 74, in _write_rows
    pass
If I put:
for row in ws.iter_rows(min_col=2, max_col=4, min_row=2, max_row=2, values_only=True):
    ws2.append([row[0],row[-1]])
wb2.save("output.xlsx")
it gives the result for only one cycle, i.e. for only one xlsx file, but I need it for all 1000 files in the folder. Can you please suggest what is going on?
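One shape that fits the goal of a single output row per input file is to create the output workbook once, append to it inside the loop, and save it once at the end. The sketch below shows that shape with a write-only workbook. According to the openpyxl documentation a write-only workbook can be saved only once, so the save has to come after the loop. The function name write_rows_once is hypothetical, and rows stands in for the (B2, D2) pairs read from the source files.

```python
def write_rows_once(rows, output="output.xlsx"):
    """Append every row to a single write-only workbook, then save it once."""
    # Local import so the sketch is self-contained; openpyxl must be installed.
    from openpyxl import Workbook

    wb2 = Workbook(write_only=True)  # one workbook for the whole run
    ws2 = wb2.create_sheet()         # write-only workbooks start with no sheet
    count = 0
    for row in rows:
        ws2.append(list(row))        # one output row per input row
        count += 1
    wb2.save(output)                 # a write-only workbook can be saved only once
    return count
```

For example, write_rows_once([("b2 of file 1", "d2 of file 1"), ("b2 of file 2", "d2 of file 2")]) would produce an output.xlsx with two rows.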
Source: https://stackoverflow.com/questions/62284979/to-process-a-folder-with-xlsx-files-with-openpyxl-module