How to delete rows that satisfy some criteria in an excel spreadsheet?

早过忘川 submitted on 2020-01-12 13:53:31

Question


I would like to create a "reduced" version of an Excel (xlsx) spreadsheet (i.e. by removing some rows according to some criterion), and I'd like to know if this can be done with openpyxl.

In (pythonish) pseudo-code, what I want to do would look something like:

wb = openpyxl.reader.excel.load_workbook('/path/to/workbook.xlsx')
sh = wb.get_sheet_by_name('someworksheet')

# weed out the rows of sh according to somecriterion
sh.rows[:] = [r for r in sh.rows if somecriterion(r)]

# save the workbook, with the weeded-out sheet
wb.save('/path/to/workbook_reduced.xlsx')

Can something like this be done with openpyxl, and if so, how?


Answer 1:


2018 update: I was searching for how to delete a row today and found that this functionality was added in openpyxl 2.5.0-b2. I just tried it and it worked perfectly. Here's the link where I found the answer: https://bitbucket.org/openpyxl/openpyxl/issues/964/delete_rows-does-not-work-on-deleting

And here's the syntax to delete one row:

ws.delete_rows(index, 1)

where: 'ws' is the worksheet, 'index' is the row number, and '1' is the number of rows to delete.

There's also the ability to delete columns, but I haven't tried that.
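To tie this back to the original question, here is a minimal sketch of deleting every row that matches a criterion with delete_rows. The helper name delete_matching_rows, the should_delete callback, and the sample data are made up for illustration, and the sketch assumes openpyxl 2.5 or later. It walks from the last row upward so that removing a row does not shift the rows that still have to be checked.

```python
from openpyxl import Workbook


def delete_matching_rows(ws, should_delete):
    """Delete each row of ws for which should_delete(values) is true.

    Hypothetical helper; returns the number of rows deleted.
    """
    deleted = 0
    # Go bottom-up: deleting row N only moves the rows below N,
    # so the rows not yet examined keep their indices.
    for idx in range(ws.max_row, 0, -1):
        values = [cell.value for cell in ws[idx]]
        if should_delete(values):
            ws.delete_rows(idx, 1)
            deleted += 1
    return deleted


# Small in-memory workbook used only for this example.
wb = Workbook()
ws = wb.active
for row in [("name", "score"), ("a", 10), ("b", -1), ("c", 7), ("d", -5)]:
    ws.append(row)

# Example criterion: drop rows whose second column is a negative number.
removed = delete_matching_rows(
    ws, lambda v: isinstance(v[1], (int, float)) and v[1] < 0
)
remaining = [tuple(cell.value for cell in row) for row in ws.iter_rows()]
```

With a real file you would load it with openpyxl.load_workbook(...) and call wb.save(...) under a new name afterwards, as in the question.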




Answer 2:


Internally, openpyxl does not seem to have a concept of 'rows'. It works with cells and keeps track of the sheet's dimensions; when you use Worksheet.rows, it computes a 2D array of cells from those. You can mutate that array, but doing so does not change the Worksheet.
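A quick way to see this for yourself, as a rough sketch: the sample data is made up, and list() is used so the snippet does not depend on whether your openpyxl version returns a tuple or a generator from Worksheet.rows.

```python
from openpyxl import Workbook

# Small in-memory workbook used only for this example.
wb = Workbook()
ws = wb.active
for row in [("a", 1), ("b", 2), ("c", 3)]:
    ws.append(row)

rows = list(ws.rows)  # a snapshot built from the worksheet's cells
del rows[1]           # removes an entry from the local list only

rows_in_list = len(rows)
rows_in_sheet = ws.max_row  # the worksheet still has every row
```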

If you want to do this within the Worksheet, you need to copy the values from their old positions to their new positions, set the value of the cells that are no longer needed to '' or None, and then call Worksheet.garbage_collect().

If your dataset is small and of a uniform nature (e.g. all strings), you might be better off copying the relevant cells (their content) to a new worksheet, removing the old one, and setting the title of the new one to the title of the one you just deleted.
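The copy-to-a-new-worksheet approach could look roughly like the sketch below. The helper name copy_kept_rows, the keep callback, and the "_tmp" suffix are made up for illustration, and a reasonably recent openpyxl (with Workbook.remove) is assumed. Only cell values are copied; styles, merged cells, and column widths are not carried over.

```python
from openpyxl import Workbook


def copy_kept_rows(wb, ws, keep):
    """Replace ws with a new sheet holding only rows where keep(values) is true.

    Hypothetical helper; copies values only, not formatting.
    """
    title = ws.title
    position = wb.index(ws)
    # Insert the new sheet at the old sheet's position so the tab order
    # is unchanged once the old sheet has been removed.
    new_ws = wb.create_sheet(title + "_tmp", position)
    for row in ws.iter_rows():
        values = [cell.value for cell in row]
        if keep(values):
            new_ws.append(values)
    wb.remove(ws)
    new_ws.title = title  # the old title is free again after the removal
    return new_ws


# Small in-memory workbook used only for this example.
wb = Workbook()
ws = wb.active
ws.title = "data"
for row in [("keep", "x"), ("drop", "y"), ("keep", "z")]:
    ws.append(row)

new_ws = copy_kept_rows(wb, ws, lambda v: v[0] == "keep")
kept = [tuple(cell.value for cell in row) for row in new_ws.iter_rows()]
```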

The most elegant thing to do, IMHO, would be to extend Worksheet or a subclass with a delete_rows method. I would implement such a method by changing the coordinates of its Cells in place. But this could break if openpyxl internals change.



Source: https://stackoverflow.com/questions/14904977/how-to-delete-rows-that-satisfy-some-criteria-in-an-excel-spreadsheet
