Read merged cells in Excel with Python

匿名 (未验证) 提交于 2019-12-03 03:10:03

问题:

I am trying to read merged cells of Excel with Python using xlrd.

My Excel: (note that the first column is merged across the three rows)

    A   B   C   +---+---+----+ 1 | 2 | 0 | 30 |   +   +---+----+ 2 |   | 1 | 20 |   +   +---+----+ 3 |   | 5 | 52 |   +---+---+----+ 

I would like to read the third line of the first column as equal to 2 in this example, but it returns ''. Do you have any idea how to get to the value of the merged cell?

My code:

all_data = [[]] excel = xlrd.open_workbook(excel_dir+ excel_file) sheet_0 = excel.sheet_by_index(0) # Open the first tab  for row_index in range(sheet_0.nrows):     row= ""     for col_index in range(sheet_0.ncols):         value = sheet_0.cell(rowx=row_index,colx=col_index).value                      row += "{0} ".format(value)         split_row = row.split()        all_data.append(split_row) 

What I get:

'2', '0', '30' '1', '20' '5', '52' 

What I would like to get:

'2', '0', '30' '2', '1', '20' '2', '5', '52' 

回答1:

I just tried this and it seems to work for your sample data:

all_data = [] excel = xlrd.open_workbook(excel_dir+ excel_file) sheet_0 = excel.sheet_by_index(0) # Open the first tab  prev_row = [None for i in range(sheet_0.ncols)] for row_index in range(sheet_0.nrows):     row= []     for col_index in range(sheet_0.ncols):         value = sheet_0.cell(rowx=row_index,colx=col_index).value         if len(value) == 0:             value = prev_row[col_index]         row.append(value)     prev_row = row     all_data.append(row) 

returning

[['2', '0', '30'], ['2', '1', '20'], ['2', '5', '52']] 

It keeps track of the values from the previous row and uses them if the corresponding value from the current row is empty.

Note that the above code does not check if a given cell is actually part of a merged set of cells, so it could possibly duplicate previous values in cases where the cell should really be empty. Still, it might be of some help.

Additional information:

I subsequently found a documentation page that talks about a merged_cells attribute that one can use to determine the cells that are included in various ranges of merged cells. The documentation says that it is "New in version 0.6.1", but when i tried to use it with xlrd-0.9.3 as installed by pip I got the error

NotImplementedError: formatting_info=True not yet implemented

I'm not particularly inclined to start chasing down different versions of xlrd to test the merged_cells feature, but perhaps you might be interested in doing so if the above code is insufficient for your needs and you encounter the same error that I did with formatting_info=True.



回答2:

You can also try using fillna method available in pandas https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.fillna.html

excel = pd.read_excel(dir+filename,header=1) excel[ColName]=excel[ColName].fillna(method='ffill') 

This should replace the cell's value with the previous value



回答3:

openpyxl.worksheet.merged_cell_ranges 

This function you can get a array like ['A1:M1', 'B22:B27'], which tell you the cells to be merged.

openpyxl.worksheet.merged_cells 

This function shows you whether a cell has been merged or not



标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!