openpyxl: need the max number of rows in a column that has data in Excel

Question

I need the last row that contains data in a particular column of an Excel sheet. In openpyxl, sheet.max_row and sheet.max_column give the maximum row and column of the whole sheet, but what I want is the last row for one particular column.

My scenario: I have to get some values from a database and append them to the end of a particular column in an Excel sheet.

In the screenshot, if I want the last cell containing data in column 'C', it should return 10.

------------- Solution 1 --------------------

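import pandas as pd

# lt holds the data to be loaded into the Excel file; sheet is the openpyxl
# worksheet and col is the 1-based column number (both defined elsewhere).

# Read the sheet once; nothing is saved inside the loop, so the count is constant.
panda_xl_rd = pd.read_excel('file.xlsx', sheet_name="sheet_Name")  # pandas DataFrame

# Row number of the first free row in the column. dropna() removes the NaN
# values, otherwise we would get the length of the entire sheet. +2 skips the
# header row and moves to the row right after the last record.
# This counts non-empty cells, so it assumes the column has no gaps.
next_row = len(panda_xl_rd.iloc[:, col - 1].dropna()) + 2

for index, i in enumerate(lt):
    cellref = sheet.cell(row=next_row + index, column=col)
    cellref.value = i
del panda_xl_rd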

------------------------Solution 2 ----------------------

https://stackoverflow.com/a/52816289/10003981

------------------------Solution 3 ----------------------

https://stackoverflow.com/a/52817637/10003981

Solution 3 is probably the most concise one.

Answer 1:

Question: I want the last row containing data in column 'C'; it should return 10.

Simply count the cells whose cell.value is not empty.
Documentation: Accessing many cells

Code (ws is an openpyxl worksheet):

count = 0
for cell in ws['C']:
    if cell.value is not None:
        count += 1

Comment: What if we have an empty cell in between?

Enumerate the rows in step with the column's cells and keep the latest non-empty row in a maxRowWithData variable. This also works when there are no empty cells in between.

Code (ws is an openpyxl worksheet):

maxRowWithData = 0
for row_index, cell in enumerate(ws['C'], start=1):
    if cell.value is not None:
        maxRowWithData = row_index

Note: The cell index of openpyxl is 1-based, hence start=1.

Documentation: enumerate(iterable, start=0)
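
As a rough sketch of how the scan above could look in runnable form, the following builds a small in-memory workbook, walks column C from the bottom up so it can stop at the first non-empty cell, and then writes new values below it. The sample cell contents, the new_values list, and the helper name last_row_with_data are made up for illustration, and "has data" is assumed to mean cell.value is not None.

```python
from openpyxl import Workbook
from openpyxl.utils import column_index_from_string


def last_row_with_data(ws, column_letter):
    """Return the 1-based row of the last non-empty cell in the column, or 0."""
    # ws["C"] is a tuple of the column's cells from row 1 to ws.max_row,
    # so walking it in reverse reaches the last filled cell first.
    for cell in reversed(ws[column_letter]):
        if cell.value is not None:
            return cell.row
    return 0


# Illustrative in-memory sheet: column A is longer than column C,
# and column C has gaps.
wb = Workbook()
ws = wb.active
ws["A1"] = "id"
ws["A12"] = "bottom of A"
ws["C1"] = "header"
ws["C4"] = "value"
ws["C10"] = "last"

last = last_row_with_data(ws, "C")
print(last, ws.max_row)

# Append made-up values (stand-ins for the database results) below column C.
new_values = ["x", "y"]
col = column_index_from_string("C")
for offset, value in enumerate(new_values, start=1):
    ws.cell(row=last + offset, column=col, value=value)
```

With this sample data the helper reports row 10 for column C even though ws.max_row is 12, and the new values land in C11 and C12. Returning 0 for a column without data keeps the "last + offset" arithmetic valid, since the first write then goes to row 1.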

Answer 2:

"Empty" is a relative concept so your code should be clear about this. The methods in openpyxl are guaranteed to return orthogonal result sets: the length of rows and columns will always be the same.

Using this, we can deduce the highest row in a column that holds a cell whose value is not None.

max_row_for_c = max((c.row for c in ws['C'] if c.value is not None))
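
One caveat with the one-liner: max() raises ValueError when the generator yields nothing, which happens if the column holds no values at all. A minimal sketch of a guarded version follows. The helper name max_row_with_value is hypothetical, and the SimpleNamespace objects are stand-ins for openpyxl cells (only .row and .value are used), so the example runs without a workbook.

```python
from types import SimpleNamespace


def max_row_with_value(cells):
    """Highest .row among cells whose .value is not None, or 0 if there is none."""
    # default=0 avoids the ValueError that max() raises on an empty sequence.
    return max((c.row for c in cells if c.value is not None), default=0)


# Hypothetical stand-ins for openpyxl cells in rows 1 to 4.
column_c = [SimpleNamespace(row=r, value=v)
            for r, v in enumerate(["header", None, 42, None], start=1)]
empty_column = [SimpleNamespace(row=r, value=None) for r in range(1, 4)]

print(max_row_with_value(column_c))      # 3
print(max_row_with_value(empty_column))  # 0
```

With a real worksheet the call would be max_row_with_value(ws['C']).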


Answer 3:

I think I just found a way using pandas:

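import pandas as pd

# lt holds the data to be loaded into the Excel file; sheet is the openpyxl
# worksheet and col is the 1-based column number (both defined elsewhere).

# Read the sheet once; nothing is saved inside the loop, so the count is constant.
panda_xl_rd = pd.read_excel('file.xlsx', sheet_name="sheet_Name")  # pandas DataFrame

# Row number of the first free row in the column. dropna() removes the NaN
# values, otherwise we would get the length of the entire sheet. +2 skips the
# header row and moves to the row right after the last record.
# This counts non-empty cells, so it assumes the column has no gaps.
next_row = len(panda_xl_rd.iloc[:, col - 1].dropna()) + 2

for index, i in enumerate(lt):
    cellref = sheet.cell(row=next_row + index, column=col)
    cellref.value = i
del panda_xl_rd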
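
Because the formula above counts non-empty cells, it points at an occupied row when the column has gaps. A possible gap-tolerant variant, sketched here on an in-memory DataFrame standing in for the pd.read_excel result, uses Series.last_valid_index() instead. The sample values and the HEADER_ROWS constant are assumptions for illustration; the sketch also assumes the default integer index that read_excel produces.

```python
import pandas as pd

# Stand-in for the DataFrame returned by pd.read_excel: column "C" has a gap
# at index 1 and a trailing blank at index 3.
df = pd.DataFrame({"C": [10.0, None, 30.0, None]})

HEADER_ROWS = 1  # assumption: the sheet has exactly one header row

# The count-based formula from the answer: 2 non-empty cells + 2.
count_based_next = len(df["C"].dropna()) + 2

# Gap-tolerant version: take the index label of the last non-NaN entry.
last_label = df["C"].last_valid_index()  # None when the column is all NaN
if last_label is None:
    gap_safe_next = HEADER_ROWS + 1
else:
    # label 0 is Excel row HEADER_ROWS + 1, so the next free row is two further on
    gap_safe_next = int(last_label) + HEADER_ROWS + 2

print(count_based_next, gap_safe_next)
```

Here the value 30.0 sits in Excel row 4, so the count-based result of 4 would overwrite it, while the index-based result of 5 is the first free row.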

Answer 4:

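Simply do this:

columntuple = sheet['A']

without adding a row number inside, then:

print(len(columntuple))

This gives you the column length, that is, the number of cells from row 1 down to sheet.max_row, including empty ones.

For the row length:

rowtuple = sheet[1]

This gives you the first row as a tuple (A1, B1, C1). Rows are 1-based in openpyxl, so sheet[0] is not valid. Then:

len(rowtuple)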
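
To make the difference between the tuple length and the last filled row concrete, here is a small sketch on an in-memory workbook (the cell contents are made up for illustration). It suggests that len(sheet['A']) follows the sheet-wide max_row rather than the data in column A, so a separate count of non-empty cells is needed for a per-column answer.

```python
from openpyxl import Workbook

wb = Workbook()
ws = wb.active
ws["A1"] = "only value in A"
ws["C10"] = "last value in C"

column_a = ws["A"]  # cells A1..A10, because the sheet's max_row is 10
first_row = ws[1]   # cells A1..C1, because the sheet's max_column is 3

print(len(column_a), ws.max_row)
print(len(first_row), ws.max_column)

# Counting only the cells that hold a value gives a per-column figure.
filled_in_a = sum(1 for cell in column_a if cell.value is not None)
print(filled_in_a)
```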

Source: https://stackoverflow.com/questions/52813973/openpyxl-need-the-max-number-of-rows-in-a-column-that-has-data-in-excel

Tags

python

excel

openpyxl