How can I read an entire column from an Excel sheet, without iterating over its cells, using the win32com client for Python?
You can read an entire column from a sheet without iterating over its cells by using the Range collection. You should never use Cells if performance is any concern. Python uses the win32com module to interact with the Excel COM library. Whenever you use Python and COM (Excel, PowerPoint, Access, ADODB, etc.), one of your biggest performance constraints will be I/O between COM and Python. With Range you make only one COM method call, while with Cells you make one for each row. Range would also be faster if you were doing the same in VBA or .NET.
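To make the call-count argument concrete, here is a minimal sketch that needs neither Excel nor win32com. FakeSheet, cell_value, range_values and count_calls are hypothetical names invented for this illustration; they only model the idea that each access crosses the COM boundary once, which is a simplification of what really happens.

```python
class FakeSheet:
    """Hypothetical stand-in for a worksheet; counts simulated round trips."""

    def __init__(self, column_values):
        self._values = list(column_values)
        self.calls = 0

    def cell_value(self, row):
        # Models s.Cells(row, 1).Value: one round trip per cell (1-based row).
        self.calls += 1
        return self._values[row - 1]

    def range_values(self, first_row, last_row):
        # Models s.Range("A<first>:A<last>").Value: one round trip for the
        # whole block, returned as a tuple of one-element row tuples.
        self.calls += 1
        return tuple((v,) for v in self._values[first_row - 1:last_row])


def count_calls(n):
    """Return (round trips via cells, round trips via range) for n rows."""
    sheet = FakeSheet(range(n))

    by_cells = [sheet.cell_value(r) for r in range(1, n + 1)]
    cells_calls = sheet.calls

    sheet.calls = 0
    by_range = [row[0] for row in sheet.range_values(1, n)]
    range_calls = sheet.calls

    # Both approaches must yield the same data; only the call count differs.
    assert by_cells == by_range
    return cells_calls, range_calls
```

Calling count_calls(2000) reports 2000 simulated round trips for the cell-by-cell approach and 1 for the range approach. The real per-call cost depends on the machine and on Excel itself.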
In the following test I created a worksheet with 10 random characters in cells A1 through A2000. I then extracted these values into lists using both Range and Cells.
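import time
import win32com.client

# Attach to Excel and take the first sheet of the active workbook.
app = win32com.client.Dispatch("Excel.Application")
s = app.ActiveWorkbook.Sheets(1)

def GetValuesByCells():
    # One COM round trip per cell: 2000 in total.
    startTime = time.time()
    vals = [s.Cells(r, 1).Value for r in range(1, 2001)]
    return time.time() - startTime

def GetValuesByRange():
    # One COM round trip for the whole block; keep the first item of each row.
    startTime = time.time()
    vals = [v[0] for v in s.Range('A1:A2000').Value]
    return time.time() - startTime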
>>> GetValuesByRange()
0.03600001335144043
>>> GetValuesByCells()
5.27400016784668
In this case Range is two orders of magnitude (146x) faster than Cells. Note that the Value of a multi-cell Range comes back as a 2D tuple in which each inner tuple is a row. The list comprehension takes the first element of each row, so vals ends up as a flat list of the values in the column.
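The flattening step can be pulled out into a small helper, sketched below in pure Python so it can be tried without Excel. The names column_to_list and strip_trailing_empty are hypothetical and not part of win32com. The sketch assumes a single-column range, that a one-cell range yields a bare value rather than a tuple of rows, and that empty cells come back as None.

```python
def column_to_list(value):
    """Flatten the Value of a single-column range into a flat list."""
    if not isinstance(value, tuple):
        # A one-cell range yields a bare scalar instead of a tuple of rows.
        return [value]
    # Each inner tuple is one row; a single-column range has one item per row.
    return [row[0] for row in value]


def strip_trailing_empty(values):
    """Drop trailing None entries, keeping any gaps inside the data."""
    end = len(values)
    while end > 0 and values[end - 1] is None:
        end -= 1
    return values[:end]
```

With a live sheet this might be used as strip_trailing_empty(column_to_list(s.Range('A1:A2000').Value)), which is handy when the requested range is longer than the data actually present in the column.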