Library to extract data from open Excel workbooks

拟墨画扇 提交于 2020-01-24 01:54:07

问题


I am trying to extract data from workbooks that are already open.

I have found the xlrd library, but it appears you can only use this with workbooks you open through Python. The workbooks I will use in my project have already been opened, so this method is unusable.

A second library I found, which is OpenPyxl, only returns errors for me, even though the workbook is open:

from openpyxl import load_workbook

wb = load_workbook(filename = 'Components V2.4.3.xlsm')

returns:

FileNotFoundError: [Errno 2] No such file or directory: 'Components V2.4.3.xlsm'

Lastly, I have used win32com.client's Dispatch which I could not get cell values from, hence why I am looking for an alternative.

Am I doing something wrong with openpyxl, or is there another method I can use?


回答1:


Open a workbook test.xlsx currently open in Excel, and read the value in cell A1 of the first worksheet:

from win32com.client import GetObject
xl = GetObject(None, "Excel.Application")
wb = xl.Workbooks("test.xlsx")
ws = wb.Sheets(1)
ws.Cells(1, 1).Value

Read a range as a tuple of tuples:

ws.Range("A1:D4").Value

Write back some values:

ws.Range("A1:D4").Value = [[16, 3, 2, 13], [5, 10, 11, 8], [9, 6, 7, 12], [4, 15, 14, 1]]

Answer to the comments: COM (Component Object Model), sometimes referred to as "Automation", allows a Windows application to provide a "COM server", which gives access to some of its APIs, to be accessed from a "COM client". Excel has such a server (and VBA has a client: you may use CreateObject and GetObject from VBA).

Other applications offer similar services through COM: for instance MATLAB, SAS, Stata, and all applications of Microsoft Office.

Python has a client with pywin32. You may also develop a server with Pywin32, see for instance this: Portable Python com server using pywin32

Note that in the case of Excel, as you noticed, you may access most of the object hierarchy, and control very precisely the behavior of Excel. Basically, if you can do it in VBA, you can do it from any COM client.


Regarding the last row of a range, I'm not sure I understand what you want. Is it this: Excel VBA Find last row in range ?


A few more points:

If Excel is not already open, you can still open a connection to Excel. In VBA the function to do this is CreateObject instead of GetObject, but in Python it's Dispatch:

from win32com.client import Dispatch
xl = Dispatch("Excel.Application")
xl.WorksheetFunction.Gamma(0.5)

In VBA you will often use Excel "constants", such as xlUp. They are available in Python too, with this (after starting the connection with Excel, with GetObject or Dispatch):

from win32com.client import constants as const
const.xlUp

To connect to a COM server installed on your computer, you need the name of the object to get. Here are a few cases:

For Microsoft Office:

  • Excel.Application
  • Word.Application
  • Outlook.Application
  • Powerpoint.Application
  • Access.Application
  • Publisher.Application
  • Visio.Application

Often used in VBScript:

  • Scripting.Dictionary
  • Scripting.FileSystemObject
  • WScript.Shell

Specialized software:

  • Matlab.Application
  • SAS.Application
  • stata.StataOLEApp

Last remark: as explained here, you can find the documentation of Pywin32 either in the directory where it's installed ([Pythonpath]\Lib\site-packages\PyWin32.chm), or on the web here: http://timgolden.me.uk/pywin32-docs/contents.html



来源:https://stackoverflow.com/questions/57526586/library-to-extract-data-from-open-excel-workbooks

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!