Question
I am working on an Excel workbook and need to find the median of the values in one column for the rows where another column contains a specific string, using openpyxl. For this purpose I combined MEDIAN and IF, which requires pressing Ctrl+Shift+Enter so that Excel treats the result as an array formula.
However, openpyxl does not preserve the array nature of this formula when it saves the file.
Example Data:
Values | IDS
3.5 | 1234
2.5 | 1234
6.5 | 5687
7.5 | 1234
9.5 | 1234
1.0 | 7894
For the data above, I should be able to find the median of only those values whose adjacent ID is 1234.
I have to use openpyxl, so I need a non-array formula that lets me loop through a list of IDs and find the median of the values matching each ID.
Answer 1:
openpyxl supports limited parsing of formulas embedded in cells. The openpyxl.formula package contains a Tokenizer class that breaks a formula into its constituent tokens. Each token in an Excel formula is represented by a Token object.
Tokens have three attributes:
- value: The string value parsed that led to this token (the piece of the formula text, written as you would write it in Excel)
- type: A string identifying the type of token
- subtype: A string identifying the subtype of the token (optional, defaults to "")
Your MEDIAN-IF formula may differ; the code below is for reference only. It shows how to parse (or, say, validate) an array formula using openpyxl. Usage is as follows:
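>>> from openpyxl.formula import Tokenizer
>>> # Pass the formula without the surrounding braces: the tokenizer expects a leading "=".
>>> tok = Tokenizer("""=MEDIAN(IF($B$1:$B$6="1234",$A$1:$A$6,""))""")
>>> # Recent openpyxl releases parse inside the constructor; older releases required an explicit tok.parse() call here.
>>> # Each entry of tok.items is a Token with value, type and subtype.
>>> print("\n".join("%12s%11s%9s" % (t.value, t.type, t.subtype) for t in tok.items))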
Note that openpyxl supports either the formula or the value of the formula, which you select with the data_only flag when opening a workbook. However, openpyxl does not and will not calculate the result of a formula. For greater control over Excel from Python, use either of the following:
- pycel
- xlwings
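Because openpyxl will not evaluate the formula itself, another option is to skip the array formula and compute the per-ID medians in Python, then write the results as plain values. A minimal sketch using only the standard library, with the sample data from the question hard-coded (the variable names are illustrative):

```python
from collections import defaultdict
from statistics import median

# (value, id) pairs from the example data in the question.
rows = [(3.5, 1234), (2.5, 1234), (6.5, 5687), (7.5, 1234), (9.5, 1234), (1.0, 7894)]

# Group the values by their ID.
values_by_id = defaultdict(list)
for value, id_ in rows:
    values_by_id[id_].append(value)

# Compute one median per ID.
medians = {id_: median(vals) for id_, vals in values_by_id.items()}

print(medians)  # {1234: 5.5, 5687: 6.5, 7894: 1.0}
```

Each result can then be written into the sheet with an ordinary assignment such as ws["D1"] = medians[1234], which avoids the array-formula problem altogether.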
Attribution: openpyxl documentation (Tokenizer)
Source: https://stackoverflow.com/questions/52488989/keep-the-nature-of-array-formulas-when-using-openpyxl