I have an xlsx file whose columns have various fill colors.
I want to read only the white columns of this Excel file in Python using pandas, but I have no clue how to do this.
(Disclosure: I'm one of the authors of the library I'm going to suggest)
With StyleFrame (which wraps pandas) you can read an Excel file into a dataframe without losing the style data.
Consider the following sheet:
And the following code:
from styleframe import StyleFrame, utils
# from StyleFrame import StyleFrame, utils (if using version < 3.X)
sf = StyleFrame.read_excel('test.xlsx', read_style=True)
print(sf)
#      b  p                  y
# 0  nan  3             1000.0
# 1  3.0  4                2.0
# 2  4.0  5  42902.72396767148
sf = sf[[col for col in sf.columns
         if col.style.fill.fgColor.rgb in ('FFFFFFFF', utils.colors.white)]]
# "white" can be represented as 'FFFFFFFF' or
# '00FFFFFF' (which is what utils.colors.white is set to)
print(sf)
#      b
# 0  nan
# 1  3.0
# 2  4.0
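If you would rather not add StyleFrame as a dependency, roughly the same idea can be sketched with openpyxl and plain pandas by inspecting the fill of each header cell yourself. This is only a sketch and not how StyleFrame does it internally. The helper names `is_white`, `white_column_names` and `read_white_columns` are made up for illustration. It assumes the header sits in row 1, that a column's color is determined by its header cell, and that a cell with no fill at all counts as white.

```python
def is_white(fill_type, rgb):
    """Return True for an unfilled cell or a solid white fill.

    Assumption: "no fill" counts as white. The alpha byte is ignored,
    so both 'FFFFFFFF' and '00FFFFFF' match.
    """
    if fill_type is None:
        return True
    # Theme or indexed colors do not expose a plain ARGB string here,
    # so they simply will not match.
    return (fill_type == "solid"
            and isinstance(rgb, str)
            and rgb[-6:].upper() == "FFFFFF")


def white_column_names(path, sheet_name=None):
    """Collect the header values (row 1) whose cells look white."""
    # Imported lazily so is_white() stays usable without openpyxl installed.
    from openpyxl import load_workbook

    wb = load_workbook(path)
    # Fall back to the first sheet, matching pandas' default of sheet 0.
    ws = wb[sheet_name] if sheet_name else wb.worksheets[0]
    header = next(ws.iter_rows(min_row=1, max_row=1))
    return [cell.value for cell in header
            if cell.value is not None
            and is_white(cell.fill.fill_type, cell.fill.fgColor.rgb)]


def read_white_columns(path, sheet_name=None):
    """Read only the columns whose header cell is white."""
    import pandas as pd

    names = white_column_names(path, sheet_name)
    return pd.read_excel(path, sheet_name=sheet_name or 0, usecols=names)
```

Usage would then be something like `df = read_white_columns('test.xlsx')`. If your sheet colors the data cells rather than the header, you would need to check a data row instead of row 1.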