Question
I need to find the cells that have "strikethrough" formatting when importing an Excel file in R.
Is there a way to detect them? Both R and Python approaches are welcome.
Answer 1:
R solution
The tidyxl package can help here.
Example: test.xlsx, with data in A1:A4 of the first sheet. [Excel screenshot of the sheet]
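library(tidyxl)
# Read the formatting definitions and the cell contents separately
formats <- xlsx_formats( "test.xlsx" )
cells <- xlsx_cells( "test.xlsx" )
# Indices of the local formats whose font has strikethrough
strike <- which( formats$local$font$strike )
# Addresses (column 2) of the cells that use one of those formats
cells[ cells$local_format_id %in% strike, 2 ]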
# A tibble: 2 x 1
# address
# <chr>
# 1 A2
# 2 A4
Answer 2:
I present below a small sample program that filters out text with strikethrough applied, using the openpyxl package (I tested it on version 2.5.6 with Python 3.7.0). Sorry it took so long to get back to you.
import openpyxl as opx

def ignore_strikethrough(cell):
    # Keep only cells whose font is not struck through
    return not cell.font.strike

wb = opx.load_workbook('test.xlsx')
ws = wb.active
colA = ws['A']
fColA = filter(ignore_strikethrough, colA)
for i in fColA:
    # cell.coordinate (e.g. "A1") is used because cell.column is a letter
    # in openpyxl 2.5 but a number in later releases
    print("Cell {0} has value {1}".format(i.coordinate, i.value))
    print(i.col_idx)
I tested it on a new workbook with the default worksheets, with the letters a, b, c, d, e in the first five rows of column A, where I had applied strikethrough formatting to b and d. The program filters out the cells in column A whose font has strikethrough applied, then prints the location and value of each remaining cell. The col_idx property returns the 1-based numeric column index.
Answer 3:
I found a method below:
from openpyxl import load_workbook

# Assuming rows 1-10 of column A all hold the value "A", and the 5th one has strikethrough
TEST_wb = load_workbook(filename='TEST.xlsx')
TEST_wb_s = TEST_wb.active
for i in range(1, TEST_wb_s.max_row + 1):
    ck_range_A = TEST_wb_s['A' + str(i)]
    # font.strikethrough is an alias of font.strike
    if ck_range_A.font.strikethrough:
        print('YES')
    else:
        print('NO')
But it doesn't report the location (in this case, the row number), so when there are many results it is hard to tell which cells contain strikethrough. How can I collect the results of the check into a vector?
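One possible way to get the locations, sketched under a few assumptions: the list comprehension below collects the coordinate of every struck-through cell instead of printing YES/NO. The helper name struck_cells and the in-memory sample workbook are illustrative only and are not part of the original answer; the sample mirrors the layout described above (ten rows of "A", with the fifth struck through).

```python
from openpyxl import Workbook
from openpyxl.styles import Font


def struck_cells(ws, column="A"):
    """Return the coordinates of cells in `column` whose font is struck through.

    Hypothetical helper, named here for illustration only.
    """
    return [cell.coordinate for cell in ws[column] if cell.font.strike]


# Build a small in-memory workbook so the sketch runs without TEST.xlsx.
wb = Workbook()
ws = wb.active
for row in range(1, 11):
    ws.cell(row=row, column=1, value="A")
ws["A5"].font = Font(strike=True)

print(struck_cells(ws))  # ['A5']
```

To run it against a real file, the worksheet could instead come from load_workbook(filename='TEST.xlsx').active. Replacing cell.coordinate with cell.row in the comprehension would give the bare row numbers.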
Source: https://stackoverflow.com/questions/52107584/how-to-detect-strikethrough-style-from-xlsx-file-in-r