Context
I am trying to read in and tidy an Excel file with multiple headers/sections placed at variable positions. The content of these headers need to
A data.table option.
Similar to @camille's answer, I assume you can make some vector of measures and if the `col1` value isn't in that list it's a city. This groups by the `cumsum` of not (`!`) `col1 %in% meas`, i.e. a group number which increments by 1 each time `col1` is not found in `meas`. Within each group, `city` is set as the first value of `col1`, and `col1`/`col2` are renamed appropriately. Then I filter to only rows where `city` doesn't equal `col1` (now renamed `type`) and remove the grouping variable `g`.
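library(data.table)
setDT(df)  # convert df to a data.table by reference

# values of col1 that are measures rather than city headers
meas <- c("Diesel", "Gasoline", "LPG", "Electric")

df[, .(city = first(col1), type = col1, value = col2),
   by = .(g = cumsum(!col1 %in% meas))   # group number increments at each city row
   ][city != type, -'g']                 # drop the city header rows and the helper column g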
#       city     type value
# 1: Seattle   Diesel    80
# 2: Seattle Gasoline    NA
# 3: Seattle      LPG    10
# 4: Seattle Electric    10
# 5:  Boston   Diesel    65
# 6:  Boston Gasoline    25
# 7:  Boston Electric    10
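
To see what the grouping step is doing on its own, here is a small base R sketch of the intermediate values. The `col1` vector below is an assumption: it is a hypothetical input reconstructed from the output above, since the original `df` isn't shown here, and `is_header` is just an illustrative name.

```r
# Hypothetical col1, reconstructed from the output above (not the asker's actual data)
col1 <- c("Seattle", "Diesel", "Gasoline", "LPG", "Electric",
          "Boston", "Diesel", "Gasoline", "Electric")
meas <- c("Diesel", "Gasoline", "LPG", "Electric")

# %in% binds tighter than !, so this is !(col1 %in% meas): TRUE on the city rows
is_header <- !col1 %in% meas

# running count of city rows seen so far, i.e. the group number g
g <- cumsum(is_header)

# the city for each row is the header value of that row's group
city <- col1[is_header][g]

data.frame(col1, is_header, g, city)
```

Each city row bumps `g` by one, so every measure row inherits the group number (and therefore the city) of the most recent header above it. This relies on every value outside `meas` being a city name, so a misspelled measure would start a new group.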