Question
I have a data.frame that looks like this:
dfTall <- frame_data(
~id, ~x, ~y, ~z,
1, "a", 4, 5,
1, "b", 6, 5,
2, "a", 5, 4,
2, "b", 1, 9)
I want to turn it into this:
dfWide <- frame_data(
~id, ~y_a, ~y_b, ~z_a, ~z_b,
1, 4, 6, 5, 5,
2, 5, 1, 4, 9)
Currently, I'm doing this:
dfTall %>%
  split(., .$x) %>%
  mapply(function(df, name) {
           df$x <- NULL
           names(df) <- paste(names(df), name, sep = '_')
           df
         },
         SIMPLIFY = FALSE, ., names(.)) %>%
  bind_cols() %>%
  select(-id_b) %>%
  rename(id = id_a)
In practice, I will have a larger number of numeric columns that need to be expanded (i.e., not just y and z). My current solution works, but it has issues, such as the fact that multiple copies of the id variable get added to the final data.frame and need to be removed.
Can this expansion be done using a function from tidyr, such as spread?
Answer 1:
It can be done with spread, but not in a single step, because multiple columns serve as values. You can first gather the value columns, unite the headers manually, and then spread:
library(dplyr)
library(tidyr)
dfTall %>%
  gather(col, val, -id, -x) %>%
  unite(key, col, x) %>%
  spread(key, val)
# A tibble: 2 x 5
# id y_a y_b z_a z_b
#* <dbl> <dbl> <dbl> <dbl> <dbl>
#1 1 4 6 5 5
#2 2 5 1 4 9
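In tidyr 1.0.0 and later, gather and spread are superseded by pivot_longer and pivot_wider, and pivot_wider accepts several value columns at once. The following is a minimal sketch, assuming a tidyr version that provides pivot_wider; it rebuilds the question's data with tribble (the current name for frame_data) and should yield the columns id, y_a, y_b, z_a, z_b without the intermediate gather/unite steps:

```r
library(tibble)
library(tidyr)

# Same data as in the question
dfTall <- tribble(
  ~id, ~x,  ~y, ~z,
  1,   "a", 4,  5,
  1,   "b", 6,  5,
  2,   "a", 5,  4,
  2,   "b", 1,  9
)

# names_from supplies the suffix (the levels of x);
# values_from lists every column to spread.
# New names are built as <value column>_<x level>.
dfWide <- pivot_wider(dfTall, names_from = x, values_from = c(y, z))
```

With more numeric columns, values_from can take a tidyselect expression such as -c(id, x) instead of listing each column by name.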
If you use data.table, dcast supports casting multiple value columns:
library(data.table)
dcast(setDT(dfTall), id ~ x, value.var = c('y', 'z'))
# id y_a y_b z_a z_b
#1: 1 4 6 5 5
#2: 2 5 1 4 9
Source: https://stackoverflow.com/questions/45872076/expanding-columns-associated-with-a-categorical-variable-into-multiple-columns-w