Reorganizing dataframe with multiple header types following “tidy” approach in R

Submitted by 心不动则不痛 on 2019-12-12 07:01:33

Question


I have a dataframe that looks somewhat like this:

Age  A1U_sweet  A2F_dip  A3U_bbq  C1U_sweet  C2F_dip  C3U_bbq  Comments
23   1          2        1        NA         NA       NA       Good
54   NA         NA       NA       4          1        2        ABCD
43   2          4        7        NA         NA       NA       HiHi
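
For anyone who wants to run the code in this thread, the sample above can be rebuilt as a data frame. This is a minimal sketch that assumes the table shown is the complete data; reading it with `read.table()` from a text string is simply one convenient way to do it, not necessarily how the original data was loaded.

```r
# Rebuild the sample data from the table above (assumed to be the full dataset)
df <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
Age  A1U_sweet  A2F_dip  A3U_bbq  C1U_sweet  C2F_dip  C3U_bbq  Comments
23   1          2        1        NA         NA       NA       Good
54   NA         NA       NA       4          1        2        ABCD
43   2          4        7        NA         NA       NA       HiHi
")

# Inspect column types: the numeric columns keep their NAs, Comments is character
str(df)
```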

I am trying to reorganize it in the way shown below to make it more "tidy". Is there a way to do this that also incorporates the Age and Comments columns in the same style as the other variables? How would you suggest incorporating them? One idea is shown below, but I am open to other suggestions. How would I modify the following code to account for multiple different styles of column name?

library(tidyr)

df <- data.frame(id = 1:nrow(df), df)
dfl <- gather(df, key = "key", value = "value", -id)
dfl <- separate(dfl, key, into = c("key", "kind", "type"), sep = c(1, 4))
df2 <- spread(dfl, key, value)
df2
##    id kind  type     A    C
## 1   1  Age   Age    23   23
## 2   1  1U_ sweet     1   NA
## 3   1  2F_   dip     2   NA
## 4   1  3U_   bbq     1   NA
## 5   1  Com   Com  Good Good
## 6   2  Age   Age    54   54
## 7   2  1U_ sweet    NA    4
## 8   2  2F_   dip    NA    1
## 9   2  3U_   bbq    NA    2
## 10  2  Com   Com  ABCD ABCD
## 11  3  Age   Age    43   43
## 12  3  1U_ sweet     2   NA
## 13  3  2F_   dip     4   NA
## 14  3  3U_   bbq     7   NA
## 15  3  Com   Com  HiHi HiHi

And how would I modify the following code to return the data back to how it originally was?

df <- gather(df2, key = "key", value = "value", A, B, C)
df <- unite(df, "key", key, kind, type, sep = "")
df <- spread(df, key, value)

For context, this question was prompted by Ista's comment under this question: Combining columns in R based on matching beginnings of column title names


Answer 1:


Since Age and Comments are presumably measured at the level of whatever a row in your original data is, just bring them along for the ride:

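library(tidyr)

# Add a row identifier so each original row can be tracked through the reshaping
df <- data.frame(id = 1:nrow(df), df)

# Stack only the coded columns; id, Age and Comments stay as ordinary columns
dfl <- gather(df, key = "key", value = "value", -id, -Age, -Comments)
# Split each name by position, e.g. "A1U_sweet" into "A", "1U_", "sweet"
dfl <- separate(dfl, key, into = c("key", "kind", "type"), sep = c(1, 4))
# One column per leading letter (A, C)
df2 <- spread(dfl, key, value)
df2

# Combine A and C into a single column B, taking C wherever A is missing
df2 <- transform(df2, B = ifelse(is.na(A), C, A))
df2

# Reverse the reshaping; because B was added above, B* columns also appear
df <- gather(df2, key = "key", value = "value", A, B, C)
df <- unite(df, "key", key, kind, type, sep = "")
df <- spread(df, key, value)
df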
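
The same round trip can also be sketched with `pivot_longer()` and `pivot_wider()`, which tidyr (version 1.0.0 and later) offers as successors to `gather()` and `spread()`. This is not the code from the answer above but an alternative under stated assumptions: the sample data is rebuilt inline, the names `long`, `tidy` and `back` are made up for illustration, and the regular expression assumes every coded column name is one letter, then three characters, then the type.

```r
library(tidyr)

# Sample data rebuilt from the question (assumed to be the full dataset)
df <- data.frame(
  Age = c(23, 54, 43),
  A1U_sweet = c(1, NA, 2), A2F_dip = c(2, NA, 4), A3U_bbq = c(1, NA, 7),
  C1U_sweet = c(NA, 4, NA), C2F_dip = c(NA, 1, NA), C3U_bbq = c(NA, 2, NA),
  Comments = c("Good", "ABCD", "HiHi"),
  stringsAsFactors = FALSE
)

# Stack the coded columns and split each name into three parts in one step;
# id, Age and Comments are excluded, so they are carried along unchanged
long <- pivot_longer(
  cbind(id = seq_len(nrow(df)), df),
  cols = -c(id, Age, Comments),
  names_to = c("key", "kind", "type"),
  names_pattern = "^(.)(.{3})(.*)$",
  values_to = "value"
)

# One column per leading letter (A, C)
tidy <- pivot_wider(long, names_from = key, values_from = value)

# Reverse: stack A and C, glue the name parts back together, widen again
back <- pivot_longer(tidy, cols = c(A, C), names_to = "key", values_to = "value")
back <- unite(back, "key", key, kind, type, sep = "")
back <- pivot_wider(back, names_from = key, values_from = value)

# Restore the original column order (this also drops the helper id column)
back <- back[names(df)]
```

Excluding `Age` and `Comments` from `cols` plays the same role as `-Age, -Comments` in the `gather()` call: they are treated as row-level identifiers instead of being split by position.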


Source: https://stackoverflow.com/questions/48717310/reorganizing-dataframe-with-multiple-header-types-following-tidy-approach-in-r
