Gathering specific pairs of columns into rows by dplyr in R [duplicate]

Submitted by 孤人 on 2019-12-05 10:46:25

This isn't very scalable, so if you end up needing more than these 3 pairs of columns, go with @akrun's answer. I just wanted to point out that the bind_rows snippet you included could, in fact, be done in one pipe:

library(tidyverse)


bind_rows(
        df %>% select(id, var, a = a1, b = b1),
        df %>% select(id, var, a = a2, b = b2),
        df %>% select(id, var, a = a3, b = b3)
    ) %>%
    arrange(id, var)
#>    id var a b
#> 1   1   a 3 2
#> 2   1   a 8 1
#> 3   1   a 7 1
#> 4   2   d 5 4
#> 5   2   d 1 6
#> 6   2   d 7 1
#> 7   3   g 1 1
#> 8   3   g 2 4
#> 9   3   g 2 4
#> 10  4   f 2 2
#> 11  4   f 5 7
#> 12  4   f 3 9
#> 13  5   i 2 3
#> 14  5   i 1 2
#> 15  5   i 1 6

Created on 2018-05-07 by the reprex package (v0.2.0).
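None of the snippets in this thread include the input data. To run them, a df along the following lines should work. It is reconstructed from the printed results, so treat it as an assumption rather than the asker's original data frame; the column order (id, var, a1, b1, a2, b2, a3, b3) is inferred from the positional indexing used in the base R answer further down.

```r
# Reconstructed sample input (an assumption, inferred from the printed output).
df <- data.frame(
  id  = 1:5,
  var = c("a", "d", "g", "f", "i"),
  a1  = c(3, 5, 1, 2, 2),
  b1  = c(2, 4, 1, 2, 3),
  a2  = c(8, 1, 2, 5, 1),
  b2  = c(1, 6, 4, 7, 2),
  a3  = c(7, 7, 2, 3, 1),
  b3  = c(1, 1, 4, 9, 6),
  stringsAsFactors = FALSE  # keep var as character on older R versions
)
```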

If you want something that scales and you like map_* functions (from purrr in the tidyverse), you can abstract the above pipeline:

1:3 %>%
    map_df(~select(df, id, var, ends_with(as.character(.))) %>% 
                    setNames(c("id", "var", "a", "b"))) %>%
    arrange(id, var)

where 1:3 is simply the set of numeric suffixes of the column pairs you have.
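If you would rather not hard-code the suffixes, they can be read off the column names. The following is only a sketch under a few assumptions: the sample df is reconstructed from the printed output above, the helper names pair_cols, suffixes and long are made up for illustration, and the pairs are assumed to be named exactly "a<digits>" and "b<digits>". It selects each pair by its full name with all_of() instead of ends_with(), because ends_with("1") would also match a column such as a11 once there are ten or more pairs.

```r
library(dplyr)
library(purrr)

# Sample input reconstructed from the printed output (an assumption).
df <- data.frame(
  id  = 1:5,
  var = c("a", "d", "g", "f", "i"),
  a1 = c(3, 5, 1, 2, 2), b1 = c(2, 4, 1, 2, 3),
  a2 = c(8, 1, 2, 5, 1), b2 = c(1, 6, 4, 7, 2),
  a3 = c(7, 7, 2, 3, 1), b3 = c(1, 1, 4, 9, 6),
  stringsAsFactors = FALSE
)

# Find the paired columns and strip the leading letter to get the suffixes.
pair_cols <- grep("^[ab][0-9]+$", names(df), value = TRUE)
suffixes  <- unique(sub("^[ab]", "", pair_cols))

# Build one id/var/a/b slice per suffix, then stack the slices.
long <- suffixes %>%
  map(function(s) {
    select(df, id, var,
           a = all_of(paste0("a", s)),
           b = all_of(paste0("b", s)))
  }) %>%
  bind_rows() %>%
  arrange(id, var)

long
```

This uses map() followed by bind_rows() rather than map_df(); either should give the same stacked result here.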

We could do this with melt from data.table, which can take multiple patterns in the measure argument to reshape into 'long' format. In this case, we use column names that start (^) with "a" followed by digits as one pattern, and those that start with "b" followed by digits as the other:

library(data.table)  
melt(setDT(df), measure = patterns("^a\\d+", "^b\\d+"), 
       value.name = c("a", "b"))[order(id)][, variable := NULL][]
#    id var a b
# 1:  1   a 3 2
# 2:  1   a 8 1
# 3:  1   a 7 1
# 4:  2   d 5 4
# 5:  2   d 1 6
# 6:  2   d 7 1
# 7:  3   g 1 1
# 8:  3   g 2 4
# 9:  3   g 2 4
#10:  4   f 2 2
#11:  4   f 5 7
#12:  4   f 3 9
#13:  5   i 2 3
#14:  5   i 1 2
#15:  5   i 1 6

Or, using tidyverse, we gather the columns of interest into 'long' format (be cautious when the groups of columns have different classes, because gather puts all values into a single column of one type; melt is more useful there), then separate the 'key' column into two, and spread back to 'wide' format:

library(tidyverse)
df %>% 
  gather(key, val, a1:b3) %>%
  separate(key, into = c("key1", "key2"), sep=1) %>%
  spread(key1, val) %>%
  select(-key2)
#   id var a b
#1   1   a 3 2
#2   1   a 8 1
#3   1   a 7 1
#4   2   d 5 4
#5   2   d 1 6
#6   2   d 7 1
#7   3   g 1 1
#8   3   g 2 4
#9   3   g 2 4
#10  4   f 2 2
#11  4   f 5 7
#12  4   f 3 9
#13  5   i 2 3
#14  5   i 1 2
#15  5   i 1 6
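In tidyr 1.0.0 and later, gather and spread are superseded by pivot_longer and pivot_wider. As a sketch (this is not part of the original answer, and the sample df is reconstructed from the printed output, so it is an assumption), the special ".value" entry in names_to tells pivot_longer to use the first captured part of each column name ("a" or "b") as an output column, which does the gather/separate/spread sequence in one step. The name "pair" for the second captured part is arbitrary.

```r
library(dplyr)
library(tidyr)

# Sample input reconstructed from the printed output (an assumption).
df <- data.frame(
  id  = 1:5,
  var = c("a", "d", "g", "f", "i"),
  a1 = c(3, 5, 1, 2, 2), b1 = c(2, 4, 1, 2, 3),
  a2 = c(8, 1, 2, 5, 1), b2 = c(1, 6, 4, 7, 2),
  a3 = c(7, 7, 2, 3, 1), b3 = c(1, 1, 4, 9, 6),
  stringsAsFactors = FALSE
)

long <- df %>%
  pivot_longer(
    cols          = a1:b3,
    names_to      = c(".value", "pair"),  # ".value" -> columns a and b
    names_pattern = "([ab])(\\d+)"        # letter, then the pair number
  ) %>%
  select(-pair)                           # drop the pair index, as above

long
```

Because the a and b columns are never stacked into one shared value column, this approach should also cope with column groups of different classes.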

A base R solution:

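# For pair x, take id/var plus columns 2*x+1 and 2*x+2, reuse the first four names (hence a1, b1), and stack
res <- do.call(rbind, lapply(1:3, function(x) setNames(df[c(1:2, 2 * x + (1:2))], names(df)[1:4])))
res[order(res$id), ]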
#    id var a1 b1
# 1   1   a  3  2
# 6   1   a  8  1
# 11  1   a  7  1
# 2   2   d  5  4
# 7   2   d  1  6
# 12  2   d  7  1
# 3   3   g  1  1
# 8   3   g  2  4
# 13  3   g  2  4
# 4   4   f  2  2
# 9   4   f  5  7
# 14  4   f  3  9
# 5   5   i  2  3
# 10  5   i  1  2
# 15  5   i  1  6