tidyr use separate_rows over multiple columns

隐瞒了意图╮ 2021-01-04 10:44

I have a data.frame where some cells contain strings of comma-separated values:

d <- data.frame(a=c(1:3), 
       b=c("name1, name2, name3", "name4",


        
2 Answers
  •  予麋鹿 (original poster)
    2021-01-04 11:29

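    You can chain the calls with a pipe. Note that sep = ", " does not need to be specified: the default separator is a regular expression that matches any run of non-alphanumeric characters (other than periods), which covers the comma and the space.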

    d %>% separate_rows(b) %>% separate_rows(c)
    #   a     b      c
    # 1 1 name1  name7
    # 2 1 name2  name7
    # 3 1 name3  name7
    # 4 2 name4  name8
    # 5 2 name4  name9
    # 6 3 name5 name10
    # 7 3 name6 name10
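
If relying on the default separator feels too implicit, the separator can also be passed explicitly. The following is a minimal sketch, not part of the original answer. It assumes the input data frame shown below, which is reconstructed from the printed output because the definition in the question is cut off, and it assumes a sep pattern of a comma followed by optional whitespace.

```r
library(tidyr)

# Assumed input: reconstructed from the printed output, since the
# original definition of d is truncated in the question.
d <- data.frame(
  a = 1:3,
  b = c("name1, name2, name3", "name4", "name5, name6"),
  c = c("name7", "name8, name9", "name10"),
  stringsAsFactors = FALSE
)

# Spell out the separator instead of relying on the default regex.
# ",\\s*" means: a comma followed by any amount of whitespace.
res <- d %>%
  separate_rows(b, sep = ",\\s*") %>%
  separate_rows(c, sep = ",\\s*")

print(res)
```

An explicit pattern matters when the values themselves contain spaces or other punctuation, because the default pattern would split on those as well.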
    

    Note: This uses tidyr version 0.6.0, which re-exports the %>% operator, so loading tidyr alone is enough.


    Update: Following @doscendodiscimus's comment, we can use a for() loop and reassign d on each iteration, which scales to any number of columns. Because the column names are held in a character vector, we need the standard-evaluation version, separate_rows_ (deprecated in later tidyr releases).

    cols <- c("b", "c")
    for(col in cols) {
        d <- separate_rows_(d, col)
    }
    

    which gives the updated d

      a     b      c
    1 1 name1  name7
    2 1 name2  name7
    3 1 name3  name7
    4 2 name4  name8
    5 2 name4  name9
    6 3 name5 name10
    7 3 name6 name10
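
The underscore-suffixed verbs are deprecated in current tidyr releases, so the same loop may need a different spelling there. The following is a hedged sketch of one alternative, not the original author's code. It assumes a tidyr version whose separate_rows() accepts tidyselect helpers, and it uses the same reconstructed input as an assumption, since the question's definition of d is truncated.

```r
library(tidyr)

# Assumed input: reconstructed from the printed output, since the
# original definition of d is truncated in the question.
d <- data.frame(
  a = 1:3,
  b = c("name1, name2, name3", "name4", "name5, name6"),
  c = c("name7", "name8, name9", "name10"),
  stringsAsFactors = FALSE
)

cols <- c("b", "c")
for (col in cols) {
  # all_of() selects the column whose name is stored in the string `col`,
  # so the non-underscore verb can be driven by a character vector.
  d <- separate_rows(d, tidyselect::all_of(col))
}

print(d)
```

Splitting one column per iteration keeps the behaviour of the chained pipe: each column is expanded independently, so rows with different numbers of values in b and c are combined rather than matched position by position.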
    
