Aggregating all unique values of each column of data frame

夕颜 2020-12-10 16:44

I have a large data frame (1,616,610 rows, 255 columns), and I need to paste together the unique values of each column, grouped by a key column.

For example:

2 Answers
  • 2020-12-10 16:58

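    You could do the following with dplyr:

    library(dplyr)

    # Collapse the unique values of a vector into one comma-separated string
    func_paste <- function(x) paste(unique(x), collapse = ', ')
    # Note: summarise_each() and funs() are deprecated in current dplyr releases
    data %>%
        group_by(a) %>%
        summarise_each(funs(func_paste))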
    
    ##      a               b      c                  d
    ##  (dbl)           (chr)  (chr)              (chr)
    ##1     1 apples, oranges 12, 22             Monday
    ##2     2          apples 45, 67 Tuesday, Wednesday
    ##3     3      grapefruit     28            Tuesday
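
    Since summarise_each() and funs() are deprecated in newer dplyr releases, the same idea can probably be written with across(). The following is a minimal sketch, assuming dplyr 1.0.0 or later. The `data` frame is hypothetical: it was made up only to be consistent with the output shown above, because the question's own example is not available.

    ```r
    library(dplyr)

    # Hypothetical data, chosen only to match the output shown in this answer
    data <- data.frame(
      a = c(1, 1, 2, 2, 3),
      b = c("apples", "oranges", "apples", "apples", "grapefruit"),
      c = c(12, 22, 45, 67, 28),
      d = c("Monday", "Monday", "Tuesday", "Wednesday", "Tuesday"),
      stringsAsFactors = FALSE
    )

    # Collapse the unique values of a vector into one comma-separated string
    func_paste <- function(x) paste(unique(x), collapse = ", ")

    # across(everything(), ...) applies func_paste to every non-grouping column
    result <- data %>%
      group_by(a) %>%
      summarise(across(everything(), func_paste))

    print(result)
    ```

    Note that numeric columns such as `c` come back as character, since each cell now holds a pasted string.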
    
  • 2020-12-10 17:17

    Moved from comments:

    library(data.table)
    
    dt <- as.data.table(data)
    dt[, lapply(.SD, function(x) toString(unique(x))), by = a]
    

    giving:

       a               b      c                  d
    1: 1 apples, oranges 12, 22             Monday
    2: 2          apples 45, 67 Tuesday, Wednesday
    3: 3      grapefruit     28            Tuesday
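
    For a frame as large as the one in the question, it may be worth avoiding the copy that as.data.table() makes. A minimal sketch using setDT(), which converts the data frame to a data.table by reference, is given here. The `data` frame is hypothetical and was made up only to be consistent with the output above.

    ```r
    library(data.table)

    # Hypothetical data, chosen only to match the output shown in this answer
    data <- data.frame(
      a = c(1, 1, 2, 2, 3),
      b = c("apples", "oranges", "apples", "apples", "grapefruit"),
      c = c(12, 22, 45, 67, 28),
      d = c("Monday", "Monday", "Tuesday", "Wednesday", "Tuesday"),
      stringsAsFactors = FALSE
    )

    # setDT() converts in place without copying; `data` itself becomes a data.table
    setDT(data)

    # For each value of a, collapse the unique values of every other column
    result <- data[, lapply(.SD, function(x) toString(unique(x))), by = a]

    print(result)
    ```

    Because setDT() modifies `data` itself, as.data.table() remains the safer choice when the original data frame must stay untouched.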
    