Extract elements common in all column groups

悲哀的现实 2020-11-28 16:21

I have an R dataset x as below:

  ID Month
1   1   Jan
2   3   Jan
3   4   Jan
4   6   Jan
5   6   Jan
6   9   Jan
7   2   Feb
8   4   Feb
9   6   Feb
10  8           


        
2 answers
  •  野趣味 (original poster)
    2020-11-28 16:57

    We can use data.table. Convert the 'data.frame' to a 'data.table' with setDT(df1). Then, grouped by 'ID', get the row index (.I) of every group whose number of unique 'Month' values equals the number of unique 'Month' values in the whole dataset, and subset the data with those indices:

    library(data.table)
    setDT(df1)[df1[, .I[uniqueN(Month) == uniqueN(df1$Month)], ID]$V1]
    #    ID Month
    # 1:  4   Jan
    # 2:  4   Feb
    # 3:  4   Mar
    # 4:  4   Apr
    # 5:  4   May
    # 6:  4   Jun
    # 7:  6   Jan
    # 8:  6   Jan
    # 9:  6   Feb
    #10:  6   Mar
    #11:  6   Apr
    #12:  6   May
    #13:  6   Jun
    

    To extract only the 'ID's:

    setDT(df1)[, ID[uniqueN(Month) == uniqueN(df1$Month)], ID]$V1
    #[1] 4 6
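
    The grouping step above can be broken into smaller pieces. The following is only a sketch on hypothetical toy data (the data frame toy and the names n_months, per_id and keep are illustrative assumptions, not part of the question's dataset):

    ```r
    library(data.table)

    # Hypothetical toy data: IDs 4 and 6 occur in all three months,
    # the other IDs occur in only one month each.
    toy <- data.table(
      ID    = c(1, 4, 6, 6, 2, 4, 6, 4, 6, 9),
      Month = c("Jan", "Jan", "Jan", "Jan", "Feb", "Feb", "Feb",
                "Mar", "Mar", "Mar")
    )

    n_months <- uniqueN(toy$Month)                     # distinct months overall
    per_id   <- toy[, .(n = uniqueN(Month)), by = ID]  # distinct months per ID
    keep     <- per_id[n == n_months, ID]              # IDs seen in every month

    toy[ID %in% keep]                                  # rows for those IDs
    ```

    Here keep holds the IDs 4 and 6, and the final subset keeps duplicated rows (ID 6 appears twice in Jan), just like the one-liner above.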
    

    Or with base R

    1) Using table with rowSums

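    v1 <- rowSums(table(df1) > 0)  # number of distinct months per ID
    # compare with the total month count; max(v1) would also match when no ID covers every month
    names(v1)[v1 == length(unique(df1$Month))]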
    #[1] "4" "6"
    

    This result can be used to subset the data:

    subset(df1, ID %in% names(v1)[v1 == length(unique(df1$Month))])
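
    To see why the table approach works, it can help to look at the intermediate presence matrix. A minimal sketch, assuming hypothetical toy data (toy, present and ids_all are illustrative names, not from the question):

    ```r
    # Hypothetical toy data: IDs 4 and 6 occur in all three months.
    toy <- data.frame(
      ID    = c(1, 4, 6, 6, 2, 4, 6, 4, 6, 9),
      Month = c("Jan", "Jan", "Jan", "Jan", "Feb", "Feb", "Feb",
                "Mar", "Mar", "Mar")
    )

    present <- table(toy) > 0      # logical ID x Month matrix: TRUE if the pair occurs
    v1      <- rowSums(present)    # distinct months per ID, named by ID
    ids_all <- names(v1)[v1 == ncol(present)]  # IDs whose row is TRUE in every column

    subset(toy, ID %in% ids_all)   # rows belonging to those IDs
    ```

    Note that names(v1) is a character vector, so ids_all is c("4", "6"); the %in% comparison still works because R coerces the numeric ID column for matching.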
    

    2) Using tapply

    lst <- with(df1, tapply(Month, ID, FUN = unique))
    names(which(lengths(lst) == length(unique(df1$Month))))
    #[1] "4" "6"
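
    A different base R formulation, not used in the answer above, is to split the IDs by month and intersect the resulting sets. This is only a sketch on hypothetical toy data (toy, ids_by_month and common are illustrative names):

    ```r
    # Hypothetical toy data: IDs 4 and 6 occur in all three months.
    toy <- data.frame(
      ID    = c(1, 4, 6, 6, 2, 4, 6, 4, 6, 9),
      Month = c("Jan", "Jan", "Jan", "Jan", "Feb", "Feb", "Feb",
                "Mar", "Mar", "Mar")
    )

    ids_by_month <- split(toy$ID, toy$Month)      # one vector of IDs per month
    common       <- Reduce(intersect, ids_by_month)  # IDs present in every vector

    toy[toy$ID %in% common, ]                     # rows for the common IDs
    ```

    With this toy data common is c(4, 6). Unlike the table approach, the IDs keep their original numeric type.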
    

    Or using dplyr

    library(dplyr)
    df1 %>%
        group_by(ID) %>%
        filter(n_distinct(Month) == n_distinct(df1$Month)) %>%
        pull(ID) %>%
        unique()
    #[1] 4 6
    

    Or, if we need the rows:

    df1 %>%
         group_by(ID) %>%
         filter(n_distinct(Month)== n_distinct(df1$Month))
    # A tibble: 13 x 2
    # Groups:   ID [2]
    #      ID Month
    #    
    # 1     4   Jan
    # 2     6   Jan
    # 3     6   Jan
    # 4     4   Feb
    # 5     6   Feb
    # 6     4   Mar
    # 7     6   Mar
    # 8     4   Apr
    # 9     6   Apr
    #10     4   May
    #11     6   May
    #12     4   Jun
    #13     6   Jun
    
