Find duplicated elements with dplyr

花落未央 2020-12-04 11:06

I tried using the code presented here to find ALL duplicated elements with dplyr like this:

library(dplyr)

mtcars %>%
  mutate(cyl.dup = cyl[duplicated(cyl) | duplicated(cyl, fromLast = TRUE)])


        
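For context, the mutate() call above borrows the common base-R idiom for keeping all rows whose value occurs more than once; a minimal sketch of that idiom, applied to the same 'cyl' column, looks like this:

# Flag values that are duplicated scanning from the front or from the back,
# then subset the data frame to those rows.
dups <- duplicated(mtcars$cyl) | duplicated(mtcars$cyl, fromLast = TRUE)
mtcars[dups, ]
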
5 Answers
  •  渐次进展
    2020-12-04 11:33

    I guess you could use filter for this purpose:

    mtcars %>% 
      group_by(carb) %>% 
      filter(n()>1)
    

    A small example (note that I added summarize() to show that the resulting data set contains no rows with a unique 'carb' value. I used 'carb' instead of 'cyl' because 'carb' has some values that occur only once whereas 'cyl' does not):

    mtcars %>% group_by(carb) %>% summarize(n=n())
    #Source: local data frame [6 x 2]
    #
    #  carb  n
    #1    1  7
    #2    2 10
    #3    3  3
    #4    4 10
    #5    6  1
    #6    8  1
    
    mtcars %>% group_by(carb) %>% filter(n()>1) %>% summarize(n=n())
    #Source: local data frame [4 x 2]
    #
    #  carb  n
    #1    1  7
    #2    2 10
    #3    3  3
    #4    4 10
    
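    The same approach works directly on the 'cyl' column from the question (a sketch; note that every 'cyl' value in mtcars occurs more than once, so all 32 rows are kept):

    library(dplyr)

    # Keep every row whose 'cyl' value appears more than once, then
    # drop the grouping so later verbs operate on an ungrouped frame.
    mtcars %>%
      group_by(cyl) %>%
      filter(n() > 1) %>%
      ungroup()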
