Find duplicated elements with dplyr

花落未央 2020-12-04 11:06

I tried using the code presented here to find ALL duplicated elements with dplyr like this:

library(dplyr)

mtcars %>%
mutate(cyl.dup = cyl[duplicated(cyl         


        
5 answers
  •  既然无缘
    2020-12-04 11:48

    The original post misapplies the solution from the related answer. Inside mutate, that solution subsets the cyl vector, and the result is shorter than the number of rows in the mtcars data frame, so it cannot be assigned as a new column.
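
The length mismatch can be checked directly in base R. This is only an illustrative sketch; the variable names `dup_values` and `all_dups` are made up for the example and do not come from the original post.

```r
# Base R only; mtcars ships with R.
cyl <- mtcars$cyl

# duplicated() flags every occurrence after the first one,
# so subsetting with it drops the first occurrence of each value.
dup_values <- cyl[duplicated(cyl)]

length(dup_values)  # fewer elements than mtcars has rows
nrow(mtcars)

# Combining both directions gives a logical vector with one entry per row,
# which is the shape that filter() and mutate() expect.
all_dups <- duplicated(cyl) | duplicated(cyl, fromLast = TRUE)
length(all_dups) == nrow(mtcars)
```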

    Instead, use filter to return all duplicated elements, or use mutate with ifelse to create a dummy variable that you can filter on later:

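     library(dplyr)

     # Return every row whose cyl value occurs more than once:
     # duplicated(cyl) flags all occurrences after the first, and
     # fromLast = TRUE flags all occurrences before the last
     mtcars %>%
       filter(duplicated(cyl) | duplicated(cyl, fromLast = TRUE))

     # Or create a dummy variable marking every duplicated row (1 = duplicated, 0 = unique)
     mtcars %>%
       mutate(cyl.dup = ifelse(duplicated(cyl) | duplicated(cyl, fromLast = TRUE), 1, 0))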
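
Because every cyl value in mtcars occurs more than once, the filter above keeps all rows, which makes the effect hard to see. The sketch below uses a small made-up data frame (`df`, `id`, and `grp` are hypothetical names, not part of the original answer) and compares the answer's approach with a different technique, grouping and keeping groups with more than one row. It is an alternative shown for comparison, not the answer author's method.

```r
library(dplyr)

# Small made-up data frame, for illustration only
df <- data.frame(id = 1:6, grp = c("a", "b", "a", "c", "b", "a"))

# Approach from the answer: flag duplicates in both directions
by_flag <- df %>%
  filter(duplicated(grp) | duplicated(grp, fromLast = TRUE))

# Alternative: keep rows whose grp value belongs to a group of size > 1
by_count <- df %>%
  group_by(grp) %>%
  filter(n() > 1) %>%
  ungroup()

# Both approaches drop only the row with grp == "c"
identical(by_flag$id, by_count$id)
```

The grouped version may read more naturally when the duplicate check spans several columns, since they can all be listed in group_by.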
    
