Finding ALL duplicate rows, including “elements with smaller subscripts”

借酒劲吻你 2020-11-21 07:55

R's duplicated returns a vector showing whether each element of a vector or data frame is a duplicate of an element with a smaller subscript. So if rows 3, 4,

7 Answers
  •  萌比男神i
    2020-11-21 08:36

    All duplicated rows in a data frame, including the first occurrence of each, can be obtained with dplyr as follows:

    library(dplyr)
    df = bind_rows(iris, head(iris, 20)) # build some test data
    # group by every column, then keep only groups that occur more than once
    df %>% group_by_all() %>% filter(n()>1) %>% ungroup()
    

    To exclude certain columns from the comparison, group_by_at(vars(-var1, -var2)) can be used instead to group the data.
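
    As a minimal sketch of the column exclusion described above: the data frame df2 and its columns id, x and y are hypothetical and are not part of the original answer. Rows 1 and 3 differ only in id, so they count as duplicates once id is left out of the grouping.

    ```r
    library(dplyr)

    # Hypothetical test data: rows 1 and 3 agree on every column except `id`
    df2 <- data.frame(
      id = 1:4,
      x  = c("a", "b", "a", "c"),
      y  = c(1, 2, 1, 3)
    )

    # Group by everything except `id`, then keep groups with more than one row
    dups <- df2 %>%
      group_by_at(vars(-id)) %>%
      filter(n() > 1) %>%
      ungroup()

    dups$id
    ```

    Because filter() keeps the original row order, dups should hold the rows with id 1 and 3.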

    If the row indices, and not just the data, are needed, you can add them as a column first:

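    # add_rownames() is deprecated in dplyr; tibble::rownames_to_column() is its replacement
    df %>% tibble::rownames_to_column() %>% group_by_at(vars(-rowname)) %>% filter(n()>1) %>% pull(rowname)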
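
    For comparison, the same row indices can probably be obtained in base R by combining duplicated() with its fromLast argument, so that the first occurrence of each duplicate is flagged as well. This is a sketch rather than part of the answer above, and the data frame df3 is hypothetical.

    ```r
    # Hypothetical test data: rows 1 and 3 match, and rows 2 and 5 match
    df3 <- data.frame(
      x = c("a", "b", "a", "c", "b"),
      y = c(1, 2, 1, 3, 2)
    )

    # duplicated() flags only the later copies; scanning from the end as well
    # flags the earlier ones, so the union marks every row that has a twin
    is_dup <- duplicated(df3) | duplicated(df3, fromLast = TRUE)

    # Positions of all duplicated rows
    idx <- which(is_dup)
    idx
    ```

    Unlike the dplyr pipeline, this returns integer positions rather than row names stored as character strings.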
    
