I have a dataframe with >100 columns, and I would to find the unique rows, by comparing only two of the columns. I\'m hoping this is an easy one, but I can\'t get it working
Here are a couple dplyr options that keep non-duplicate rows based on columns id and id2:
dplyr
library(dplyr) df %>% distinct(id, id2, .keep_all = TRUE) df %>% group_by(id, id2) %>% filter(row_number() == 1) df %>% group_by(id, id2) %>% slice(1)