I have a data frame in R which looks like:
| RIC | Date | Open |
|--------|---------------------|--------|
| S1A.PA | 2011-06-30 20:00:00
An easy way to get the information you want is to use `dplyr`.
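    library(dplyr)

    yourDF %>%
      group_by(RIC, Date) %>%              # one group per RIC/Date combination
      mutate(num_dups = n(),               # rows in this group
             dup_id = row_number()) %>%    # position of the row within its group
      ungroup() %>%
      mutate(is_duplicated = dup_id > 1)   # TRUE for every repeat after the first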
Using this:

- `num_dups` tells you how many times that particular RIC/Date combo occurs (1 means the row has no duplicates)
- `dup_id` tells you which duplicate number that particular row is (e.g. 1st, 2nd, or 3rd, etc.)
- `is_duplicated` gives you an easy condition you can filter on later to remove all the duplicate rows (e.g. `filter(!is_duplicated)`), though you could also use `dup_id` for this (e.g. `filter(dup_id == 1)`)
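
To see what the three columns look like in practice, here is a small sketch on made-up data. Only the column names `RIC`, `Date` and `Open` come from the question. The values, and the names `flagged` and `deduped`, are invented for illustration, and I'm assuming `Date` is stored as a character string here.

```r
library(dplyr)

# Hypothetical sample data (values invented; only the column names match the question)
yourDF <- data.frame(
  RIC  = c("AAA", "AAA", "AAA", "BBB", "BBB"),
  Date = c("2020-01-01 10:00:00", "2020-01-01 10:00:00", "2020-01-02 10:00:00",
           "2020-01-01 10:00:00", "2020-01-01 10:00:00"),
  Open = c(1.0, 1.0, 2.0, 3.0, 3.5),
  stringsAsFactors = FALSE
)

# Flag duplicates of each RIC/Date combination
flagged <- yourDF %>%
  group_by(RIC, Date) %>%
  mutate(num_dups = n(),
         dup_id = row_number()) %>%
  ungroup() %>%
  mutate(is_duplicated = dup_id > 1)

# Keep only the first row of each RIC/Date combination, then drop the helper columns
deduped <- flagged %>%
  filter(!is_duplicated) %>%
  select(-num_dups, -dup_id, -is_duplicated)

print(flagged)
print(deduped)
```

Note that the duplicate check only looks at `RIC` and `Date`, so the two "BBB" rows count as duplicates even though their `Open` values differ, and only the first one is kept. If that is not what you want, add `Open` to the `group_by()` call.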