I have a data frame in R which looks like:
| RIC | Date | Open |
|--------|---------------------|--------|
| S1A.PA | 2011-06-30 20:00:00
Here's a dplyr
option for tagging duplicates based on two (or more) columns. In this case ric
and date
:
df <- data_frame(ric = c('S1A.PA', 'ABC.PA', 'EFG.PA', 'S1A.PA', 'ABC.PA', 'EFG.PA'),
date = c('2011-06-30 20:00:00', '2011-07-03 20:00:00', '2011-07-04 20:00:00', '2011-07-05 20:00:00', '2011-07-03 20:00:00', '2011-07-04 20:00:00'),
open = c(23.7, 24.31, 24.495, 24.23, 24.31, 24.495))
df %>%
group_by(ric, date) %>%
mutate(dupe = n()>1)
# A tibble: 6 x 4
# Groups: ric, date [4]
ric date open dupe
1 S1A.PA 2011-06-30 20:00:00 23.7 FALSE
2 ABC.PA 2011-07-03 20:00:00 24.3 TRUE
3 EFG.PA 2011-07-04 20:00:00 24.5 TRUE
4 S1A.PA 2011-07-05 20:00:00 24.2 FALSE
5 ABC.PA 2011-07-03 20:00:00 24.3 TRUE
6 EFG.PA 2011-07-04 20:00:00 24.5 TRUE